Feature #368

xadmin start: report detailed reason for dead processes

Added by Madars about 5 years ago. Updated over 3 years ago.

Status: Closed
Start date: 12/28/2018
Priority: High (Code 3)
Due date:
Assignee: -
% Done:
Target version: -


Reporting more precise diagnostics from dead process at startup:

exec tmsrv -k 0myWI5nu -i 310 -e /tmp/TM1 -r -- -t10 -l/tmp --  :
    process id=20433 ... Died.

For example, we could capture the output of the "exec" command and pass the error back to the master copy of ndrxd, either via a return code or via a queue. With a queue, we need some kind of temporary storage for the dead process reasons, so that when the death of the process is signalled, we can look up the exact reason.

A return code would be better, since it avoids the need for a queue and storage.
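
A minimal sketch of the return code idea, not the actual ndrxd code. The exit code values and the names exec_errno_to_exit() and spawn_and_wait() are hypothetical, and the parent here simply waits for the child, whereas ndrxd would pick up the status when the child's death is signalled:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit codes reserved for "exec failed" (assumption). */
enum {
    START_EXIT_NOENT = 120, /* binary not found      */
    START_EXIT_ACCES = 121, /* no permission to exec */
    START_EXIT_OTHER = 122  /* any other exec error  */
};

/* Translate the errno left by a failed exec into an exit code. */
static int exec_errno_to_exit(int err)
{
    switch (err) {
    case ENOENT: return START_EXIT_NOENT;
    case EACCES: return START_EXIT_ACCES;
    default:     return START_EXIT_OTHER;
    }
}

/* Fork and exec argv[0]. Returns the child's exit code,
 * or -1 if fork/waitpid failed or the child was killed by a signal. */
static int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* Reached only when exec failed: report the reason via exit code. */
        _exit(exec_errno_to_exit(errno));
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

One limitation of this approach is that the reserved codes can collide with exit codes the started binary itself returns, so the parent cannot always tell a failed exec from a normal exit.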

We could store the reasons in a hash keyed by pid. The hash needs housekeeping, so that reasons are removed after some time (to keep memory usage bounded).
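
A rough sketch of such a pid-keyed store with housekeeping. All names (reason_put, reason_get, reason_housekeep) and the sizes are made up for illustration, and a plain chained hash table is assumed rather than whatever hash implementation ndrxd would actually use:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

#define REASON_BUCKETS 64   /* illustrative size */
#define REASON_MAXLEN  128  /* illustrative size */

typedef struct reason_entry {
    pid_t pid;
    time_t stored_at;               /* used by housekeeping */
    char reason[REASON_MAXLEN];
    struct reason_entry *next;      /* chain within one bucket */
} reason_entry_t;

static reason_entry_t *reason_tab[REASON_BUCKETS];

static unsigned reason_bucket(pid_t pid)
{
    return (unsigned)pid % REASON_BUCKETS;
}

/* Store or overwrite the reason for pid. Returns 0, or -1 if out of memory. */
static int reason_put(pid_t pid, const char *reason, time_t now)
{
    unsigned b = reason_bucket(pid);
    size_t n = strlen(reason);
    reason_entry_t *e;

    for (e = reason_tab[b]; e != NULL; e = e->next)
        if (e->pid == pid)
            break;

    if (e == NULL) {
        e = calloc(1, sizeof(*e));
        if (e == NULL)
            return -1;
        e->pid = pid;
        e->next = reason_tab[b];
        reason_tab[b] = e;
    }

    if (n >= REASON_MAXLEN)
        n = REASON_MAXLEN - 1;      /* truncate over-long reasons */
    memcpy(e->reason, reason, n);
    e->reason[n] = '\0';
    e->stored_at = now;
    return 0;
}

/* Look up the reason for pid; NULL if none is stored. */
static const char *reason_get(pid_t pid)
{
    reason_entry_t *e;

    for (e = reason_tab[reason_bucket(pid)]; e != NULL; e = e->next)
        if (e->pid == pid)
            return e->reason;
    return NULL;
}

/* Drop entries older than max_age seconds. Returns the number removed. */
static int reason_housekeep(time_t now, time_t max_age)
{
    int removed = 0;
    unsigned b;

    for (b = 0; b < REASON_BUCKETS; b++) {
        reason_entry_t **pp = &reason_tab[b];
        while (*pp != NULL) {
            if (now - (*pp)->stored_at > max_age) {
                reason_entry_t *dead = *pp;
                *pp = dead->next;
                free(dead);
                removed++;
            } else {
                pp = &(*pp)->next;
            }
        }
    }
    return removed;
}
```

The reason would be stored when it arrives, read with reason_get() once the process death is signalled, and reason_housekeep() would be called periodically to discard anything that was never collected.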


#1 Updated by Madars about 5 years ago

Or we copy the last error status to the process model. But for the system this would require starting the queueing admin threads to deliver the status to ndrxd. Just a note.

#2 Updated by Madars about 5 years ago

  • Priority changed from Normal (Code 4) to High (Code 3)

test028 does not boot for some reason. This would help to explain it:

* ndrxd idle instance started.
exec tmsrv -k nre38Kff1kz -i 1 -e /home/user1/endurox/atmitest/test028_tmq/tmsrv-dom1.log -r -- -t1 -l/home/user1/endurox/atmitest/test028_tmq/RM1 --  :
        process id=23941 ... Started.
exec atmisv28 -k nre38Kff1kz -i 20 -e /home/user1/endurox/atmitest/test028_tmq/atmisv28-dom1.log -r --  :
        process id=23955 ... Started.
exec tmqueue -k nre38Kff1kz -i 100 -e /home/user1/endurox/atmitest/test028_tmq/tmqueue-dom1.log -r -- -m MYSPACE -q ./q.conf -s1 --  :
        process id=23957 ... Died.
Startup finished. 2 processes started.
* Shared resources opened...

#3 Updated by Madars about 5 years ago

ULOG.20190111:23957:20190111:02395117:tmqueue     :ERROR! Filed to read tx file: req_read=696, read=0: Is a directory

#4 Updated by Madars almost 4 years ago

Startup status is now reported. The following new status codes are provided for binaries during xadmin start:
static char *nosuchfile = "No such file or directory";
static char *eaccess = "Access denied";
static char *ebadfile = "Bad executable";
static char *elimits = "Limits exceeded";
static char *stillstarting = "Still starting";
static char *eargslim = "CLI args on env params too long";
static char *eenv = "Environment setup failure";
static char *esys = "System failure";

When exec fails after the fork, the resulting status is passed back via shared memory.

Available from version 7.1+.
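
For illustration only, the general technique of passing a failed exec status back through shared memory could look like the sketch below. It is not the Enduro/X implementation: the anonymous mmap slot, the names exec_errno_text() and start_and_report(), and the errno-to-message mapping are all assumptions. Only the message strings come from the list above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed mapping from exec errno to the status texts of this ticket. */
static const char *exec_errno_text(int err)
{
    switch (err) {
    case ENOENT:  return "No such file or directory";
    case EACCES:  return "Access denied";
    case ENOEXEC: return "Bad executable";
    case E2BIG:   return "CLI args on env params too long";
    default:      return "System failure";
    }
}

/* Fork and exec argv[0]. Returns the errno of a failed exec,
 * 0 if exec succeeded, or -1 if the setup itself failed. */
static int start_and_report(char *const argv[])
{
    int status;
    int result;
    pid_t pid;
    /* One int shared between parent and child. */
    int *slot = mmap(NULL, sizeof(*slot), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (slot == MAP_FAILED)
        return -1;
    *slot = 0;

    pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof(*slot));
        return -1;
    }

    if (pid == 0) {
        execv(argv[0], argv);
        *slot = errno;  /* exec failed: leave the reason for the parent */
        _exit(127);
    }

    /* Simplification: wait for the child to exit before reading the slot. */
    result = (waitpid(pid, &status, 0) < 0) ? -1 : *slot;
    munmap(slot, sizeof(*slot));
    return result;
}
```

Unlike an exit code, the shared slot cannot be confused with a status the started binary returns on its own. A real daemon would not block in waitpid() for a long-running server. It would read the slot when the child's death is signalled.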

#5 Updated by Madars almost 4 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

#6 Updated by Madars over 3 years ago

  • Status changed from Resolved to Closed
