Josef “Jeff” Sipek

DTrace: The utmp_update Debugger

For the past 2 years, I am a happy user of rxvt-unicode, aka urxvt. Recently, I noticed that my logs contained rather mysterious error messages:

Nov  3 22:46:03 meili utmp_update[1613]: [ID 845426 user.error] Wrong number of
	arguments or invalid user 

Sometimes, there were a dozen of these. Of course I filed a bug with the Illumos folks. Rich Lowe suggested using DTrace to figure out what is actually going on. It was time to look at the exit codes for utmp_update.

syscall::rexit:entry
/execname=="utmp_update"/
{
	printf("utmp_update exited with code %d", arg0);
	@[arg0] = count();
}

tick-60sec
{
	printa(@);
}

Since utmp is involved, it had something to do with terminals, so I tried to open some terminals and close them. That did it!

# dtrace -s catch-errors.d 
dtrace: script 'catch-errors.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0     49                      rexit:entry utmp_update exited with code 0
  1     49                      rexit:entry utmp_update exited with code 0
  0     49                      rexit:entry utmp_update exited with code 7
  0     49                      rexit:entry utmp_update exited with code 7
  1     49                      rexit:entry utmp_update exited with code 0
  5     49                      rexit:entry utmp_update exited with code 0
  1     49                      rexit:entry utmp_update exited with code 0
  2     49                      rexit:entry utmp_update exited with code 0
  3     49                      rexit:entry utmp_update exited with code 7
  1  67549                      :tick-60sec 
                7                3
                0                6

It turns out that every time I closed a terminal, utmp_update exited with error 7. A quick glance at usr/src/cmd/utmp_update/utmp_update.c reveals:

/*
 * Return codes
 */
#define	NORMAL_EXIT		0
#define	BAD_ARGS		1
#define	PUTUTXLINE_FAILURE	2
#define	FORK_FAILURE		3
#define	SETSID_FAILURE		4
#define	ALREADY_DEAD		5
#define	ENTRY_NOTFOUND		6
#define	ILLEGAL_ARGUMENT	7
#define	DEVICE_ERROR		8

Aha! It really is an invalid argument. At this point, Rich pointed me to setutxline in libc.so. Sadly, for whatever reason, the probe pid*:libc.so.1:pututxline:entry didn’t work (it didn’t match anything). Rich suggested the following DTrace script:

proc:::exec
/strstr(args[0], "utmp") != NULL/
{
	trace(execname);
}

Pretty straightforward — the output told me that it was urxvt causing all this trouble.

Now, I knew to watch out for pututxline in urxvt. I tried to set a probe pid$target::pututxline:entry and use the new print function in DTrace, but due to a user error (read: sometimes I write stupid code) it didn’t work. Rich helped me navigate through mdb to get a print-out of the utx structure. At this point, it was a bit too late in the night and so I went to bed.

The next morning, I tried the print function again and this time I used it right and it printed out the structure:

struct utmpx {
    char [32] ut_user = [ "" ]
    char [4] ut_id = [ 'v', 't', '0', '2' ]
    char [32] ut_line = [ "pts/2" ]
    pid_t ut_pid = 0x193ec
    short ut_type = 0x8
    struct exit_status ut_exit = {
        short e_termination = 0
        short e_exit = 0
    }
    struct timeval ut_tv = {
        time_t tv_sec = 0x4eb55d2a
        suseconds_t tv_usec = 0x18adc
    }
    int ut_session = 0
    int [5] pad = [ 0, 0, 0, 0, 0 ]
    short ut_syslen = 0
    char [257] ut_host = [ "" ]
}

Everything looks right, except that the ut_user field is blank. I wonder if this could be the cause of it. Time to look at the urxvt code! (The ustack() action in a DTrace probe for pututxline:entry will tell you where to look.) Here’s a snippet from rxvt-unicode-9.12/libptytty/src/logging.C:

/*
 * remove utmp and wtmp entries
 */
void
ptytty_unix::logout ()
{
  ...
#ifdef HAVE_STRUCT_UTMPX
  setutxent ();
  tmputx = getutxid (utx);
  if (tmputx && tmputx->ut_pid == cmd_pid)
    pututxline (utx);
  endutxent ();
#endif
  ...
}

Ok, so it gets a utx struct and then it puts a different one. Let’s see how different those two are:

# cat getutxid.d
#include <utmpx.h>

pid$target::getutxid:return
{
	ustack();
	print(*(struct utmpx*)copyin(arg1, sizeof(struct utmpx)));
}
# dtrace -Cs getutxid.d -p 103403
dtrace: script 'getutxid.d' matched 1 probes
dtrace: pid 103403 has exited
CPU     ID                    FUNCTION:NAME
  4  67555                  getutxid:return 
              libc.so.1`getutxid+0xf1
              urxvt`_ZN11ptytty_unixD0Ev+0x16
              urxvt`_ZN9rxvt_termD1Ev+0x59b
              urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
              urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
              urxvt`ev_invoke_pending+0x35
              urxvt`ev_run+0x520
              urxvt`main+0x29b
              urxvt`_start+0x83
struct utmpx {
    char [32] ut_user = [ "jeffpc" ]
    char [4] ut_id = [ 'v', 't', '0', '2' ]
    char [32] ut_line = [ "pts/2" ]
    pid_t ut_pid = 0x193ec
    short ut_type = 0x7
    struct exit_status ut_exit = {
        short e_termination = 0
        short e_exit = 0x2
    }
    struct timeval ut_tv = {
        time_t tv_sec = 0x4eb55d1f
        suseconds_t tv_usec = 0x18adc
    }
    int ut_session = 0
    int [5] pad = [ 0, 0, 0, 0x303a0005, 0x302e ]
    short ut_syslen = 0
    char [257] ut_host = [ "" ]
}

Ok, it’s more or less the same. It does, however, have a username filled in. I wonder what would happen if I filled in the username. (The urxvt code seems to fill it in only on login updates and it leaves the field empty on logout updates.) Now, I had 3 choices…

  1. Change the urxvt code to fill in the username on logout updates.
  2. Set a breakpoint in gdb or mdb and then tweak the structure before it is passed to utmp_update.
  3. Use DTraces “destructive” option to allow me to modify the process’s memory.

I chose #3.

Here’s the script in all its glory:

#pragma D option destructive

#include <utmpx.h>

pid$target::getutxid:return
{
	ustack();
	print(*(struct utmpx*)copyin(arg1, sizeof(struct utmpx)));
}

pid$target::pututxline:entry
{
	ustack();
	print(*(struct utmpx*)copyin(arg0, sizeof(struct utmpx)));
	printf("\nFIXING...\n");
	copyout("jeffpc\0", (uintptr_t)&((struct utmpx*)arg0)->ut_user[0], 7);
	print(*(struct utmpx*)copyin(arg0, sizeof(struct utmpx)));
}

pid$target::pututxline:return
{
	printf("pututxline returned %p", arg1);
}

And here’s the output:

# dtrace -Cs foo.d -p 103403
dtrace: script 'foo.d' matched 3 probes
dtrace: allowing destructive actions
dtrace: pid 103403 has exited
CPU     ID                    FUNCTION:NAME
  4  67555                  getutxid:return 
              libc.so.1`getutxid+0xf1
              urxvt`_ZN11ptytty_unixD0Ev+0x16
              urxvt`_ZN9rxvt_termD1Ev+0x59b
              urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
              urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
              urxvt`ev_invoke_pending+0x35
              urxvt`ev_run+0x520
              urxvt`main+0x29b
              urxvt`_start+0x83
struct utmpx {
    char [32] ut_user = [ "jeffpc" ]
    char [4] ut_id = [ 'v', 't', '0', '2' ]
    char [32] ut_line = [ "pts/2" ]
    pid_t ut_pid = 0x193ec
    short ut_type = 0x7
    struct exit_status ut_exit = {
        short e_termination = 0
        short e_exit = 0x2
    }
    struct timeval ut_tv = {
        time_t tv_sec = 0x4eb55d1f
        suseconds_t tv_usec = 0x18adc
    }
    int ut_session = 0
    int [5] pad = [ 0, 0, 0, 0x303a0005, 0x302e ]
    short ut_syslen = 0
    char [257] ut_host = [ "" ]
}
  4  67556                 pututxline:entry 
              libc.so.1`pututxline
              urxvt`_ZN11ptytty_unix6logoutEv+0x15c
              urxvt`_ZN11ptytty_unixD0Ev+0x16
              urxvt`_ZN9rxvt_termD1Ev+0x59b
              urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
              urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
              urxvt`ev_invoke_pending+0x35
              urxvt`ev_run+0x520
              urxvt`main+0x29b
              urxvt`_start+0x83
struct utmpx {
    char [32] ut_user = [ "" ]
    char [4] ut_id = [ 'v', 't', '0', '2' ]
    char [32] ut_line = [ "pts/2" ]
    pid_t ut_pid = 0x193ec
    short ut_type = 0x8
    struct exit_status ut_exit = {
        short e_termination = 0
        short e_exit = 0
    }
    struct timeval ut_tv = {
        time_t tv_sec = 0x4eb55d2a
        suseconds_t tv_usec = 0x18adc
    }
    int ut_session = 0
    int [5] pad = [ 0, 0, 0, 0, 0 ]
    short ut_syslen = 0
    char [257] ut_host = [ "" ]
}
FIXING...
struct utmpx {
    char [32] ut_user = [ "jeffpc" ]
    char [4] ut_id = [ 'v', 't', '0', '2' ]
    char [32] ut_line = [ "pts/2" ]
    pid_t ut_pid = 0x193ec
    short ut_type = 0x8
    struct exit_status ut_exit = {
        short e_termination = 0
        short e_exit = 0
    }
    struct timeval ut_tv = {
        time_t tv_sec = 0x4eb55d2a
        suseconds_t tv_usec = 0x18adc
    }
    int ut_session = 0
    int [5] pad = [ 0, 0, 0, 0, 0 ]
    short ut_syslen = 0
    char [257] ut_host = [ "" ]
}
  4  67557                pututxline:return pututxline returned fef6ecf0

We can see the getutxid return a reasonable utmpx. Then we see pututxline get a utmpx without a username set. Then there is the fixed up tmpx. Finally, we see that the pututxline returned a non-NULL pointer. (It returns NULL on error — which does indeed happen without the fix-up.)

There you have it, folks. DTrace let me debug an issue without annoying change-compile-install cycles. Now, all I have to do is fix up urxvt in OpenIndiana and possibly, if it applies to other systems, push the fix upstream.

1 Comments »

  1. G'Day, Great post! This is some wicked DTrace: #include, destructive and the brand-new print() action! Brendan

    Comment by Brendan Gregg — November 6, 2011 @ 03:49

Atom feed for comments on this post.

Leave a comment

Powered by blahgd