For the past 2 years, I am a happy user of rxvt-unicode, aka urxvt. Recently, I noticed that my logs contained rather mysterious error messages:
Nov 3 22:46:03 meili utmp_update[1613]: [ID 845426 user.error] Wrong number of
arguments or invalid user
Sometimes, there were a dozen of these. Of course I filed a bug with the Illumos folks. Rich Lowe suggested using DTrace to figure out what is actually going on. It was time to look at the exit codes for utmp_update.
syscall::rexit:entry
/execname=="utmp_update"/
{
printf("utmp_update exited with code %d", arg0);
@[arg0] = count();
}
tick-60sec
{
printa(@);
}
Since utmp is involved, it had something to do with terminals, so I tried to open some terminals and close them. That did it!
# dtrace -s catch-errors.d
dtrace: script 'catch-errors.d' matched 2 probes
CPU ID FUNCTION:NAME
0 49 rexit:entry utmp_update exited with code 0
1 49 rexit:entry utmp_update exited with code 0
0 49 rexit:entry utmp_update exited with code 7
0 49 rexit:entry utmp_update exited with code 7
1 49 rexit:entry utmp_update exited with code 0
5 49 rexit:entry utmp_update exited with code 0
1 49 rexit:entry utmp_update exited with code 0
2 49 rexit:entry utmp_update exited with code 0
3 49 rexit:entry utmp_update exited with code 7
1 67549 :tick-60sec
7 3
0 6
It turns out that every time I closed a terminal, utmp_update exited with error 7. A quick glance at usr/src/cmd/utmp_update/utmp_update.c reveals:
/*
* Return codes
*/
#define NORMAL_EXIT 0
#define BAD_ARGS 1
#define PUTUTXLINE_FAILURE 2
#define FORK_FAILURE 3
#define SETSID_FAILURE 4
#define ALREADY_DEAD 5
#define ENTRY_NOTFOUND 6
#define ILLEGAL_ARGUMENT 7
#define DEVICE_ERROR 8
Aha! It really is an invalid argument. At this point, Rich pointed me to setutxline in libc.so. Sadly, for whatever reason, the probe pid*:libc.so.1:pututxline:entry didn’t work (it didn’t match anything). Rich suggested the following DTrace script:
proc:::exec
/strstr(args[0], "utmp") != NULL/
{
trace(execname);
}
Pretty straightforward — the output told me that it was urxvt causing all this trouble.
Now, I knew to watch out for pututxline in urxvt. I tried to set a probe pid$target::pututxline:entry and use the new print function in DTrace, but due to a user error (read: sometimes I write stupid code) it didn’t work. Rich helped me navigate through mdb to get a print-out of the utx structure. At this point, it was a bit too late in the night and so I went to bed.
The next morning, I tried the print function again and this time I used it right and it printed out the structure:
struct utmpx {
char [32] ut_user = [ "" ]
char [4] ut_id = [ 'v', 't', '0', '2' ]
char [32] ut_line = [ "pts/2" ]
pid_t ut_pid = 0x193ec
short ut_type = 0x8
struct exit_status ut_exit = {
short e_termination = 0
short e_exit = 0
}
struct timeval ut_tv = {
time_t tv_sec = 0x4eb55d2a
suseconds_t tv_usec = 0x18adc
}
int ut_session = 0
int [5] pad = [ 0, 0, 0, 0, 0 ]
short ut_syslen = 0
char [257] ut_host = [ "" ]
}
Everything looks right, except that the ut_user field is blank. I wonder if this could be the cause of it. Time to look at the urxvt code! (The ustack() action in a DTrace probe for pututxline:entry will tell you where to look.) Here’s a snippet from rxvt-unicode-9.12/libptytty/src/logging.C:
/*
* remove utmp and wtmp entries
*/
void
ptytty_unix::logout ()
{
...
#ifdef HAVE_STRUCT_UTMPX
setutxent ();
tmputx = getutxid (utx);
if (tmputx && tmputx->ut_pid == cmd_pid)
pututxline (utx);
endutxent ();
#endif
...
}
Ok, so it gets a utx struct and then it puts a different one. Let’s see how different those two are:
# cat getutxid.d
#include <utmpx.h>
pid$target::getutxid:return
{
ustack();
print(*(struct utmpx*)copyin(arg1, sizeof(struct utmpx)));
}
# dtrace -Cs getutxid.d -p 103403
dtrace: script 'getutxid.d' matched 1 probes
dtrace: pid 103403 has exited
CPU ID FUNCTION:NAME
4 67555 getutxid:return
libc.so.1`getutxid+0xf1
urxvt`_ZN11ptytty_unixD0Ev+0x16
urxvt`_ZN9rxvt_termD1Ev+0x59b
urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
urxvt`ev_invoke_pending+0x35
urxvt`ev_run+0x520
urxvt`main+0x29b
urxvt`_start+0x83
struct utmpx {
char [32] ut_user = [ "jeffpc" ]
char [4] ut_id = [ 'v', 't', '0', '2' ]
char [32] ut_line = [ "pts/2" ]
pid_t ut_pid = 0x193ec
short ut_type = 0x7
struct exit_status ut_exit = {
short e_termination = 0
short e_exit = 0x2
}
struct timeval ut_tv = {
time_t tv_sec = 0x4eb55d1f
suseconds_t tv_usec = 0x18adc
}
int ut_session = 0
int [5] pad = [ 0, 0, 0, 0x303a0005, 0x302e ]
short ut_syslen = 0
char [257] ut_host = [ "" ]
}
Ok, it’s more or less the same. It does, however, have a username filled in. I wonder what would happen if I filled in the username. (The urxvt code seems to fill it in only on login updates and it leaves the field empty on logout updates.) Now, I had 3 choices…
- Change the urxvt code to fill in the username on logout updates.
- Set a breakpoint in gdb or mdb and then tweak the structure before it is passed to utmp_update.
- Use DTraces “destructive” option to allow me to modify the process’s memory.
I chose #3.
Here’s the script in all its glory:
#pragma D option destructive
#include <utmpx.h>
pid$target::getutxid:return
{
ustack();
print(*(struct utmpx*)copyin(arg1, sizeof(struct utmpx)));
}
pid$target::pututxline:entry
{
ustack();
print(*(struct utmpx*)copyin(arg0, sizeof(struct utmpx)));
printf("\nFIXING...\n");
copyout("jeffpc\0", (uintptr_t)&((struct utmpx*)arg0)->ut_user[0], 7);
print(*(struct utmpx*)copyin(arg0, sizeof(struct utmpx)));
}
pid$target::pututxline:return
{
printf("pututxline returned %p", arg1);
}
And here’s the output:
# dtrace -Cs foo.d -p 103403
dtrace: script 'foo.d' matched 3 probes
dtrace: allowing destructive actions
dtrace: pid 103403 has exited
CPU ID FUNCTION:NAME
4 67555 getutxid:return
libc.so.1`getutxid+0xf1
urxvt`_ZN11ptytty_unixD0Ev+0x16
urxvt`_ZN9rxvt_termD1Ev+0x59b
urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
urxvt`ev_invoke_pending+0x35
urxvt`ev_run+0x520
urxvt`main+0x29b
urxvt`_start+0x83
struct utmpx {
char [32] ut_user = [ "jeffpc" ]
char [4] ut_id = [ 'v', 't', '0', '2' ]
char [32] ut_line = [ "pts/2" ]
pid_t ut_pid = 0x193ec
short ut_type = 0x7
struct exit_status ut_exit = {
short e_termination = 0
short e_exit = 0x2
}
struct timeval ut_tv = {
time_t tv_sec = 0x4eb55d1f
suseconds_t tv_usec = 0x18adc
}
int ut_session = 0
int [5] pad = [ 0, 0, 0, 0x303a0005, 0x302e ]
short ut_syslen = 0
char [257] ut_host = [ "" ]
}
4 67556 pututxline:entry
libc.so.1`pututxline
urxvt`_ZN11ptytty_unix6logoutEv+0x15c
urxvt`_ZN11ptytty_unixD0Ev+0x16
urxvt`_ZN9rxvt_termD1Ev+0x59b
urxvt`_ZN9rxvt_term10destroy_cbERN2ev4idleEi+0x70
urxvt`_ZN2ev4baseI7ev_idleNS_4idleEE12method_thunkI9rxvt_termXadL_ZNS5_10destroy_cbERS2_iEEEEvPS1_i+0x27
urxvt`ev_invoke_pending+0x35
urxvt`ev_run+0x520
urxvt`main+0x29b
urxvt`_start+0x83
struct utmpx {
char [32] ut_user = [ "" ]
char [4] ut_id = [ 'v', 't', '0', '2' ]
char [32] ut_line = [ "pts/2" ]
pid_t ut_pid = 0x193ec
short ut_type = 0x8
struct exit_status ut_exit = {
short e_termination = 0
short e_exit = 0
}
struct timeval ut_tv = {
time_t tv_sec = 0x4eb55d2a
suseconds_t tv_usec = 0x18adc
}
int ut_session = 0
int [5] pad = [ 0, 0, 0, 0, 0 ]
short ut_syslen = 0
char [257] ut_host = [ "" ]
}
FIXING...
struct utmpx {
char [32] ut_user = [ "jeffpc" ]
char [4] ut_id = [ 'v', 't', '0', '2' ]
char [32] ut_line = [ "pts/2" ]
pid_t ut_pid = 0x193ec
short ut_type = 0x8
struct exit_status ut_exit = {
short e_termination = 0
short e_exit = 0
}
struct timeval ut_tv = {
time_t tv_sec = 0x4eb55d2a
suseconds_t tv_usec = 0x18adc
}
int ut_session = 0
int [5] pad = [ 0, 0, 0, 0, 0 ]
short ut_syslen = 0
char [257] ut_host = [ "" ]
}
4 67557 pututxline:return pututxline returned fef6ecf0
We can see the getutxid return a reasonable utmpx. Then we see pututxline get a utmpx without a username set. Then there is the fixed up tmpx. Finally, we see that the pututxline returned a non-NULL pointer. (It returns NULL on error — which does indeed happen without the fix-up.)
There you have it, folks. DTrace let me debug an issue without annoying change-compile-install cycles. Now, all I have to do is fix up urxvt in OpenIndiana and possibly, if it applies to other systems, push the fix upstream.