
2.24 How to debug a child process with GDB

https://scz.617.cn/unix/201206011217.txt

Q:

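How do you debug a child process with GDB? This question has been asked for well over a decade; since the last century people have kept raising it on one forum after another.

gdb provides a command, set follow-fork-mode child, which is meant to follow the child process after fork(). Back then, however, the feature was system-dependent: it had no effect on x86/Linux, x86/FreeBSD, or SPARC/Solaris, though reportedly someone got it to work on HP-UX.

hhuu@SMTH once offered this approach: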


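1)

Set a breakpoint at fork() and change its return value to 0, forcing the parent process down the child's code path. This does not actually enter the address space of an existing child process, but it is very effective for simple cases.

2)

If you can modify the source code, things are easy: add blocking code such as sleep() or getchar() to the child's code path after fork(), then use gdb attach to debug the child process. People who ask this question usually cannot modify the source, though, so this advice is as good as none.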
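
For readers who can rebuild the program, option 2) might look like the sketch below. It is a minimal illustration, not code from hhuu@SMTH: the helper name wait_for_debugger(), the hold flag, and the timeout are all hypothetical. The child spins on a volatile flag so that there is time to attach and release it from gdb.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original note): block the caller
 * until a debugger clears *hold or max_seconds have elapsed.
 * Returns the number of one-second sleeps that were performed.
 */
unsigned int wait_for_debugger ( volatile int *hold, unsigned int max_seconds )
{
    unsigned int waited = 0;

    /* Announce the PID so it can be handed to gdb's attach command. */
    fprintf( stderr, "pid %ld waiting for debugger\n", ( long )getpid() );
    while ( *hold && waited < max_seconds )
    {
        sleep( 1 );
        waited++;
    }
    return( waited );
}
```

In the child branch after fork(), one would declare "volatile int hold = 1;" and call "wait_for_debugger( &hold, 60 );". After attaching to the printed PID, select the frame that owns hold (for example with "up"), run "set var hold = 0", and continue.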


Someone else wrote the following about this problem:


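gdb has little support for debugging child processes created by fork(). After fork(), gdb normally keeps debugging the parent process while the child runs unimpeded. If you set a breakpoint in the child's code path, the child receives SIGTRAP when the breakpoint is hit; if the child does not handle that signal, the default action terminates the child.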


What a sorry history. How do things stand now?

A: zz@nsfocus 2012-05

gdb on x86/Linux can now use "set follow-fork-mode child" normally.


/*
 * gcc-3.3 -Wall -pipe -O3 -s -o follow-fork-mode-test follow-fork-mode-test.c
 * gcc-3.3 -Wall -pipe -O0 -ggdb -g -o follow-fork-mode-test follow-fork-mode-test.c
 */
#include <stdio.h>
#include <unistd.h>

int main ( int argc, char * argv[] )
{
    pid_t pid;

    pid = fork();
    if ( 0 == pid )
    {
        /*
         * Child process
         */
        printf( "child   = %u\n", ( unsigned int )getpid() );
        /*
         * execl() returns only on failure; the child then falls
         * through to the return below
         */
        execl( "./none", "none", ( char * )NULL );
    }
    else
    {
        /*
         * Parent process
         */
        printf( "parent  = %u\n", ( unsigned int )getpid() );
    }
    return( 0 );
}  /* end of main */

$ uname -a
Linux debian 3.4.0-scz-20120531 #2 SMP Thu May 31 15:47:42 CST 2012 i686 GNU/Linux

(gdb) show version
GNU gdb (GDB) 7.0.1-debian

(gdb) help set follow-fork-mode
Set debugger response to a program call of fork or vfork.
A fork or vfork creates a new process.  follow-fork-mode can be:
  parent  - the original process is debugged after a fork
  child   - the new process is debugged after a fork
The unfollowed process will continue to run.
By default, the debugger will follow the parent process.

(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".

Note that only child and parent are supported here; the rumored ask mode is not.

Debugging the parent process (the default behavior):


$ gdb ./follow-fork-mode-test
(gdb) catch fork
Catchpoint 1 (fork)
(gdb) r
Starting program: /tmp/follow-fork-mode-test

Catchpoint 1 (forked process 2075), 0xb7fe2424 in __kernel_vsyscall ()
(gdb) x/i $eip-2
0xb7fe2422 <__kernel_vsyscall+14>:      int    $0x80
(gdb) ni
child   = 2075                  // the child has already run on its own
0xb7fe2424 in __kernel_vsyscall ()
(gdb) i r eax
eax            0x81b    2075    // fork() returned a value > 0, so this is the parent
(gdb) i proc
process 2072
cmdline = '/tmp/follow-fork-mode-test'
cwd = '/tmp'
exe = '/tmp/follow-fork-mode-test'
(gdb) c
Continuing.
parent  = 2072

Program exited normally.
(gdb) q


Debugging the child process:


$ gdb ./follow-fork-mode-test
(gdb) set follow-fork-mode child
(gdb) catch fork
Catchpoint 1 (fork)
(gdb) r
Starting program: /tmp/follow-fork-mode-test

Catchpoint 1 (forked process 2082), 0xb7fe2424 in __kernel_vsyscall ()
(gdb) x/i $eip-2
0xb7fe2422 <__kernel_vsyscall+14>:      int    $0x80
(gdb) ni
parent  = 2079                  // the parent has already run on its own
[New process 2082]
0xb7fe2425 in __kernel_vsyscall ()
(gdb) i r eax
eax            0x0      0       // fork() returned 0, so this is the child
(gdb) i proc
process 2082
cmdline = '/tmp/follow-fork-mode-test'
cwd = '/tmp'
exe = '/tmp/follow-fork-mode-test'
(gdb) c
Continuing.
child   = 2082

Program exited normally.
(gdb) q


D: scz@nsfocus 2012-06-01

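While debugging follow-fork-mode-test, I found that "catch exec" had no effect, whereas "catch syscall execve" did stop.

Investigation showed that "catch exec" is not equivalent to "catch syscall execve"; the latter works at a lower level. To intercept the exec*() call that follows fork(), use the latter: at that point execution is still in the address space of the original process and has not yet switched to the new one. By the time the former stops, the switch to the new process has already happened. If execve() fails, the former never stops at all.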


/*
 * gcc-3.3 -Wall -pipe -O3 -s -o pwd-test pwd-test.c
 * gcc-3.3 -Wall -pipe -O0 -ggdb -g -o pwd-test pwd-test.c
 */
#include <stdio.h>
#include <unistd.h>

int main ( int argc, char * argv[] )
{
    pid_t pid;

    pid = fork();
    if ( 0 == pid )
    {
        /*
         * Child process: argv[0] is an empty string, and the argument
         * list ends with a typed NULL sentinel
         */
        execl( "/bin/pwd", "", ( char * )NULL );
    }
    return( 0 );
}  /* end of main */

$ gdb ./pwd-test
(gdb) set follow-fork-mode child
(gdb) catch fork
Catchpoint 1 (fork)
(gdb) r
Starting program: /tmp/pwd-test

Catchpoint 1 (forked process 2218), 0xb7fe2424 in __kernel_vsyscall ()
(gdb) catch syscall execve      // comparison test [0]
Catchpoint 2 (syscall 'execve' [11])
(gdb) c
Continuing.
[New process 2218]

Catchpoint 2 (call to syscall 'execve'), 0xb7fe2424 in __kernel_vsyscall ()
(gdb) bt                        // still the pwd-test process at this point
#0  0xb7fe2424 in __kernel_vsyscall ()
#1  0xb7f19a3f in execve () from /lib/i686/cmov/libc.so.6
#2  0xb7fc4ff4 in ?? () from /lib/i686/cmov/libc.so.6
(gdb) x/i $eip-2
0xb7fe2422 <__kernel_vsyscall+14>:      int    $0x80
(gdb) c
Continuing.
Executing new program: /bin/pwd

Catchpoint 2 (returned from syscall 'execve'), 0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) bt                        // already the pwd process at this point
#0  0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) x/i $eip-2
0xb7fe384e:     add    %al,(%eax)
(gdb) c
Continuing.
/tmp

Program exited normally.
(gdb) q


$ gdb ./pwd-test
(gdb) set follow-fork-mode child
(gdb) catch fork
Catchpoint 1 (fork)
(gdb) r
Starting program: /tmp/pwd-test

Catchpoint 1 (forked process 2233), 0xb7fe2424 in __kernel_vsyscall ()
(gdb) catch exec                // comparison test [1]
Catchpoint 2 (exec)
(gdb) c
Continuing.
[New process 2233]
Executing new program: /bin/pwd

Catchpoint 2 (exec'd /bin/pwd), 0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) bt                        // already the pwd process at this point
#0  0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) x/i $eip-2
0xb7fe384e:     add    %al,(%eax)
(gdb) c
Continuing.
/tmp

Program exited normally.
(gdb) q
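
Both pwd-test sessions cover a successful exec. The failing case can be reproduced with a sketch like the following, assuming a path that does not exist; the function name and the use of the exit status to carry errno are choices made for this illustration only. execl() enters the execve system call, the call returns -1, and the child keeps running its old image, so no new program ever appears for an exec-level catchpoint to report.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that tries to exec path.
 * Returns the child's exit status, which is errno when the exec failed
 * or whatever the new program returned when it succeeded.
 * Returns -1 if fork() or waitpid() failed or the child was killed.
 */
int exec_errno_in_child ( const char *path )
{
    pid_t pid;
    int   status = 0;

    pid = fork();
    if ( pid < 0 ) return( -1 );
    if ( 0 == pid )
    {
        execl( path, path, ( char * )NULL );
        /* Reached only when execve() failed; common errno values fit in 8 bits. */
        _exit( errno );
    }
    if ( waitpid( pid, &status, 0 ) != pid ) return( -1 );
    return( WIFEXITED( status ) ? WEXITSTATUS( status ) : -1 );
}
```

For a missing file the result is expected to be ENOENT. Run under gdb with "set follow-fork-mode child", this is the situation in which "catch syscall execve" should still stop while "catch exec" stays silent.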


A: scz 2016-02-26 11:35

Debugging a child process generally means dealing with exec() as well as fork():

set follow-fork-mode child
set follow-exec-mode new

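If the latter is not supported, use catch exec instead; when it stops, the new process image is already in place, provided exec() succeeded. At that point execution is still far from start() and main(). If the OS supports DTrace, you can call stop() at the proc:::exec-success probe. If the exec*() after fork() fails, "catch exec" never stops at all.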

(gdb) set follow-fork-mode child|parent
(gdb) set follow-exec-mode new|same
(gdb) set detach-on-fork on|off
(gdb) info inferiors
(gdb) inferior
(gdb) detach inferiors

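detach-on-fork defaults to on; in that case debugging follows the follow-fork-mode setting, and one of the parent and child is always detached. When it is set to off, both processes stay under GDB's control: the one selected by follow-fork-mode can be debugged normally while the other is held suspended. Use "info inferiors" to list the parent and child, and "inferior" with an inferior number to switch between them. After "detach inferiors" with an inferior number, that inferior is released, and the current inferior may change as a result: for example, if the current inferior is 2, then after "detach inferiors 1" inferior 1 is released and the current inferior automatically switches to 1. While positioned on an idle inferior, you can attach to a process that was detached earlier.

On a 64-bit kernel, a 32-bit process can execute a 64-bit program through exec*(). A 64-bit GDB can debug the former, and "set follow-exec-mode new" still works for it; the latter stops with SIGTRAP, and to keep debugging it you must switch the architecture.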

(gdb) set follow-exec-mode new
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.

You must switch the CPU architecture, otherwise tracing cannot continue:

(gdb) show architecture
The target architecture is set automatically (currently i386)
(gdb) set architecture i386:x86-64
The target architecture is assumed to be i386:x86-64
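
Whether this switch is needed depends on the ELF class of the program being exec'd. As a rough aid, the sketch below reads the ELF identification bytes of a file and reports 32 or 64. It assumes a Linux system that provides elf.h, the function name is hypothetical, and it is not something the original note used.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 32 or 64 according to the EI_CLASS byte
 * of the ELF file at path, or -1 if the file cannot be read or does
 * not start with the ELF magic.
 */
int elf_class ( const char *path )
{
    unsigned char ident[EI_NIDENT];
    FILE         *fp;
    size_t        n;

    fp = fopen( path, "rb" );
    if ( NULL == fp ) return( -1 );
    n = fread( ident, 1, sizeof( ident ), fp );
    fclose( fp );
    /* A valid ELF file begins with "\177ELF". */
    if ( n != sizeof( ident ) || 0 != memcmp( ident, ELFMAG, SELFMAG ) )
    {
        return( -1 );
    }
    if ( ELFCLASS32 == ident[EI_CLASS] ) return( 32 );
    if ( ELFCLASS64 == ident[EI_CLASS] ) return( 64 );
    return( -1 );
}
```

A result of 64 for the exec target, while "show architecture" still reports i386, would correspond to the situation above where "set architecture i386:x86-64" is required.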