2.24 如何用GDB调试子进程
https://scz.617.cn/unix/201206011217.txt
Q:
如何用GDB调试子进程?这个问题被问了十几年了,上个世纪就不断地被各种人在各 种论坛上问起。
gdb提供了一条命令,set follow-fork-mode child,本意是fork()之后跟踪进子进 程。但当年这个功能是系统相关的,在x86/Linux、x86/FreeBSD、SPARC/Solaris上 无效,据说有人在HP-UX上用成功过。
hhuu@SMTH提供过一个方案:
1)
在fork()处设置断点,修改fork()的返回值为0,迫使父进程去走子进程的流程。这 种做法并没有真正进入已经存在的子进程空间,但对付简单情形非常有效。
2)
如果能修改源代码,就很好办了,在fork()之后的子进程流程中增加阻塞类代码,比 如sleep()、getchar()等等,然后用gdb attach调试子进程。不过一般问此问题的人 都是无法修改源代码的,说了等于没说。
另有人就此问题写过一段内容:
gdb对调试fork()产生的子进程没有很多支持。一般fork()之后,gdb继续对父进程进 行调试,子进程将不受影响地运行。如果你在子进程流程中设了断点,当断点命中时 子进程将收到SIGTRAP信号,如果子进程没有对这个信号进行处理,缺省行为就是使 子进程终止。
多么悲催的历史,不知现状如何?
A: zz@nsfocus 2012-05
现在x86/Linux上的gdb已经可以正常使用"set follow-fork-mode child"了。
/ * gcc-3.3 -Wall -pipe -O3 -s -o follow-fork-mode-test follow-fork-mode-test.c * gcc-3.3 -Wall -pipe -O0 -ggdb -g -o follow-fork-mode-test follow-fork-mode-test.c /
include
include
int main ( int argc, char * argv[] ) { pid_t pid;
pid = fork();
if ( 0 == pid )
{
/*
* 子进程
*/
printf( "child = %u\n", getpid() );
execl( "./none", NULL );
}
else
{
/*
* 父进程
*/
printf( "parent = %u\n", getpid() );
}
return( 0 );
} / end of main /
$ uname -a Linux debian 3.4.0-scz-20120531 #2 SMP Thu May 31 15:47:42 CST 2012 i686 GNU/Linux
(gdb) show version GNU gdb (GDB) 7.0.1-debian
(gdb) help set follow-fork-mode Set debugger response to a program call of fork or vfork. A fork or vfork creates a new process. follow-fork-mode can be: parent - the original process is debugged after a fork child - the new process is debugged after a fork The unfollowed process will continue to run. By default, the debugger will follow the parent process.
(gdb) show follow-fork-mode Debugger response to a program call of fork or vfork is "parent".
注意,这里只支持child、parent,不支持传说中的ask。
调试父进程(这是缺省行为):
$ gdb ./follow-fork-mode-test (gdb) catch fork Catchpoint 1 (fork) (gdb) r Starting program: /tmp/follow-fork-mode-test
Catchpoint 1 (forked process 2075), 0xb7fe2424 in __kernel_vsyscall () (gdb) x/i $eip-2 0xb7fe2422 <__kernel_vsyscall+14>: int $0x80 (gdb) ni child = 2075 // 子进程已经独立执行过去了 0xb7fe2424 in __kernel_vsyscall () (gdb) i r eax eax 0x81b 2075 // fork()返回值大于0,现在是父进程 (gdb) i proc process 2072 cmdline = '/tmp/follow-fork-mode-test' cwd = '/tmp' exe = '/tmp/follow-fork-mode-test' (gdb) c Continuing. parent = 2072
Program exited normally. (gdb) q
调试子进程:
$ gdb ./follow-fork-mode-test (gdb) set follow-fork-mode child (gdb) catch fork Catchpoint 1 (fork) (gdb) r Starting program: /tmp/follow-fork-mode-test
Catchpoint 1 (forked process 2082), 0xb7fe2424 in __kernel_vsyscall () (gdb) x/i $eip-2 0xb7fe2422 <__kernel_vsyscall+14>: int $0x80 (gdb) ni parent = 2079 // 父进程已经独立执行过去了 [New process 2082] 0xb7fe2425 in __kernel_vsyscall () (gdb) i r eax eax 0x0 0 // fork()返回值等于0,现在是子进程 (gdb) i proc process 2082 cmdline = '/tmp/follow-fork-mode-test' cwd = '/tmp' exe = '/tmp/follow-fork-mode-test' (gdb) c Continuing. child = 2082
Program exited normally. (gdb) q
D: scz@nsfocus 2012-06-01
发现调试follow-fork-mode-test时"catch exec"无效,"catch syscall execve"则 可以断下来。
排查后发现"catch exec"并不等同于"catch syscall execve",后者更底层。如果想 拦截fork()之后那一次exec*(),应该用后者,此时还在原进程空间中,尚未切入新 进程空间。前者断下来的时候,已经切入新进程空间。假设execve()失败,用前者根 本断不下来。
/ * gcc-3.3 -Wall -pipe -O3 -s -o pwd-test pwd-test.c * gcc-3.3 -Wall -pipe -O0 -ggdb -g -o pwd-test pwd-test.c /
include
include
int main ( int argc, char * argv[] ) { pid_t pid;
pid = fork();
if ( 0 == pid )
{
/*
* 子进程
*/
execl( "/bin/pwd", "", NULL );
}
return( 0 );
} / end of main /
$ gdb ./pwd-test (gdb) set follow-fork-mode child (gdb) catch fork Catchpoint 1 (fork) (gdb) r Starting program: /tmp/pwd-test
Catchpoint 1 (forked process 2218), 0xb7fe2424 in __kernel_vsyscall () (gdb) catch syscall execve // 对比测试[0] Catchpoint 2 (syscall 'execve' [11]) (gdb) c Continuing. [New process 2218]
Catchpoint 2 (call to syscall 'execve'), 0xb7fe2424 in __kernel_vsyscall () (gdb) bt // 此时还是pwd-test进程
0 0xb7fe2424 in __kernel_vsyscall ()
1 0xb7f19a3f in execve () from /lib/i686/cmov/libc.so.6
2 0xb7fc4ff4 in ?? () from /lib/i686/cmov/libc.so.6
(gdb) x/i $eip-2 0xb7fe2422 <__kernel_vsyscall+14>: int $0x80 (gdb) c Continuing. Executing new program: /bin/pwd
Catchpoint 2 (returned from syscall 'execve'), 0xb7fe3850 in ?? () from /lib/ld-linux.so.2 (gdb) bt // 此时已经是pwd进程
0 0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) x/i $eip-2 0xb7fe384e: add %al,(%eax) (gdb) c Continuing. /tmp
Program exited normally. (gdb) q
$ gdb ./pwd-test (gdb) set follow-fork-mode child (gdb) catch fork Catchpoint 1 (fork) (gdb) r Starting program: /tmp/pwd-test
Catchpoint 1 (forked process 2233), 0xb7fe2424 in __kernel_vsyscall () (gdb) catch exec // 对比测试[1] Catchpoint 2 (exec) (gdb) c Continuing. [New process 2233] Executing new program: /bin/pwd
Catchpoint 2 (exec'd /bin/pwd), 0xb7fe3850 in ?? () from /lib/ld-linux.so.2 (gdb) bt // 此时已经是pwd进程
0 0xb7fe3850 in ?? () from /lib/ld-linux.so.2
(gdb) x/i $eip-2 0xb7fe384e: add %al,(%eax) (gdb) c Continuing. /tmp
Program exited normally. (gdb) q
A: scz 2016-02-26 11:35
一般说调试子进程,不只是对付fork(),还得对付exec():
set follow-fork-mode child set follow-exec-mode new
如果不支持后者,可用catch exec,断下时已切入新进程映像,前提是exec()成功。 此时离start()、main()还远。如果OS支持DTrace,可以在proc:::exec-success探点 stop()。假设fork()之后的exec*()失败,用"catch exec"根本断不下来。
(gdb) set follow-fork-mode child|parent
(gdb) set follow-exec-mode new|same
(gdb) set detach-on-fork on|off
(gdb) info inferiors
(gdb) inferior
detach-on-fork缺省为on,此时根据follow-fork-mode的设置进行调试,父子进程中
必有一个被detach。若设成off,父子进程都处在GDB的控制之下,follow-fork-mode
指定的进程可以正常调试,另一个则被挂起。可以用"info inferiors"查看父子进程,
用"inferior
在64-bits内核上,32-bits进程可以通过exec*()执行64-bits进程。用64-bits GDB可 以调试前者,此时针对前者的"set follow-exec-mode new"仍然有效,后者会因 SIGTRAP而中断,如果想继续调试后者,务必切换architecture。
(gdb) set follow-exec-mode new (gdb) c Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
务必切换CPU架构,否则无法继续跟踪:
(gdb) show architecture The target architecture is set automatically (currently i386) (gdb) set architecture i386:x86-64 The target architecture is assumed to be i386:x86-64