标题: 定位Python built-in函数的源码实现
https://scz.617.cn/python/201509211021.txt
Q:
我想查看os.system()的源码
import os os.system
os.file '/usr/lib/python2.7/os.pyc' import inspect inspect.getfile(os) '/usr/lib/python2.7/os.pyc' inspect.getmodule(os)
这些信息不是我想要的。
A: scz 2015-09-21 10:21
gdb -q --args /usr/bin/python2.7-dbg -c 'import os;os.system("echo $$;/bin/sleep 1")'
built-in函数名等于system时断下:
(gdb) b PyCFunction_Call if (0==strcmp(((PyCFunctionObject )func)->m_ml->ml_name,"system"))
Breakpoint 1 at 0x80bc2a1: file ../Objects/methodobject.c, line 73.
(gdb) r
...
Breakpoint 1, PyCFunction_Call (func=
PyCFunction_Call()用于调用built-in函数。
(gdb) output func
即将调用built-in函数system()。查看此时的模块名:
(gdb) output (PyCFunctionObject )func
{
_ob_next = 0xb7d30c68,
_ob_prev = 0xb7d30cd8,
ob_refcnt = 4,
ob_type = 0x82b9780
模块名不是想像中的os,而是posix,意味着os模块调用了posix模块。
针对PyCFunction_Call()设置条件断点时,不要轻易检查m_module,除非确知其名。
查看此时的函数信息:
(gdb) output (((PyCFunctionObject )func)->m_ml)
{
ml_name = 0x826e8c9 "system",
ml_meth = 0x81f5ea9
built-in函数system()对应的C实现是posix_system():
(gdb) list posix_system 2791 "system(command) -> exit_status\n\n\ 2792 Execute the command (a string) in a subshell."); 2793 2794 static PyObject * 2795 posix_system(PyObject self, PyObject args) 2796 { 2797 char *command; 2798 long sts; 2799 if (!PyArg_ParseTuple(args, "s:system", &command)) 2800 return NULL; (gdb) info source Current source file is ../Modules/posixmodule.c Compilation directory is /home/packages/python/2.7/d/python2.7-2.7.9/build-debug Located in /usr/src/python/python2.7-2.7.9/Modules/posixmodule.c Contains 9512 lines. Source language is c. Compiled with DWARF 2 debugging format. Does not include preprocessor macro info.
现在知道os.system()对应posixmodule.c中的posix_system()。我一般在 Source Insight里查看精确版本的源码实现,如果想快速了解大概,可以在线查看:
http://svn.python.org/projects/python/trunk/Modules/posixmodule.c http://hg.python.org/cpython/file/2.7/Modules/posixmodule.c
(gdb) tb *posix_system Temporary breakpoint 2 at 0x81f5ea9: file ../Modules/posixmodule.c, line 2796. (gdb) c Continuing.
Temporary breakpoint 2, posix_system (self=
0 posix_system (self=, args=) at ../Modules/posixmodule.c:2796
1 0x080bc300 in PyCFunction_Call (func=, arg=('echo $$;/bin/sleep 1',), kw=0x0) at ../Objects/methodobject.c:81
2 0x08149d0b in call_function (pp_stack=0xbffff804, oparg=1) at ../Python/ceval.c:4033
3 0x081454ec in PyEval_EvalFrameEx (f=Frame 0xb7c5c034, for file , line 1, in (), throwflag=0) at ../Python/ceval.c:2679
4 0x08147a77 in PyEval_EvalCodeEx ... at ../Python/ceval.c:3265
5 0x0813d035 in PyEval_EvalCode ... at ../Python/ceval.c:667
6 0x08172b65 in run_mod ... at ../Python/pythonrun.c:1371
7 0x08172a71 in PyRun_StringFlags ... at ../Python/pythonrun.c:1334
8 0x081716de in ... at ../Python/pythonrun.c:975
9 0x0818983d in Py_Main (argc=3, argv=0xbffffc84) at ../Modules/main.c:584
(More stack frames follow...)
不要用"until *posix_system",until有个很微妙的限制:
(gdb) help until Execute until the program reaches a source line greater than the current or a specified location (same args as break command) within the current frame.
注意这句,"within the current frame"。一般意义上until addr并不等价于 tb addr;c。而我们通常想要的是后者。这与windbg的g addr命令有显著不同。以前 被坑过,跑飞了还奇怪为什么,辛辛苦苦得到的中间状态就这么没了。
其实,如果确知os.system()会调用C函数system(),可以直接针对后者下断,然后查 看调用栈回溯:
(gdb) tb *system Temporary breakpoint 3 at 0xb7fa9be0: file pt-system.c, line 27. (gdb) c Continuing.
Temporary breakpoint 3, system (line=0xb7c448fc "echo $$;/bin/sleep 1") at pt-system.c:27 27 { (gdb) bt 5
0 system (line=0xb7c448fc "echo $$;/bin/sleep 1") at pt-system.c:27
1 0x081f5efc in posix_system (self=0x0, args=('echo $$;/bin/sleep 1',)) at ../Modules/posixmodule.c:2802
2 0x080bc300 in PyCFunction_Call (func=, arg=('echo $$;/bin/sleep 1',), kw=0x0) at ../Objects/methodobject.c:81
3 0x08149d0b in call_function (pp_stack=0xbffff804, oparg=1) at ../Python/ceval.c:4033
4 0x081454ec in PyEval_EvalFrameEx (f=Frame 0xb7c5c034, for file , line 1, in (), throwflag=0) at ../Python/ceval.c:2679
(More stack frames follow...)
从中一眼就看到posixmodule.c的posix_system()。
我前面演示的复杂方案是广谱方案。