16.30 Running an ELF with a specific dynamic linker and LIBC
https://scz.617.cn/unix/202104091427.txt
Q:
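An x64 environment has a 32-bit ELF, call it some. I copied some and its dependent libraries (including the dynamic linker) to another environment. The first goal is to get some running; the second is to debug some and its dependent libraries with gdb. This was a real need, not a contrived one.
In the original environment, "ldd some" confirms that some and its dependencies (dynamic linker included) are:
some
ld-2.12.2.so
libc-2.12.2.so
libcrypto.so.1.0.2
libdl-2.12.2.so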
A: scz 2021-04-09 14:27
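This is a fairly involved problem, but it does have a solution. To make the demonstration harder, some is moved to an x64 environment running a different distribution. If the kernel and GLIBC of the two environments differ drastically, it will probably not work at all; that case is not considered. All operations below are performed in the new environment.
A quick look at the new environment:
$ uname -a
Linux ... 3.10.0-862.14.4.el7.x86_64 #1 SMP ... x86_64 GNU/Linux
$ ldd $(which id)
linux-vdso.so.1 => (0x00007ffff7ffa000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007ffff7bb4000)
libc.so.6 => /lib64/libc.so.6 (0x00007ffff77e7000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007ffff7585000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007ffff7381000)
/lib64/ld-linux-x86-64.so.2 (0x00007ffff7ddb000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffff7165000)
Check some and its dependencies: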
$ chmod +x *
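$ file -b some
ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.9, not stripped
The "for GNU/Linux 2.4.9" tag is the minimum kernel version some was built for, while the new kernel is 3.10.0; a gap like this does not count as drastic.
$ ldd some
not a dynamic executable
ldd cannot handle some. Set by hand the environment variables that ldd itself relies on:
$ LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some
-bash: ./some: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory
The message is clear: the dynamic linker that some requests does not exist. See:
"Disaster recovery after the dynamic linker symlink is broken" https://scz.617.cn/unix/201809191202.txt
"Viewing and modifying the dynamic linker of an ELF" https://scz.617.cn/unix/201907041603.txt
patchelf can print and change the dynamic linker that some uses:
$ patchelf --print-interpreter some
/lib/ld-linux.so.2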
$ cp some some-new
$ patchelf --set-interpreter "./ld-2.12.2.so" some-new
$ patchelf --print-interpreter some-new
./ld-2.12.2.so
Why ld-2.12.2.so? Because that is the dynamic linker some ultimately used in the original environment.
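What "patchelf --print-interpreter" reports is the string stored in the PT_INTERP program header. The sketch below reads it straight from the file image; it is only an illustration of the ELF layout, not how patchelf is implemented, and the function name read_interp is made up for this example.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(data: bytes):
    """Return the PT_INTERP string of an ELF image, or None if there is none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is64 = data[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little endian
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 0x2A)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, off)
        if p_type != PT_INTERP:
            continue
        if is64:   # Elf64_Phdr: p_offset at +8, p_filesz at +32
            (p_offset,) = struct.unpack_from(end + "Q", data, off + 8)
            (p_filesz,) = struct.unpack_from(end + "Q", data, off + 32)
        else:      # Elf32_Phdr: p_offset at +4, p_filesz at +16
            (p_offset,) = struct.unpack_from(end + "I", data, off + 4)
            (p_filesz,) = struct.unpack_from(end + "I", data, off + 16)
        return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Calling read_interp(open("some", "rb").read()) should give the same string that patchelf printed for some.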
$ LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => not found
libc.so.6 => not found
Segmentation fault
Progress: the dependencies are now listed. Time for LD_LIBRARY_PATH.
$ LD_LIBRARY_PATH=. LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xf7dc8000)
libc.so.6 => not found
libdl.so.2 => not found
libc.so.6 => not found
Segmentation fault
Only libcrypto was taken from the current directory; libc and libdl are still not found. Why? Time for LD_DEBUG.
$ LD_DEBUG=libs LD_LIBRARY_PATH=. LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new
123059: find library=libcrypto.so.1.0.2 [0]; searching
123059: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (LD_LIBRARY_PATH)
123059: trying file=./tls/i686/sse2/libcrypto.so.1.0.2
...
123059: trying file=./libcrypto.so.1.0.2
123059:
123059: find library=libc.so.6 [0]; searching
123059: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (LD_LIBRARY_PATH)
123059: trying file=./tls/i686/sse2/libc.so.6
...
123059: trying file=./libc.so.6
...
123059: find library=libdl.so.2 [0]; searching
...
123059: trying file=./libdl.so.2
...
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xf7dc8000)
libc.so.6 => not found
libdl.so.2 => not found
libc.so.6 => not found
Segmentation fault
some-new is looking for
./libc.so.6
./libdl.so.2
but the current directory only contains
libc-2.12.2.so
libdl-2.12.2.so
so the lookup fails. Symlinks fix this:
$ ln -s libc-2.12.2.so libc.so.6
$ ln -s libdl-2.12.2.so libdl.so.2
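The "search path=" lines in the LD_DEBUG output follow a regular pattern: each directory is expanded with every combination of the subdirectories tls, i686 and sse2, most specific first. The sketch below reproduces the list seen in the log. It mirrors the observed output only and makes no claim about how glibc computes it; expand_search_dirs is a made-up name and the subdirectory set is taken from this one machine.

```python
def expand_search_dirs(base, subdirs=("tls", "i686", "sse2")):
    """Expand one search directory into the candidates seen in the LD_DEBUG log.

    Every subset of `subdirs` is tried, most specific first, ending with
    `base` itself.
    """
    n = len(subdirs)
    out = []
    for mask in range(2 ** n - 1, -1, -1):
        # Bit (n-1-i) of mask decides whether subdirs[i] is part of the path.
        parts = [s for i, s in enumerate(subdirs) if mask & (1 << (n - 1 - i))]
        out.append("/".join([base] + parts))
    return out


def candidate_files(base, soname):
    """Files the dynamic linker would try for `soname` under `base`."""
    return [d + "/" + soname for d in expand_search_dirs(base)]
```

The last entry of candidate_files(".", "libc.so.6") is ./libc.so.6, which is exactly the name the symlink has to provide.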
Apply the ldd substitute once more:
$ LD_LIBRARY_PATH=. LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xf7dc8000)
libc.so.6 => ./libc.so.6 (0xf7c66000)
libdl.so.2 => ./libdl.so.2 (0xf7c62000)
./ld-2.12.2.so (0xf7fe0000)
Suppose the new environment were x86 rather than x64 and the two symlinks above did not exist. You would likely see:
$ LD_LIBRARY_PATH=. LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new
linux-gate.so.1 => (0xb7f79000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xb7d5e000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7b5a000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xb7b54000)
./ld-2.12.2.so (0xb7f7b000)
Segmentation fault
Here some silently picks up the new environment's libc and libdl, whereas the original requirement is that some uses only what came from the original environment. It is an easy trap to fall into; LD_DEBUG shows what happened. Whether the new environment is x86 or x64, I strongly advise against plain ldd for this kind of problem; use the technique shown above.
Next, test debugging some-new with gdb:
$ gdb -q -nx ./some-new
(gdb) info files
...
Entry point: 0x804ea38
...
(gdb) x/15i 0x804ea38
0x804ea38 <_start>: xor %ebp,%ebp
0x804ea3a <_start+2>: pop %esi
0x804ea3b <_start+3>: mov %esp,%ecx
0x804ea3d <_start+5>: and $0xfffffff0,%esp
0x804ea40 <_start+8>: push %eax
0x804ea41 <_start+9>: push %esp
0x804ea42 <_start+10>: push %edx
0x804ea43 <_start+11>: push $0x80631c0
0x804ea48 <_start+16>: push $0x8063160
0x804ea4d <_start+21>: push %ecx
0x804ea4e <_start+22>: push %esi
0x804ea4f <_start+23>: push $0x804d200
0x804ea54 <_start+28>: call 0x804cef0 <__libc_start_main@plt>
0x804ea59 <_start+33>: hlt
0x804ea5a <_start+34>: nop
At first I used only LD_LIBRARY_PATH and symlinks to locate the ordinary .so files and left the dynamic linker of some unchanged. Debugging some then triggered SIGSEGV inside __libc_start_main(), because the dynamic linker and LIBC versions did not match.
(gdb) b *0x804d200
Breakpoint 1 at 0x804d200
This sets a breakpoint on main(): 0x804d200 is the last value pushed before the call to __libc_start_main, i.e. its first argument.
set environment LD_LIBRARY_PATH=.
set startup-with-shell off
The two settings above are required. Strictly speaking, the second one depends on the environment: it is required on some x86 systems and not on some x64 systems, but I will not go into the details.
display/5i $pc
set backtrace past-main on
set backtrace past-entry on
set pagination off
set disassembly-flavor intel
The settings above are optional.
(gdb) r
Breakpoint 1, 0x0804d200 in main ()
1: x/5i $pc
=> 0x804d200
#0  0x0804d201 in main ()
#1  0xf7c7eb67 in __libc_start_main () from ./libc.so.6
#2  0x0804ea59 in _start ()
(gdb) info proc mappings
...
Start Addr   End Addr       Size     Offset objfile
 0x8048000  0x8066000    0x1e000        0x0 /tmp/scz/some-new
...
0xf7c62000 0xf7c64000     0x2000        0x0 /tmp/scz/libdl-2.12.2.so
...
0xf7c66000 0xf7dc1000   0x15b000        0x0 /tmp/scz/libc-2.12.2.so
...
0xf7dc8000 0xf7fc0000   0x1f8000        0x0 /tmp/scz/libcrypto.so.1.0.2
...
0xf7fe0000 0xf7ffc000    0x1c000        0x0 /tmp/scz/ld-2.12.2.so
...
A: scz & bluerust 2021-04-09
$ cp some some-new-2
$ patchelf --set-interpreter "./ld-2.12.2.so" some-new-2
$ patchelf --set-rpath "." some-new-2
$ patchelf --print-rpath some-new-2
.
In theory this could make LD_LIBRARY_PATH unnecessary, but there are some subtle pitfalls.
By default "patchelf --set-rpath" sets DT_RUNPATH, not DT_RPATH; DT_RPATH is no longer recommended for ELF.
$ readelf -d some-new-2 | grep RPATH
(no output)
$ readelf -d some-new-2 | grep RUNPATH
0x0000001d (RUNPATH) Library runpath: [.]
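When several patched copies are lying around, it is easy to lose track of which tag each one carries. A minimal sketch, assuming the "readelf -d" line format shown above, that pulls out the DT_RPATH and DT_RUNPATH entries from captured readelf output; path_tags is a made-up helper name.

```python
import re

# Matches lines as printed by `readelf -d`, for example:
#   0x0000001d (RUNPATH)  Library runpath: [.]
_DYN_LINE = re.compile(r"^\s*0x[0-9a-fA-F]+\s+\((RPATH|RUNPATH)\)\s+.*\[(.*)\]\s*$")


def path_tags(readelf_output: str) -> dict:
    """Map 'RPATH' / 'RUNPATH' to the list of directories each one holds."""
    tags = {}
    for line in readelf_output.splitlines():
        m = _DYN_LINE.match(line)
        if m:
            # The bracketed value is a colon-separated directory list.
            tags[m.group(1)] = m.group(2).split(":")
    return tags
```

Fed with the output of "readelf -d some-new-2", this should report a RUNPATH entry and no RPATH entry.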
With DT_RUNPATH set, some-new-2 uses libcrypto and libc from the current directory, but not libdl.
$ LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new-2
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xf7dc8000)
libc.so.6 => ./libc.so.6 (0xf7c66000)
./ld-2.12.2.so (0xf7fe0000)
libdl.so.2 => not found
undefined symbol: dlclose, version GLIBC_2.0 (./libcrypto.so.1.0.2)
undefined symbol: dlerror, version GLIBC_2.0 (./libcrypto.so.1.0.2)
undefined symbol: dlsym, version GLIBC_2.0 (./libcrypto.so.1.0.2)
undefined symbol: dladdr, version GLIBC_2.0 (./libcrypto.so.1.0.2)
undefined symbol: dlopen, version GLIBC_2.1 (./libcrypto.so.1.0.2)
libdl was not found; use LD_DEBUG to see what happened:
$ LD_DEBUG=libs LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new-2
54009: find library=libcrypto.so.1.0.2 [0]; searching
54009: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (RUNPATH from file ./some-new-2)
...
54009: trying file=./libcrypto.so.1.0.2
54009:
54009: find library=libc.so.6 [0]; searching
54009: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (RUNPATH from file ./some-new-2)
...
54009: trying file=./libc.so.6
54009:
54009: find library=libdl.so.2 [0]; searching
54009: search cache=/etc/ld.so.cache
54009: search path=/lib/tls/i686/sse2:/lib/tls/i686:/lib/tls/sse2:/lib/tls:/lib/i686/sse2:/lib/i686:/lib/sse2:/lib:/usr/lib/tls/i686/sse2:/usr/lib/tls/i686:/usr/lib/tls/sse2:/usr/lib/tls:/usr/lib/i686/sse2:/usr/lib/i686:/usr/lib/sse2:/usr/lib (system search path)
...
54009:
linux-gate.so.1 => (0xf7fdf000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xf7dc8000)
libc.so.6 => ./libc.so.6 (0xf7c66000)
./ld-2.12.2.so (0xf7fe0000)
libdl.so.2 => not found
...
RUNPATH was used when looking for libcrypto and libc but not for libdl; tests on both x86 and x64 behave the same way.
$ cp some some-new-3
$ patchelf --set-interpreter "./ld-2.12.2.so" some-new-3
$ patchelf --force-rpath --set-rpath "." some-new-3
These commands set the discouraged DT_RPATH instead of the default DT_RUNPATH.
$ readelf -d some-new-3 | grep RPATH
0x0000000f (RPATH) Library rpath: [.]
With DT_RPATH set, some-new-3 uses libcrypto, libc and libdl from the current directory and no longer depends on LD_LIBRARY_PATH.
$ LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some-new-3
linux-gate.so.1 => (0xb7f4f000)
libcrypto.so.1.0.2 => ./libcrypto.so.1.0.2 (0xb7d34000)
libc.so.6 => ./libc.so.6 (0xb7bd2000)
libdl.so.2 => ./libdl.so.2 (0xb7bce000)
./ld-2.12.2.so (0xb7f51000)
The lookup of libdl honors RPATH and LD_LIBRARY_PATH but not RUNPATH. What is special about libdl?
Search priority decreases in the order RPATH, LD_LIBRARY_PATH, RUNPATH, but RPATH is officially discouraged.
To sum up, one workable way to make an ELF self-contained ("portable"):
patchelf --set-interpreter "./ld-*.so" some
patchelf --force-rpath --set-rpath "." some
Put some and its dependencies (dynamic linker included) in the same directory, then check with:
LD_TRACE_LOADED_OBJECTS=1 LD_WARN=yes LD_BIND_NOW=yes ./some
D: scz & houyunsong 2021-04-12
$ readelf -d some | grep NEEDED
0x00000001 (NEEDED) Shared library: [libcrypto.so.1.0.2]
0x00000001 (NEEDED) Shared library: [libc.so.6]
$ readelf -d libcrypto.so.1.0.2 | grep NEEDED
0x00000001 (NEEDED) Shared library: [libdl.so.2]
0x00000001 (NEEDED) Shared library: [libc.so.6]
$ readelf -d libdl | grep NEEDED libdl-2.12.2.so libdl.so.2
$ readelf -d libdl-2.12.2.so | grep NEEDED
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x00000001 (NEEDED) Shared library: [ld-linux.so.2]
$ readelf -d libc-2.12.2.so | grep NEEDED
0x00000001 (NEEDED) Shared library: [ld-linux.so.2]
The direct dependencies of some are only libcrypto and libc; libdl is one of libcrypto's dependencies.
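The NEEDED lists above form a small dependency graph. The sketch below walks it breadth-first and records which object requested each library, which makes the direct/indirect distinction explicit. The table is copied from the readelf output; all_needed is a made-up name, and the walk is only a model of what ldd-style tools print, not of the dynamic linker itself.

```python
from collections import deque

# DT_NEEDED lists copied from the readelf output above.
NEEDED = {
    "some": ["libcrypto.so.1.0.2", "libc.so.6"],
    "libcrypto.so.1.0.2": ["libdl.so.2", "libc.so.6"],
    "libdl.so.2": ["libc.so.6", "ld-linux.so.2"],
    "libc.so.6": ["ld-linux.so.2"],
}


def all_needed(root, needed=NEEDED):
    """Breadth-first walk of DT_NEEDED; returns (soname, requested_by) pairs."""
    seen, order = {root}, []
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                order.append((dep, obj))   # first requester wins
                queue.append(dep)
    return order
```

In the result, libdl.so.2 is paired with libcrypto.so.1.0.2 rather than with some, i.e. it is an indirect dependency.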
$ patchelf --print-needed some
libcrypto.so.1.0.2
libc.so.6
"patchelf --print-needed" shows only the direct dependencies of the target ELF, whereas ldd and its variants recursively list all of them.
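A reasonable guess: during the recursive walk, indirect dependencies do not honor RUNPATH but do honor RPATH and LD_LIBRARY_PATH, and the official discouragement of RPATH may be related to this. libdl is an indirect dependency of some.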
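This guess agrees with what ld.so(8) documents: DT_RUNPATH is consulted only for the DT_NEEDED entries of the object that carries it, while DT_RPATH applies to the whole dependency tree and is ignored when DT_RUNPATH is present. The sketch below is a deliberately simplified model of that order, not glibc's code: it ignores a library's own DT_RPATH, and the function name, parameters and the "<ld.so.cache>" placeholder are invented for illustration.

```python
def search_dirs(requester_is_exe, exe_rpath=(), exe_runpath=(),
                ld_library_path=(), lib_runpath=()):
    """Ordered directories tried for one DT_NEEDED entry (simplified model).

    requester_is_exe -- True if the executable itself lists the library
    exe_rpath        -- DT_RPATH of the executable
    exe_runpath      -- DT_RUNPATH of the executable
    lib_runpath      -- DT_RUNPATH of the requesting library, if any
    """
    # RUNPATH only counts for the object whose DT_NEEDED is being resolved.
    runpath = list(exe_runpath) if requester_is_exe else list(lib_runpath)
    dirs = []
    if not runpath and not exe_runpath:
        # RPATH is skipped whenever a RUNPATH is in play.
        dirs += list(exe_rpath)
    dirs += list(ld_library_path)
    dirs += runpath
    dirs += ["<ld.so.cache>", "/lib", "/usr/lib"]
    return dirs
```

With exe_runpath=["."] and requester_is_exe=False (the libdl case of some-new-2), "." never appears in the result; with exe_rpath=["."] (some-new-3) it comes first.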
D: scz 2021-04-12 14:55
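I thought that was the end of it, but Yunhai reported new trouble. In the original environment some runs as a daemon, but the patched some-new-3 exits on its own in the new environment, and "ps auwx | grep some" finds no process. This needs investigating.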
strace -v -i -f -ff -o some.log ./some-new-3
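strace usually produces a lot of output, so write it to files and keep the parent's and children's output separate. The parent's strace output mentions these files:
/var/log/some.log
/var/run/some.ctl
/var/run/some.pid
The tail of /var/log/some.log shows
Switching to daemon user
A FATAL Error has occured: missing 'daemon' id, exiting
Disassembling some-new-3 with IDA and following the cross-reference to "missing 'daemon' id, exiting" shows that getpwnam("daemon") fails, which makes the process exit on purpose.
Yunhai found an option that keeps some-new-3 from trying to daemonize, which works around the problem for now, but he wanted me to find out why getpwnam("daemon") fails and fix it.
Why does getpwnam("daemon") fail? The function merely looks up the user named daemon. It would indeed fail if no such user existed, but /etc/passwd in the new environment has a daemon user. I briefly suspected that the passwd file format differs between the two environments, but that format was settled long ago, so this is very unlikely. getpwnam() just reads passwd and fills in a structure; how complicated can it be, that it fails?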
gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' ./some-new-3
Parent and child processes are involved this time, so debugging the child needs special settings
set follow-fork-mode child
set follow-exec-mode new
catch fork
r
After the catchpoint hits
ni
to make sure the child is the one being debugged. Set a breakpoint at the call site of getpwnam("daemon"), single-step, and enter the libc code. These locations were reached along the way:
b *__nscd_get_map_ref
b *__nss_lookup
(gdb) bt
#0  0xf7d6e300 in __nscd_get_map_ref () from ./libc.so.6
#1  0xf7d6b5d7 in nscd_getpw_r () from ./libc.so.6
#2  0xf7d6b98d in __nscd_getpwnam_r () from ./libc.so.6
#3  0xf7d050d1 in getpwnam_r@@GLIBC_2.1.2 () from ./libc.so.6
#4  0xf7d04a8f in getpwnam () from ./libc.so.6
#5  0x0804dc0a in main ()
(gdb) bt
#0  0xf7d4bde0 in __nss_lookup () from ./libc.so.6
#1  0xf7d4ce5c in __nss_passwd_lookup2 () from ./libc.so.6
#2  0xf7d05135 in getpwnam_r@@GLIBC_2.1.2 () from ./libc.so.6
#3  0xf7d04a8f in getpwnam () from ./libc.so.6
#4  0x0804dc0a in main ()
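I did not expect getpwnam() to be this complex underneath. I downloaded the GLIBC source and browsed it with Source Insight.
https://ftp.gnu.org/gnu/libc/glibc-2.12.2.tar.bz2
https://ftp.gnu.org/gnu/libc/glibc-2.1.2.tar.gz
I could not find the body of getpwnam() at all, skimmed the related functions, and did not get far. __nscd_get_map_ref() looks as if it searches for an in-memory mapping of /etc/passwd; dynamic debugging shows that it did not find one.
Rereading getpwnam(3), it occurred to me to check /etc/nsswitch.conf in the new environment, in case the "files" entry was not configured. But nsswitch.conf contains: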
passwd: files sss
Rereading nsswitch.conf(5), I noticed:
/lib/libnss_files.so.X implements "files" source.
That made me realize that the current directory in the new environment has no libnss_files library, which getpwnam(3) needs in order to read passwd. Yet another deep pit. Confirm with:
$ LD_DEBUG=libs ./some-new-3
...
27081: transferring control: ./some-new-3
27081:
27081: find library=libnss_files.so.2 [0]; searching
27081: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (RPATH from file ./some-new-3)
...
27081: find library=libnss_dns.so.2 [0]; searching
27081: search path=./tls/i686/sse2:./tls/i686:./tls/sse2:./tls:./i686/sse2:./i686:./sse2:. (RPATH from file ./some-new-3)
...
27081: find library=libnss_myhostname.so.2 [0]; searching
...
Because libnss_files was not found, it went on to try libnss_dns and libnss_myhostname. Extract libnss_files-2.12.2.so from the original environment into the current directory of the new one and create a symlink:
ln -s libnss_files-2.12.2.so libnss_files.so.2
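Run again
./some-new-3
ps auwx | grep some
The daemonized some is now visible. The original problem is solved; a few more things are discussed below.
getpwnam("daemon") can fail for at least two reasons:
a) the daemon user does not exist; check /etc/passwd
b) the libnss_files library is not in place; check with LD_DEBUG=libs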
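Reason b) can be anticipated before running anything: each source listed for a database in /etc/nsswitch.conf corresponds to a libnss_<source> library that libc loads on demand. The sketch below derives those names from the file's text. The ".so.2" suffix is an assumption taken from the names in the LD_DEBUG output above (nsswitch.conf(5) only says ".so.X"), and nss_modules is a made-up helper.

```python
import re


def nss_modules(conf_text, database):
    """Return the libnss_*.so.2 names implied by one nsswitch.conf database."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()        # drop comments
        if not line.startswith(database + ":"):
            continue
        rest = line[len(database) + 1:]
        rest = re.sub(r"\[[^\]]*\]", " ", rest)    # drop [STATUS=action] items
        # Assumed naming scheme: libnss_<source>.so.2
        return ["libnss_%s.so.2" % src for src in rest.split()]
    return []
```

For the line "passwd: files sss" this yields libnss_files.so.2 and libnss_sss.so.2, so both would need to be present in the directory if some is to resolve users exactly as on the original machine.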
When does some-new-3 load the libnss_files library?
gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' ./some-new-3
catch load nss_files
(gdb) bt
#0  0xf7fef120 in _dl_debug_state () from ./ld-2.12.2.so
#1  0xf7ff283c in dl_open_worker () from ./ld-2.12.2.so
#2  0xf7fee756 in _dl_catch_error () from ./ld-2.12.2.so
#3  0xf7ff2366 in _dl_open () from ./ld-2.12.2.so
#4  0xf7d71992 in do_dlopen () from ./libc.so.6
#5  0xf7fee756 in _dl_catch_error () from ./ld-2.12.2.so
#6  0xf7d71a86 in dlerror_run () from ./libc.so.6
#7  0xf7d71afb in __libc_dlopen_mode () from ./libc.so.6
#8  0xf7d4bc85 in __nss_lookup_function () from ./libc.so.6
#9  0xf7d4bdff in __nss_lookup () from ./libc.so.6
#10 0xf7d4cc7c in __nss_hosts_lookup2 () from ./libc.so.6
#11 0xf7d51e46 in gethostbyname_r@@GLIBC_2.1.2 () from ./libc.so.6
#12 0xf7d51566 in gethostbyname () from ./libc.so.6
#13 0x0805e9c8 in ... ()
#14 0x080554e1 in ... ()
#15 0x0804d382 in main ()
So the parent process already loads libnss_files. "catch load" hits only when loading succeeds. To intercept every attempt to load a .so, including a library that does not exist, so that you can see where the attempt is made, use "b *do_dlopen".
Remove the symlink for a second experiment:
rm -f libnss_files.so.2
gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' ./some-new-3
b *_start
set follow-fork-mode child
set follow-exec-mode new
catch fork
r
After _start() is hit, add a breakpoint
b *do_dlopen
c
When it hits, look at the backtrace
(gdb) bt
#0  0xf7d71930 in do_dlopen () from ./libc.so.6
#1  0xf7fee756 in _dl_catch_error () from ./ld-2.12.2.so
#2  0xf7d71a86 in dlerror_run () from ./libc.so.6
#3  0xf7d71afb in __libc_dlopen_mode () from ./libc.so.6
#4  0xf7d4bc85 in __nss_lookup_function () from ./libc.so.6
#5  0xf7d4bdff in __nss_lookup () from ./libc.so.6
#6  0xf7d4cc7c in __nss_hosts_lookup2 () from ./libc.so.6
#7  0xf7d51e46 in gethostbyname_r@@GLIBC_2.1.2 () from ./libc.so.6
#8  0xf7d51566 in gethostbyname () from ./libc.so.6
#9  0x0805e9c8 in ... ()
#10 0x080554e1 in ... ()
#11 0x0804d382 in main ()
(gdb) x/s *(*(char***)($esp+4))
0xffffcfe0: "libnss_files.so.2"
In the parent, "b *do_dlopen" hits two more times, for libnss_dns and libnss_myhostname.
Keep going until "catch fork" hits
ni
c
In the child, "b *do_dlopen" hits again, this time for libnss_sss.
(gdb) bt
#0  0xf7d71930 in do_dlopen () from ./libc.so.6
#1  0xf7fee756 in _dl_catch_error () from ./ld-2.12.2.so
#2  0xf7d71a86 in dlerror_run () from ./libc.so.6
#3  0xf7d71afb in __libc_dlopen_mode () from ./libc.so.6
#4  0xf7d4bc85 in __nss_lookup_function () from ./libc.so.6
#5  0xf7d4be34 in __nss_lookup () from ./libc.so.6
#6  0xf7d4ce5c in __nss_passwd_lookup2 () from ./libc.so.6
#7  0xf7d05135 in getpwnam_r@@GLIBC_2.1.2 () from ./libc.so.6
#8  0xf7d04a8f in getpwnam () from ./libc.so.6
#9  0x0804dc0a in main ()
(gdb) x/s *(*(char***)($esp+4))
0xffffd140: "libnss_sss.so.2"
It appears that __nss_passwd_lookup2() is what handles /etc/passwd; this was not verified further.
some-new-3 never calls dlopen() explicitly; gethostbyname() and getpwnam() call do_dlopen() implicitly.
Do not use "b dlopen". libc may have no symbol named dlopen, in which case "b dlopen" may actually land on the "dlopen@plt" of some other library. That is not low-level enough and will probably miss what you want to catch.
(gdb) info symbol dlopen
dlopen@plt in section .plt of ./libcrypto.so.1.0.2
(gdb) info symbol do_dlopen
do_dlopen in section .text of ./libc.so.6
For more on this, see:
"Debugging techniques for analyzing unknown network services" https://scz.617.cn/unix/201812111322.txt
Restore the symlink for a third experiment:
ln -s libnss_files-2.12.2.so libnss_files.so.2
gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' ./some-new-3
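set follow-fork-mode child
set follow-exec-mode new
catch fork
r
After it hits
ni
b *do_dlopen
b *__nscd_get_map_ref
b *__nss_lookup
Because the parent has already loaded libnss_files successfully, "b *do_dlopen" does not hit in the child; the other two breakpoints still hit in turn. After c, Ctrl-C cannot interrupt the process, but "kill -INT" from another terminal works.
The parent's strace log does show the failed load of libnss_files, but that is hindsight: many failing system calls do not really affect functionality, and you are unlikely to know in advance which failure is the fatal one.
Suppose some-new-3 exits on its own and there is no /var/log/some.log to consult. Then the only option is "b *_exit" and inspecting the backtrace when it hits. This is the general-purpose approach.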
rm -f libnss_files.so.2
gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' ./some-new-3
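set follow-fork-mode child
set follow-exec-mode new
catch fork
r
After it hits
ni
b *_exit
c
(gdb) bt
#0  0xf7d06464 in _exit () from ./libc.so.6
#1  0xf7c95b9a in __run_exit_handlers () from ./libc.so.6
#2  0xf7c95bdf in exit () from ./libc.so.6
#3  0x080609ef in ... ()
#4  0x0804de88 in main ()
To wrap up: this case shows that when checking the dependencies of an ELF you should not rely only on ldd or its variants. Consider dynamic loading, especially implicit dynamic loading, for which "LD_DEBUG=libs" is more effective. However, "LD_DEBUG=libs" does not show the libraries a child process tries to load unless it is exported so that it applies to the child as well; "strace -f -ff" does show them.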
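LD_DEBUG=libs output gets long quickly, and the interesting part is usually just which libraries were requested and by which process. A minimal sketch, assuming the "PID: find library=NAME [0]; searching" line format shown in the transcripts above, that groups the requested names by PID; requested_libraries is a made-up name.

```python
import re

# Matches the request lines of LD_DEBUG=libs output, for example:
#   27081: find library=libnss_files.so.2 [0]; searching
_FIND = re.compile(r"^\s*(\d+):\s+find library=(\S+) \[\d+\]; searching")


def requested_libraries(ld_debug_text):
    """Return {pid: [soname, ...]} in the order the requests appear."""
    out = {}
    for line in ld_debug_text.splitlines():
        m = _FIND.match(line)
        if m:
            out.setdefault(int(m.group(1)), []).append(m.group(2))
    return out
```

Comparing the names returned here with the files actually present in the directory gives a quick list of what is still missing, including libraries that only show up through implicit dlopen.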
A: John Reiser 2004-06-29
Instead of patching out some-new or some-new-3, you can debug "dynamic linker + some" directly.
$ gdb -q -nx ./ld-2.12.2.so
(gdb) disas _dl_start_user
Dump of assembler code for function _dl_start_user:
0x000010b7 <+0>: mov %eax,%edi
0x000010b9 <+2>: call 0x10a0
0x000010be <+7>: add $0x1bf36,%ebx
0x000010c4 <+13>: mov -0x188(%ebx),%eax
0x000010ca <+19>: pop %edx
0x000010cb <+20>: lea (%esp,%eax,4),%esp
0x000010ce <+23>: sub %eax,%edx
0x000010d0 <+25>: push %edx
0x000010d1 <+26>: mov 0x2c(%ebx),%eax
0x000010d7 <+32>: lea 0x8(%esp,%edx,4),%esi
0x000010db <+36>: lea 0x4(%esp),%ecx
0x000010df <+40>: mov %esp,%ebp
0x000010e1 <+42>: and $0xfffffff0,%esp
0x000010e4 <+45>: push %eax
0x000010e5 <+46>: push %eax
0x000010e6 <+47>: push %ebp
0x000010e7 <+48>: push %esi
0x000010e8 <+49>: xor %ebp,%ebp
0x000010ea <+51>: call 0xe970 <_dl_init_internal>
0x000010ef <+56>: lea -0xe2d4(%ebx),%edx
0x000010f5 <+62>: mov (%esp),%esp
0x000010f8 <+65>: jmp *%edi
0x000010fa <+67>: lea 0x0(%esi),%esi
End of assembler dump.
display/5i $pc
set backtrace past-main on
set backtrace past-entry on
set pagination off
set disassembly-flavor intel
set startup-with-shell off
b *_dl_init_internal
r --library-path . ./some
"--library-path ." is equivalent to "set environment LD_LIBRARY_PATH=."
When stopped at _dl_init_internal(), check the memory map:
(gdb) info proc mappings
process 13445
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x8066000 0x1e000 0x0 /tmp/scz/some
0x8066000 0x8067000 0x1000 0x1e000 /tmp/scz/some
0x8067000 0x8068000 0x1000 0x1f000 /tmp/scz/some
...
Confirm the entry point (e_entry) of some:
(gdb) shell objdump -f some | grep start
start address 0x0804ea38
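The "start address" that objdump prints is the e_entry field of the ELF header, stored at file offset 0x18 for both 32-bit and 64-bit files. A minimal sketch of reading it without objdump follows; read_entry is a made-up name, and the code checks only the magic bytes, nothing else.

```python
import struct


def read_entry(data: bytes) -> int:
    """Return e_entry from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    end = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little endian
    fmt = "Q" if data[4] == 2 else "I"     # EI_CLASS: 2 = 64-bit, else 32-bit
    # e_ident (16) + e_type (2) + e_machine (2) + e_version (4) = 0x18
    return struct.unpack_from(end + fmt, data, 0x18)[0]
```

hex(read_entry(open("some", "rb").read())) should agree with the objdump output above.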
Locate main():
(gdb) x/15i 0x0804ea38
0x804ea38: xor ebp,ebp
0x804ea3a: pop esi
0x804ea3b: mov ecx,esp
0x804ea3d: and esp,0xfffffff0
0x804ea40: push eax
0x804ea41: push esp
0x804ea42: push edx
0x804ea43: push 0x80631c0
0x804ea48: push 0x8063160
0x804ea4d: push ecx
0x804ea4e: push esi
0x804ea4f: push 0x804d200
0x804ea54: call 0x804cef0
0x804ea59: hlt
0x804ea5a: nop
(gdb) b *0x804d200
Breakpoint 2 at 0x804d200
(gdb) c
Continuing.
Breakpoint 2, 0x0804d200 in ?? ()
1: x/5i $pc
=> 0x804d200: push ebp
0x804d201: mov ebp,esp
0x804d203: push edi
0x804d204: push esi
0x804d205: push ebx
We are now inside some, but without symbols; they can be loaded by hand. See:
"2.49 Loading debug information in GDB"
Determine the .text and .data base addresses of some:
(gdb) shell objdump -h some | grep -F .text
11 .text 00015ffc 0804d200 0804d200 00005200 2**4
(gdb) shell objdump -h some | grep -F .data
21 .data 000000ac 08067000 08067000 0001f000 2**5
Load the symbols by hand:
(gdb) add-symbol-file some 0x804d200 -s .data 0x8067000
add symbol table from file "some" at
.text_addr = 0x804d200
.data_addr = 0x8067000
(gdb) x/5i $pc
=> 0x804d200
#0  0x0804d200 in main ()
#1  0xf7c7eb67 in __libc_start_main () from ./libc.so.6
#2  0x0804ea59 in _start ()
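The add-symbol-file arguments above were read off the objdump output by eye. A minimal sketch that derives the same command from captured "objdump -h" text follows. It assumes a non-PIE executable like some, whose sections are loaded at their link-time VMA, and both function names are made up.

```python
def section_vma(objdump_h_text, name):
    """Return the VMA of section `name` from `objdump -h` output, or None."""
    for line in objdump_h_text.splitlines():
        f = line.split()
        # Section rows look like: Idx Name Size VMA LMA File-off Algn
        if len(f) >= 6 and f[0].isdigit() and f[1] == name:
            return int(f[3], 16)
    return None


def add_symbol_file_cmd(path, objdump_h_text, extra=(".data",)):
    """Build a gdb add-symbol-file command from section addresses."""
    text = section_vma(objdump_h_text, ".text")
    if text is None:
        raise ValueError("no .text section found")
    cmd = "add-symbol-file %s %#x" % (path, text)
    for sec in extra:
        vma = section_vma(objdump_h_text, sec)
        if vma is not None:
            cmd += " -s %s %#x" % (sec, vma)
    return cmd
```

The returned string can be pasted at the (gdb) prompt once execution has reached code inside some.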
A: scz 2021-04-10 14:36
John Reiser's approach is good and can be streamlined.
$ gdb -q -nx ./ld-2.12.2.so
display/5i $pc
set backtrace past-main on
set backtrace past-entry on
set pagination off
set disassembly-flavor intel
set startup-with-shell off
b *_dl_start_user
r --library-path . ./some
Stopped at _dl_start_user()
(gdb) disas _dl_start_user
Dump of assembler code for function _dl_start_user:
=> 0xf7fe10b7 <+0>: mov edi,eax
0xf7fe10b9 <+2>: call 0xf7fe10a0
0xf7fe10be <+7>: add ebx,0x1bf36
0xf7fe10c4 <+13>: mov eax,DWORD PTR [ebx-0x188]
0xf7fe10ca <+19>: pop edx
0xf7fe10cb <+20>: lea esp,[esp+eax*4]
0xf7fe10ce <+23>: sub edx,eax
0xf7fe10d0 <+25>: push edx
0xf7fe10d1 <+26>: mov eax,DWORD PTR [ebx+0x2c]
0xf7fe10d7 <+32>: lea esi,[esp+edx*4+0x8]
0xf7fe10db <+36>: lea ecx,[esp+0x4]
0xf7fe10df <+40>: mov ebp,esp
0xf7fe10e1 <+42>: and esp,0xfffffff0
0xf7fe10e4 <+45>: push eax
0xf7fe10e5 <+46>: push eax
0xf7fe10e6 <+47>: push ebp
0xf7fe10e7 <+48>: push esi
0xf7fe10e8 <+49>: xor ebp,ebp
0xf7fe10ea <+51>: call 0xf7fee970 <_dl_init_internal>
0xf7fe10ef <+56>: lea edx,[ebx-0xe2d4]
0xf7fe10f5 <+62>: mov esp,DWORD PTR [esp]
0xf7fe10f8 <+65>: jmp edi
0xf7fe10fa <+67>: lea esi,[esi+0x0]
End of assembler dump.
The final "jmp edi" jumps to the e_entry of some.
b *_dl_start_user+65
c
x/15i $edi
See:
"24.27 Functions executed before main()" https://scz.617.cn/unix/201507251602.txt
"Unix series (4): Unix anti-debugging techniques" (Unix_1.txt), before _start()
The flow is roughly:
ld-linux.so e_entry
    _dl_start                   // returns "normal e_entry"
        _dl_start_final
            _dl_sysdep_start
                dl_main         // handles LD_PRELOAD, loads fake.so; the related symbol resolution is already hijacked
    _dl_init
        call_init               // calls the .init_array[] of fake.so
    normal e_entry