Title: "mov edx,0x01234567" clears the upper 32 bits of rdx

Created: 2017-08-14 17:05
Link: https://scz.617.cn/misc/201708141705.txt

This started when I wanted to hand-test the following code fragment:

mov edx,0x01234567
neg rdx
shl rdx,2

.dvalloc 0x1000
Allocated 1000 bytes starting at 00000000`00060000
r $t0=000060000
!vprot @$t0
BaseAddress: 0000000000060000
AllocationBase: 0000000000060000
AllocationProtect: 00000040 PAGE_EXECUTE_READWRITE
RegionSize: 0000000000001000
State: 00001000 MEM_COMMIT
Protect: 00000040 PAGE_EXECUTE_READWRITE
Type: 00020000 MEM_PRIVATE
eb @$t0 ba 67 45 23 01 48 f7 da 48 c1 e2 02
u @$t0 l 3
00000000`00060000 ba67452301 mov edx,1234567h
00000000`00060005 48f7da neg rdx
00000000`00060008 48c1e202 shl rdx,2
r rip=@$t0
r rdx=0
p 3
r rdx
rdx=fffffffffb72ea64

r rip=@$t0
r rdx=0xffffffff00000000
p 3
r rdx
rdx=fffffffffb72ea64
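As a cross-check on the bytes passed to "eb" in the session above, the following Python sketch rebuilds them from the three instructions. The opcode notes in the comments (B8+rd for mov r32,imm32, REX.W F7 /3 for neg, REX.W C1 /4 ib for shl) follow the standard x86-64 encoding. The variable name "code" is only illustrative.

```python
import struct

code = (
    # mov edx,0x01234567: opcode B8+rd with rd=2 (edx), then a little-endian imm32.
    # There is no REX.W prefix here, so the operand size is 32 bits.
    b"\xba" + struct.pack("<I", 0x01234567)
    # neg rdx: REX.W (48), opcode F7 /3, ModRM DA = mod 11, reg 011, rm 010 (rdx)
    + b"\x48\xf7\xda"
    # shl rdx,2: REX.W (48), opcode C1 /4 ib, ModRM E2 = mod 11, reg 100, rm 010, imm8 = 02
    + b"\x48\xc1\xe2\x02"
)

print(" ".join(f"{b:02x}" for b in code))
# prints: ba 67 45 23 01 48 f7 da 48 c1 e2 02
```

The first instruction is the only one without the 48 prefix, which is why it is the only one that operates on the 32-bit register.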

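To my surprise, the final result is 0xfffffffffb72ea64 no matter what rdx holds initially. Single-stepping showed that "mov edx,0x01234567" clears the upper 32 bits of rdx, whereas "mov dl,1" leaves the upper 32+24 bits of rdx untouched. This overturned years of 32-bit assembly experience, and for a while I suspected I had made some elementary mistake.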
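The arithmetic can be replayed with a small Python sketch. It is only a model of the three instructions, not a substitute for the debugger. The function name run_snippet and the zero_extend switch are made up for illustration, and the zero_extend=False branch models the behaviour I had wrongly expected, not anything a real CPU does in 64-bit mode.

```python
MASK64 = (1 << 64) - 1

def run_snippet(rdx, zero_extend=True):
    """Model mov edx,0x01234567 / neg rdx / shl rdx,2 on a 64-bit rdx value."""
    if zero_extend:
        # Actual 64-bit mode behaviour: the 32-bit write clears bits 63..32.
        rdx = 0x01234567
    else:
        # Hypothetical behaviour (what I expected): upper 32 bits preserved.
        rdx = (rdx & 0xFFFFFFFF00000000) | 0x01234567
    rdx = (-rdx) & MASK64        # neg rdx, truncated to 64 bits
    rdx = (rdx << 2) & MASK64    # shl rdx,2, truncated to 64 bits
    return rdx

for initial in (0x0, 0xFFFFFFFF00000000):
    print(f"{initial:016x} -> {run_snippet(initial):016x}")
# prints:
# 0000000000000000 -> fffffffffb72ea64
# ffffffff00000000 -> fffffffffb72ea64

print(f"{run_snippet(0xFFFFFFFF00000000, zero_extend=False):016x}")
# prints: 00000003fb72ea64
```

Only the zero-extending model reproduces what WinDbg showed for both initial values. The preserving model would have given 0x00000003fb72ea64 for the second run.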

I asked hume, who pointed out:

"Intel 64 and IA-32 Architectures Software Developer Manual: Vol 1"
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf

The section "3.4.1.1 General-Purpose Registers in 64-Bit Mode" contains the following:


When in 64-bit mode, operand size determines the number of valid bits in the destination general-purpose register:

64-bit operands generate a 64-bit result in the destination general-purpose register.

32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.

8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits.
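The three rules quoted above can be condensed into one helper. This is a minimal Python sketch of the destination-register update only. The name write_gpr is hypothetical, and the legacy high-byte registers (ah, bh, ch, dh) are deliberately not modeled.

```python
MASK64 = (1 << 64) - 1

def write_gpr(old, value, bits):
    """Return the new 64-bit register value after writing `value`
    with the given operand size in 64-bit mode."""
    if bits == 64:
        return value & MASK64
    if bits == 32:
        # 32-bit result, zero-extended to 64 bits: old upper half is discarded.
        return value & 0xFFFFFFFF
    if bits in (8, 16):
        # Low 8/16 bits replaced; upper 56/48 bits are not modified.
        low = (1 << bits) - 1
        return (old & ~low & MASK64) | (value & low)
    raise ValueError("operand size must be 8, 16, 32 or 64")

old = 0xFFFFFFFF00000000
print(f"{write_gpr(old, 0x01234567, 32):016x}")  # mov edx,0x01234567 -> 0000000001234567
print(f"{write_gpr(old, 0x4567, 16):016x}")      # mov dx,0x4567      -> ffffffff00004567
print(f"{write_gpr(old, 0x01, 8):016x}")         # mov dl,1           -> ffffffff00000001
```

The 8-bit case matches the "mov dl,1" observation: the upper 56 bits of rdx survive, while the 32-bit write discards whatever was there before.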


In addition, bluerust found:

Why do most x64 instructions zero the upper part of a 32 bit register
https://stackoverflow.com/questions/11177137/why-do-most-x64-instructions-zero-the-upper-part-of-a-32-bit-register

An excerpt from its top answer:


I'm not AMD or speaking for them, but I would have done it the same way. Because zeroing the high half doesn't create a dependency on the previous value, that the cpu would have to wait on. The register renaming mechanism would essentially be defeated if it wasn't done that way. This way you can write fast 32bit code in 64bit mode without having to explicitly break dependencies all the time. Without this behaviour, every single 32bit instruction in 64bit mode would have to wait on something that happened before, even though that high part would almost never be used.

The behaviour for 16bit instructions is the strange one. The dependency madness is one of the reasons that 16bit instructions are avoided now.


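Well, I am completely unfamiliar with 64-bit assembly and had never tinkered with it, so today's pitfall was a big one for me. I was just reading a piece of IDA disassembly, and the more I read, the less self-consistent the logic seemed. It never occurred to me that a trap like this existed.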