[llvm-dev] BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()

Stefan Kanthak via llvm-dev llvm-dev at lists.llvm.org
Sun Nov 25 11:38:33 PST 2018


"Craig Topper" <craig.topper at gmail.com> wrote:

> bswapdi2 for i386 is correct

OUCH!

> Bits 31:0 of the source are loaded into edx. Bits 63:32 are loaded into
> eax. Those are each bswapped.

This exchanges the high byte of each 32-bit PART with its low byte, but
NOT the high byte of the whole 64-bit operand with its low byte!

Please get a clue!

> The ABI for the return is edx contains bits [63:32] and eax contains
> [31:0]. This is the opposite of how the registers were loaded.

My post is NOT about swapping EDX with EAX, but the bytes WITHIN both.

With the 64-bit argument loaded into EDX:EAX, the instruction sequence

    bswap  edx
    bswap  eax
    xchg   eax, edx

is NOT equivalent to

    bswap    rdi

with the 64-bit argument loaded into RDI.

Just run the following code on x86-64:

    mov    rdi, 0123456789abcdefh    ; pass (fake) argument in RDI
; split argument into high and low part
    mov    rdx, rdi
    shr    rdx, 32                   ; high part in EDX
    mov    eax, edi                  ; low part in EAX
; perform __bswapdi2() as in 32-bit mode
    xchg   eax, edx                  ; swap parts, argument now loaded
                                     ;  like in 32-bit mode
    bswap  edx
    bswap  eax                       ; result like that in 32-bit mode
; load result into 64-bit register
    shl    rdx, 32
    or     rax, rdx
; perform __bswapdi2() in native 64-bit mode
    bswap  rdi
; compare results
    xor    rax, rdi
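
For anyone without an assembler at hand, the listing above can be modeled
in portable C. This is only a rough sketch: the names bswap32(), bswap64()
and bswap64_via_halves() are made up for this illustration and are not part
of compiler-rt. bswap64() reverses the bytes one at a time on purpose, so
that it does not depend on the 32-bit helper it is compared against.

```c
#include <stdint.h>

/* Models the 32-bit BSWAP instruction: reverse the four bytes of x. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}

/* Models BSWAP on a 64-bit register: reverse all eight bytes of x,
   one byte per iteration, independently of bswap32(). */
static uint64_t bswap64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xffu);  /* lowest remaining byte moves up */
        x >>= 8;
    }
    return r;
}

/* Models the listing step by step (hypothetical helper):
   split, exchange the halves, BSWAP each half, recombine. */
static uint64_t bswap64_via_halves(uint64_t x)
{
    uint32_t edx = (uint32_t)(x >> 32);  /* high part in EDX */
    uint32_t eax = (uint32_t)x;          /* low part in EAX */
    uint32_t tmp = eax;                  /* xchg eax, edx */
    eax = edx;
    edx = tmp;
    edx = bswap32(edx);                  /* bswap edx */
    eax = bswap32(eax);                  /* bswap eax */
    return ((uint64_t)edx << 32) | eax;  /* shl rdx, 32 ; or rax, rdx */
}
```

Evaluating bswap64_via_halves(arg) ^ bswap64(arg) for
arg = 0x0123456789abcdef corresponds to the final XOR of the listing:
zero means the two sequences agree for that input, any other value
means they differ.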

not amused
Stefan Kanthak

> On Sun, Nov 25, 2018 at 10:36 AM Craig Topper <craig.topper at gmail.com>
> wrote:
> 
>> bswapsi2 on the x86-64 isn't using the bswap instruction because "unsigned
>> long" is 64 bits on x86-64 linux. But it's 32 bits on x86-64 msvc.
>>
>> Not sure about the bswapdi2 i386 case.
>>
>>
>> ~Craig
>>
>>
>> On Sun, Nov 25, 2018 at 8:03 AM Stefan Kanthak via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hi @ll,
>>>
>>> targeting i386, LLVM/clang generates wrong code for the following
>>> functions:
>>>
>>> unsigned long __bswapsi2 (unsigned long ul)
>>> {
>>>     return (((ul) & 0xff000000ul) >> 3 * 8)
>>>          | (((ul) & 0x00ff0000ul) >>     8)
>>>          | (((ul) & 0x0000ff00ul) <<     8)
>>>          | (((ul) & 0x000000fful) << 3 * 8);
>>> }
>>>
>>> unsigned long long __bswapdi2(unsigned long long ull)
>>> {
>>>     return ((ull & 0xff00000000000000ull) >> 7 * 8)
>>>          | ((ull & 0x00ff000000000000ull) >> 5 * 8)
>>>          | ((ull & 0x0000ff0000000000ull) >> 3 * 8)
>>>          | ((ull & 0x000000ff00000000ull) >>     8)
>>>          | ((ull & 0x00000000ff000000ull) <<     8)
>>>          | ((ull & 0x0000000000ff0000ull) << 3 * 8)
>>>          | ((ull & 0x000000000000ff00ull) << 5 * 8)
>>>          | ((ull & 0x00000000000000ffull) << 7 * 8);
>>> }
>>>
>>> You can find these sources in "compiler-rt/lib/builtins/bswapsi2.c"
>>> and "compiler-rt/lib/builtins/bswapdi2.c", for example!
>>>
>>>
>>> Compiled with "-O3 -target i386" this yields the following code
>>> (see <https://godbolt.org/z/F4UIl4>):
>>>
>>> __bswapsi2: # @__bswapsi2
>>>     push  ebp
>>>     mov   ebp, esp
>>>     mov   eax, dword ptr [ebp + 8]
>>>     bswap eax
>>>     pop   ebp
>>>     ret
>>>
>>> __bswapdi2: # @__bswapdi2
>>>     push  ebp
>>>     mov   ebp, esp
>>>     mov   edx, dword ptr [ebp + 8]
>>>     mov   eax, dword ptr [ebp + 12]
>>>     bswap eax
>>>     bswap edx
>>>     pop   ebp
>>>     ret
>>>
>>> __bswapsi2() is correct, but __bswapdi2() is NOT: swapping just the
>>> halves of a "long long" is OBVIOUSLY WRONG!
>>>
>>> From the C source, the expected result for the input value
>>> 0x0123456789ABCDEF is 0xEFCDAB8967452301, but the compiled code
>>> produces 0x67452301EFCDAB89.
>>>
>>>
>>> And compiled for x86-64 this yields the following code (see
>>> <https://godbolt.org/z/uM9nvN>):
>>>
>>> __bswapsi2: # @__bswapsi2
>>>     mov   eax, edi
>>>     shr   eax, 24
>>>     mov   rcx, rdi
>>>     shr   rcx, 8
>>>     and   ecx, 65280
>>>     or    rax, rcx
>>>     mov   rcx, rdi
>>>     shl   rcx, 8
>>>     and   ecx, 16711680
>>>     or    rax, rcx
>>>     and   rdi, 255
>>>     shl   rdi, 24
>>>     or    rax, rdi
>>>     ret
>>>
>>> __bswapdi2: # @__bswapdi2
>>>     bswap rdi
>>>     mov   rax, rdi
>>>     ret
>>>
>>> Both are correct, but __bswapsi2() should of course use BSWAP too!
>>>
>>>
>>> Stefan Kanthak
>>>
>>> PS: for comparison with another compiler, take a look at
>>>     <https://skanthak.homepage.t-online.de/msvc.html#example5>
>>>
>>