[cfe-dev] "Optimized implementations"?
Stefan Kanthak via cfe-dev
cfe-dev at lists.llvm.org
Sun Sep 6 06:08:58 PDT 2020
<https://compiler-rt.llvm.org/index.html> boasts:
| The builtins library provides optimized implementations of this
| and other low-level routines, either in target-independent C form,
| or as a heavily-optimized assembly.
Really?
Left: poorly performing code shipped in  # Right: proper code, just one or
      clang_rt.builtins-*                #        two bits faster and shorter
___paritysi2:
    mov     eax, [esp+4]        # mov     ax, [esp+4]
    mov     ecx, eax            #
    shr     ecx, 16             #
    xor     ecx, eax            # xor     ax, [esp+6]
    mov     eax, ecx            #
    shr     eax, 8              #
    xor     eax, ecx            # xor     al, ah
    mov     ecx, eax            #
    shr     ecx, 4              #
    xor     ecx, eax            #
    mov     eax, 0x6996         #
    and     cl, 15              #
    shr     eax, cl             # setnp   al
    and     eax, 1              # movzx   eax, al
    ret                         # ret
___paritydi2:
    mov     eax, [esp+8]        # mov     ax, [esp+4]
    xor     eax, [esp+4]        # xor     ax, [esp+6]
    push    eax                 # xor     ax, [esp+8]
    call    ___paritysi2        # xor     ax, [esp+10]
    add     esp, 4              # xor     al, ah
                                # setnp   al
                                # movzx   eax, al
    ret                         # ret
The proper code needs 14 instead of 21 instructions in 48 instead of 57
bytes for both functions together, more than halving the instructions
executed per function call!
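
For readers who prefer C to assembly, the left column can be read roughly
as the sketch below: fold the value down to 4 bits with shifts and XORs,
then look the remaining nibble up in the 16-bit table 0x6996. This is only
an illustration reconstructed from the listing above, not the actual
compiler-rt source. The names parity32, parity64, parity_naive and
parity_selfcheck are hypothetical, chosen so they do not clash with the
real ___paritysi2 and ___paritydi2 symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of a 32-bit value: 1 if an odd number of bits is set, else 0.
   Follows the left column: fold 32 -> 16 -> 8 -> 4 bits, then use the
   low nibble as an index into the bit table 0x6996 (bit n of 0x6996 is
   the parity of n, for n = 0..15). */
int parity32(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    return (0x6996 >> (x & 15)) & 1;
}

/* 64-bit variant: XOR the two halves, then reuse the 32-bit routine,
   as the left-hand ___paritydi2 does. */
int parity64(uint64_t x)
{
    return parity32((uint32_t)(x >> 32) ^ (uint32_t)x);
}

/* Reference implementation for checking: XOR the bits one at a time. */
int parity_naive(uint64_t x)
{
    int p = 0;
    while (x != 0) {
        p ^= (int)(x & 1);
        x >>= 1;
    }
    return p;
}

/* Compare the folding routines against the reference on a spread of
   64-bit patterns (multiples of an arbitrary odd constant). */
void parity_selfcheck(void)
{
    uint64_t v = 0;
    for (int i = 0; i < 1000; ++i) {
        assert(parity64(v) == parity_naive(v));
        assert(parity32((uint32_t)v) == parity_naive((uint32_t)v));
        v += 0x9E3779B97F4A7C15ULL;
    }
}
```

The right column gets the same result without the table, because on x86
the parity flag already reflects the low 8 bits of the last result, so
only the fold down to one byte has to be done by hand.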
AGAIN:
Remove every occurrence of the word "optimized" on the above web page.
'nuff said
Stefan