[llvm-bugs] [Bug 51222] New: Clobbered XMM registers are not preserved around Intel-style inline assembly blocks in MS-ABI functions
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Jul 26 18:23:42 PDT 2021
https://bugs.llvm.org/show_bug.cgi?id=51222
Bug ID: 51222
Summary: Clobbered XMM registers are not preserved around
Intel-style inline assembly blocks in MS-ABI functions
Product: new-bugs
Version: 12.0
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: skoulik at gmail.com
CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org
The issue was first observed with clang 10.0 bundled with MS Visual Studio 2019
on Windows, and was later confirmed with clang 7.0.1 on Linux (CentOS 7.7) and
with clang 12.0 bundled with Xcode 12.2 on Mac OS.
Here is the minimal reproducible example:
void test(void)
{
    __asm
    {
        VPXOR YMM6, YMM6, YMM6
    }
}
When compiled on Windows with
clang-cl /O2 /FA -c test.cpp
it produces the following assembly (meta-information skipped for clarity):
#APP
vpxor ymm6, ymm6, ymm6
#NO_APP
ret
As you can see, XMM6 is not preserved even though it is clobbered by the vpxor
instruction and is a callee-saved (nonvolatile) register under the Microsoft
x64 calling convention.
However, if I pass the -mavx2 flag to the compiler,
clang-cl /O2 -mavx2 /FA -c test.cpp
the produced assembly turns into
sub rsp, 24
vmovaps xmmword ptr [rsp], xmm6 # 16-byte Spill
#APP
vpxor ymm6, ymm6, ymm6
#NO_APP
vmovaps xmm6, xmmword ptr [rsp] # 16-byte Reload
add rsp, 24
vzeroupper
ret
XMM6 is now preserved.
The same issue is present on Linux and Mac OS. However, ms_abi must be
specified explicitly there:
void __attribute__((ms_abi)) test(void)
{
    __asm
    {
        VPXOR YMM6, YMM6, YMM6
    }
}
Compiling on Linux with
clang -O2 -fasm-blocks -S test.cpp
produces
#APP
vpxor %ymm6, %ymm6, %ymm6
#NO_APP
retq
Compiling with
clang -O2 -mavx2 -fasm-blocks -S test.cpp
produces
subq $24, %rsp
vmovaps %xmm6, (%rsp) # 16-byte Spill
#APP
vpxor %ymm6, %ymm6, %ymm6
#NO_APP
vmovaps (%rsp), %xmm6 # 16-byte Reload
addq $24, %rsp
vzeroupper
retq
Compiling on Mac OS with
/System/Volumes/Data/Applications/Xcode_12.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
-O2 -fasm-blocks -S test.cpp
produces
pushq %rbp
movq %rsp, %rbp
## InlineAsm Start
vpxor %ymm6, %ymm6, %ymm6
## InlineAsm End
popq %rbp
retq
Compiling with
/System/Volumes/Data/Applications/Xcode_12.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
-O2 -mavx2 -fasm-blocks -S test.cpp
produces
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
vmovaps %xmm6, -16(%rbp) ## 16-byte Spill
## InlineAsm Start
vpxor %ymm6, %ymm6, %ymm6
## InlineAsm End
vmovaps -16(%rbp), %xmm6 ## 16-byte Reload
addq $16, %rsp
popq %rbp
vzeroupper
retq
Additional comments and observations:
- The issue only happens with Intel-style assembly blocks. Using gcc-style
inline assembly and explicitly mentioning the registers in the clobber list
produces the correct code.
- The real-world code, of course, is much more involved and contains
cpuid-based branches for avx2 and non-avx2 platforms. That means that we must
compile without the -mavx2 switch to support both.
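For reference, here is a minimal sketch of the GCC-style variant mentioned in
the first observation, assuming an x86-64 target and a GCC-compatible compiler.
The names test_gcc_style and run_if_avx2 are hypothetical and do not come from
the real code. The runtime check uses the GCC/Clang builtin
__builtin_cpu_supports as a stand-in for the cpuid-based branches, which are
not shown in this report.

```cpp
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))

// GCC-style counterpart of the reproducer (hypothetical name). The clobber
// list names xmm6 explicitly, so the compiler knows the register is modified
// inside the asm statement even when the file is built without -mavx2.
void __attribute__((ms_abi, noinline)) test_gcc_style(void)
{
    __asm__ volatile("vpxor %%ymm6, %%ymm6, %%ymm6"
                     : /* no outputs */
                     : /* no inputs */
                     : "xmm6");
}

// Hypothetical dispatcher standing in for the cpuid-based branch: the AVX2
// path is taken only when the CPU (and OS) report AVX2 support at run time.
// Returns true if the AVX2 path was executed.
bool run_if_avx2(void)
{
    if (!__builtin_cpu_supports("avx2"))
        return false;
    test_gcc_style();
    return true;
}

#else

// Fallback for other targets so the sketch still compiles; no AVX2 path.
bool run_if_avx2(void)
{
    return false;
}

#endif
```

Compiling this with clang -O2 -S (no -mavx2) and inspecting test_gcc_style
should show whether the spill and reload of xmm6 are emitted around the asm
statement, which is the behavior the Intel-style block lacks.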