[llvm-bugs] [Bug 48033] New: [X86] Poor codegen with STMXCSR/LDMXCSR combo.
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Oct 31 05:23:18 PDT 2020
https://bugs.llvm.org/show_bug.cgi?id=48033
Bug ID: 48033
Summary: [X86] Poor codegen with STMXCSR/LDMXCSR combo.
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: andrea.dibiagio at gmail.com
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, pengfei.wang at intel.com,
spatel+llvm at rotateright.com
This is a spin-off of bug 48024.
```
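#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr

void MXCSR_Crash()
{
  const unsigned int PreviousMXCSR = _mm_getcsr();
  // Clear bits 13 and 14 (the rounding-control field) of MXCSR.
  _mm_setcsr(PreviousMXCSR & ~0x6000);
}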
```
(https://gcc.godbolt.org/z/66cvvq)
Currently generates this:
stmxcsr -4(%rsp)
movl $-24577, %eax # imm = 0x9FFF
andl -4(%rsp), %eax
movl %eax, -8(%rsp)
ldmxcsr -8(%rsp)
retq
This codegen is suboptimal. It is as if the compiler tried very hard to keep
the stack slot holding the original value of MXCSR alive until the end of the
function.
As a result, an extra stack slot (rsp - 8) has to be used for the new value of
MXCSR. If the original slot were reused instead, the compiler could emit the MR
variant of AND (read-modify-write) and avoid the extra MOV.
GCC gets this right: the entire sequence is three instructions plus the RET.
stmxcsr -4(%rsp)
andl $-24577, -4(%rsp)
ldmxcsr -4(%rsp)
retq
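For reference, the immediate in both listings is just `~0x6000` viewed as a
signed 32-bit value, and the `0x9FFF` in the LLVM comment matches the low 16
bits of that pattern. The sketch below checks that arithmetic on plain
integers; it does not touch the real MXCSR register. The names
`MXCSR_RC_MASK`, `clear_rounding_control` and `check_mask_examples` are
hypothetical helpers introduced here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* MXCSR rounding-control (RC) field: bits 13 and 14. */
#define MXCSR_RC_MASK 0x6000u

/* Hypothetical helper: models the AND step on a plain 32-bit value. */
static uint32_t clear_rounding_control(uint32_t mxcsr)
{
    return mxcsr & ~MXCSR_RC_MASK;
}

/* Hypothetical self-check of the constants that appear in the listings. */
static void check_mask_examples(void)
{
    /* ~0x6000 as a 32-bit pattern. */
    assert((uint32_t)~MXCSR_RC_MASK == 0xFFFF9FFFu);
    /* The same pattern is what -24577 converts to. */
    assert((uint32_t)-24577 == 0xFFFF9FFFu);
    /* Low 16 bits, as shown in the "imm = 0x9FFF" comment. */
    assert((0xFFFF9FFFu & 0xFFFFu) == 0x9FFFu);
    /* Only the RC bits are cleared; every other bit is kept. */
    assert(clear_rounding_control(0x7F80u) == 0x1F80u);
}
```

Calling `check_mask_examples()` from a test driver should pass silently if the
constants are consistent.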
I wonder if this poor codegen has to do with the fact that STMXCSR is defined
as having "unmodeled side-effects". Could that somehow prevent the compiler
from commuting the original AND and using an RMW variant instead?
Alternatively, StackSlotColoring may not be doing a good job of merging the two
stack slots. This is just me speculating on what the issue might be in the code
generator.
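Whichever of the two sequences is emitted, the observable effect on MXCSR
should be the same. A minimal sketch of a check for that, assuming an x86
target with SSE enabled and a compiler that provides `<xmmintrin.h>`; the
helper name `rc_cleared_and_rest_preserved` is hypothetical:

```c
#include <xmmintrin.h> /* _mm_getcsr, _mm_setcsr (x86 with SSE only) */

/* Hypothetical helper: applies the same update as the test case, then
   reports whether the rounding-control bits (13 and 14) ended up clear
   while every other MXCSR bit kept its previous value. */
static int rc_cleared_and_rest_preserved(void)
{
    const unsigned int before = _mm_getcsr();
    _mm_setcsr(before & ~0x6000u);
    const unsigned int after = _mm_getcsr();
    _mm_setcsr(before); /* restore the caller's MXCSR */
    return (after & 0x6000u) == 0 &&
           (after | 0x6000u) == (before | 0x6000u);
}
```

The function is expected to return 1 regardless of how many stack slots the
backend uses, so it checks behaviour only, not codegen quality.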
--
On the plus side, the compiler is smart about taking advantage of the red zone
in this case. Part of me wasn't expecting to see negative offsets used with RSP.
In this particular case, it makes perfect sense, and it avoids having to emit
an extra SUB (of RSP) at the beginning, plus an extra ADD at the end.