[PATCH] D124834: [fastregalloc] Fix bug when undef value is tied to def.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 3 21:31:29 PDT 2022


craig.topper added inline comments.


================
Comment at: llvm/test/CodeGen/X86/fastregalloc-tied-undef.mir:27
+
+    %0:vr128 = PXORrr undef %0, undef %0
+    MOVAPSmr %stack.1, 1, $noreg, 0, $noreg, %0
----------------
LuoYuanke wrote:
> craig.topper wrote:
> > Was this a synthetic example? It shouldn't be possible to generate a PXORrr like this before register allocation.
> I use the following C code to generate the test case. However compiler would generate "%0:vr128 = V_SET0" instead of "%0:vr128 = PXORrr undef %0, undef %0". The real case that I encounter is to generate code to zero the stack slot for AMX configure register. I use "PXORrr" to zero the stack slot, I can change it to "V_SET0" to avoid the issue in fast regalloc. However I think it expose an issue in fast regalloc, so I create a patch for it.
> 
> ```
> #include <immintrin.h>
> 
> void foo() {
>   __m128 vec[4];
>   __m128 m = {0, 0};
> 
>   vec[0] = m;
>   vec[1] = m;
>   vec[2] = m;
>   vec[3] = m;
> }
> ```
Thanks. I recommend using V_SET0. If you create the PXOR while in SSA form the source and dest will need different vregs. I think the two address instruction pass will change the tied one to match the dest, but it won't change the other one. Then the register allocator is not obligated to give the 2 sources the same register. The untied source will likely always end up with xmm0. If the other source isn't xmm0 it won't be recognized as a zero idiom by the hardware. V_SET0 exists to work around all of that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124834/new/

https://reviews.llvm.org/D124834



More information about the llvm-commits mailing list