[PATCH] D124834: [fastregalloc] Fix bug when undef value is tied to def.

Matthias Braun via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 3 21:53:34 PDT 2022


MatzeB added inline comments.


================
Comment at: llvm/test/CodeGen/X86/fastregalloc-tied-undef.mir:27
+
+    %0:vr128 = PXORrr undef %0, undef %0
+    MOVAPSmr %stack.1, 1, $noreg, 0, $noreg, %0
----------------
craig.topper wrote:
> craig.topper wrote:
> > MatzeB wrote:
> > > LuoYuanke wrote:
> > > > craig.topper wrote:
> > > > > craig.topper wrote:
> > > > > > LuoYuanke wrote:
> > > > > > > craig.topper wrote:
> > > > > > > > Was this a synthetic example? It shouldn't be possible to generate a PXORrr like this before register allocation.
> > > > > > > I use the following C code to generate the test case. However compiler would generate "%0:vr128 = V_SET0" instead of "%0:vr128 = PXORrr undef %0, undef %0". The real case that I encounter is to generate code to zero the stack slot for AMX configure register. I use "PXORrr" to zero the stack slot, I can change it to "V_SET0" to avoid the issue in fast regalloc. However I think it expose an issue in fast regalloc, so I create a patch for it.
> > > > > > > 
> > > > > > > ```
> > > > > > > #include <immintrin.h>
> > > > > > > 
> > > > > > > void foo() {
> > > > > > >   __m128 vec[4];
> > > > > > >   __m128 m = {0, 0};
> > > > > > > 
> > > > > > >   vec[0] = m;
> > > > > > >   vec[1] = m;
> > > > > > >   vec[2] = m;
> > > > > > >   vec[3] = m;
> > > > > > > }
> > > > > > > ```
> > > > > > Thanks. I recommend using V_SET0. If you create the PXOR while in SSA form the source and dest will need different vregs. I think the two address instruction pass will change the tied one to match the dest, but it won't change the other one. Then the register allocator is not obligated to give the 2 sources the same register. The untied source will likely always end up with xmm0. If the other source isn't xmm0 it won't be recognized as a zero idiom by the hardware. V_SET0 exists to work around all of that.
> > > > > Actually its worse than that. Not only will it not be recognized by the hardware. It won't produce 0. It will produce a random value since the register contents don't match.
> > > > Thanks, Craig :) I'll change the code. 
> > > > I think the two address instruction pass will change the tied one to match the dest, but it won't change the other one.
> > > 
> > > I don't think TwoAddressInstruction does that. I can't find the relevant code right now, but at least the explanation in MachineOperand.h for the undef flag says:
> > > 
> > > ```
> > >   /// Note that an instruction may have multiple <undef> operands referring to
> > >   /// the same register.  In that case, the instruction may depend on those
> > >   /// operands reading the same dont-care value.  For example:
> > >   ///
> > >   ///   %1 = XOR undef %2, undef %2
> > > ```
> > I could be wrong. I think the relevant code is in TwoAddressInstructionPass::collectTiedOperands. At first glance it doesn't look like it's trying to keep other operands the same when it rewrites undef tied operands.
> I hacked the test and ran the twoaddressinstruction pass
> 
> ```
> # *** IR Dump Before Two-Address instruction pass (twoaddressinstruction) ***:
> # Machine code for function foo: IsSSA, NoPHIs, TracksLiveness
> Frame Objects:
>   fi#0: size=64, align=16, at location [SP+8]
>   fi#1: size=16, align=16, at location [SP+8]
> 
> bb.0.entry:
>   %1:vr128 = PXORrr undef %0:vr128(tied-def 0), undef %0:vr128
>   MOVAPSmr %stack.1, 1, $noreg, 0, $noreg, %1:vr128
>   MOVAPSmr %stack.0, 1, $noreg, 0, $noreg, %1:vr128
>   MOVAPSmr %stack.0, 1, $noreg, 16, $noreg, %1:vr128
>   MOVAPSmr %stack.0, 1, $noreg, 32, $noreg, %1:vr128
>   MOVAPSmr %stack.0, 1, $noreg, 48, $noreg, killed %1:vr128
>   RET 0
> 
> # End machine code for function foo.
> 
> ********** REWRITING TWO-ADDR INSTRS **********
> ********** Function: foo
>                 rewrite undef:  %1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128
>         %1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128
> ```
> 
> It creates a PXORrr with different source registers.
Interesting! Guess twoaddressinstruction is buggy then given how we define the undef flag and people just got used to use the workaround with pseudo instructions instead...


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124834/new/

https://reviews.llvm.org/D124834



More information about the llvm-commits mailing list