[PATCH] D124834: [fastregalloc] Fix bug when undef value is tied to def.
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 3 21:47:18 PDT 2022
craig.topper added inline comments.
================
Comment at: llvm/test/CodeGen/X86/fastregalloc-tied-undef.mir:27
+
+ %0:vr128 = PXORrr undef %0, undef %0
+ MOVAPSmr %stack.1, 1, $noreg, 0, $noreg, %0
----------------
MatzeB wrote:
> LuoYuanke wrote:
> > craig.topper wrote:
> > > craig.topper wrote:
> > > > LuoYuanke wrote:
> > > > > craig.topper wrote:
> > > > > > Was this a synthetic example? It shouldn't be possible to generate a PXORrr like this before register allocation.
> > > > > I use the following C code to generate the test case. However compiler would generate "%0:vr128 = V_SET0" instead of "%0:vr128 = PXORrr undef %0, undef %0". The real case that I encounter is to generate code to zero the stack slot for AMX configure register. I use "PXORrr" to zero the stack slot, I can change it to "V_SET0" to avoid the issue in fast regalloc. However I think it expose an issue in fast regalloc, so I create a patch for it.
> > > > >
> > > > > ```
> > > > > #include <immintrin.h>
> > > > >
> > > > > void foo() {
> > > > > __m128 vec[4];
> > > > > __m128 m = {0, 0};
> > > > >
> > > > > vec[0] = m;
> > > > > vec[1] = m;
> > > > > vec[2] = m;
> > > > > vec[3] = m;
> > > > > }
> > > > > ```
> > > > Thanks. I recommend using V_SET0. If you create the PXOR while in SSA form the source and dest will need different vregs. I think the two address instruction pass will change the tied one to match the dest, but it won't change the other one. Then the register allocator is not obligated to give the 2 sources the same register. The untied source will likely always end up with xmm0. If the other source isn't xmm0 it won't be recognized as a zero idiom by the hardware. V_SET0 exists to work around all of that.
> > > Actually its worse than that. Not only will it not be recognized by the hardware. It won't produce 0. It will produce a random value since the register contents don't match.
> > Thanks, Craig :) I'll change the code.
> > I think the two address instruction pass will change the tied one to match the dest, but it won't change the other one.
>
> I don't think TwoAddressInstruction does that. I can't find the relevant code right now, but at least the explanation in MachineOperand.h for the undef flag says:
>
> ```
> /// Note that an instruction may have multiple <undef> operands referring to
> /// the same register. In that case, the instruction may depend on those
> /// operands reading the same dont-care value. For example:
> ///
> /// %1 = XOR undef %2, undef %2
> ```
I could be wrong. I think the relevant code is in TwoAddressInstructionPass::collectTiedOperands. At first glance it doesn't look like it's trying to keep other operands the same when it rewrites undef tied operands.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124834/new/
https://reviews.llvm.org/D124834
More information about the llvm-commits
mailing list