[llvm] [SystemZ] Enable MachineCombiner for FP reassociation (PR #83546)

Ulrich Weigand via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 19 08:58:51 PDT 2024


uweigand wrote:

> I agree that now that the general reg/reg scheme doesn't seem worthwhile, it may not be worth using the CC-pseudos and the extra pass to convert them back. I tried to do the CC check in optimizeLoadInstr() instead of using CC-pseudos. Performance-wise I see no difference, and doing it this way only misses a few foldings, which seem acceptable:

OK, that looks good.

> The backwards search for CC-def should be ok compile-time-wise as many instructions clobber CC, but if there emerges an issue, Peephole-opt could probably be extended easily to maintin a LivePhysRegs tracker. The good thing is that we can trust the MBB live-in list for CC.

Also, if we use the approach below to have optimizeLoadInstr call foldMemoryOperand, then we already get the CC handling in foldMemoryOperandImpl, so wouldn't have to duplicate the implementation.
 
> I tried your suggestion to have optimizeLoadInstr() call foldMemoryOperand(), which in turn calls foldMemoryOperandImpl(). This was NFC on SPEC, and the only test changes I saw was that in SystemZ anyregcc.ll and stackmap.ll now fails. I am not sure about these, but it seems that foldMemoryOperand() optimizes these in foldPatchPoint and the result seems to be that some of the loads are folded into those instructions. I have not updated those tests yet as I am not sure if this is correct / desired on SystemZ (?).

I don't know what the specific differences are, but in general if this is OK on X86 it should be OK on SystemZ too.
 
> Since foldMemoryOperand() sets the memoperands on the returned MI, it wouldn't work to call foldMemoryOperandImpl() directly from optimizeLoadInstr(). 

I agree.  We should call foldMemoryOperand, basically the same as X86 does.

> > X86 also implements an unfoldMemoryOperand callback. I was wondering if this would allow reassociation after creating reg/mem instruction, but it looks like MachineCombiner never even uses that callback. Still, would we see any benefit from implementing it? (If it's unrelated to the combiner, that is of course a separate topic.)
> 
> It seems that LICM can do this to hoist the memory access out of the loop, but that will also make the register live across the loop, so not sure how useful it would be. I Haven't tried it.

OK, that's something we can maybe look at some later time, it isn't really related to this patch.


https://github.com/llvm/llvm-project/pull/83546


More information about the llvm-commits mailing list