[PATCH] D142084: [RFC][X86][MemFold] Upgrade the mechanism of auto-generated Memory Folding Table

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 15 08:31:39 PST 2023


craig.topper added inline comments.


================
Comment at: llvm/lib/Target/X86/X86MemFoldTables.inc:4009
   {X86::VMINSSZrr_Intkz, X86::VMINSSZrm_Intkz, TB_NO_REVERSE},
-  {X86::VMOVAPDZ128rrk, X86::VMOVAPDZ128rmk, TB_NO_REVERSE|TB_ALIGN_16},
-  {X86::VMOVAPDZ256rrk, X86::VMOVAPDZ256rmk, TB_NO_REVERSE|TB_ALIGN_32},
-  {X86::VMOVAPDZrrk, X86::VMOVAPDZrmk, TB_NO_REVERSE|TB_ALIGN_64},
-  {X86::VMOVAPSZ128rrk, X86::VMOVAPSZ128rmk, TB_NO_REVERSE|TB_ALIGN_16},
-  {X86::VMOVAPSZ256rrk, X86::VMOVAPSZ256rmk, TB_NO_REVERSE|TB_ALIGN_32},
-  {X86::VMOVAPSZrrk, X86::VMOVAPSZrmk, TB_NO_REVERSE|TB_ALIGN_64},
+  {X86::VMOVAPDZ128rrk, X86::VMOVAPDZ128rmk, TB_ALIGN_16},
+  {X86::VMOVAPDZ256rrk, X86::VMOVAPDZ256rmk, TB_ALIGN_32},
----------------
yubing wrote:
> craig.topper wrote:
> > Would this allow unfolding a VMOVAPDZ128rmk to VMOVAPDZ128rrk + VMOVAPDZ128rm? That would be a bug.
> @craig.topper 
> https://discourse.llvm.org/t/auto-generate-the-memory-folding-tables/61100/19
> i remember we always fold whole load intrinsics into masked arithmetic. does it happen to masked movereg instruction as well?(only wholeload+ movereg => mask.load)
> 
> besides, do we need to add TB_NO_REVERSE for {X86::VMOVAPDZ128rr, X86::VMOVAPDZ128rm, TB_ALIGN_16}?
What I said previously applied to arithmetic that has loads folded into them.

VMOVAPDZ128rmk is created from a masked load with an isel pattern.


VMOVAPDZ128rm will never be unfolded. We unfold to separate a loop invariant load from arithmetic. There’s no arithmetic folded in to it.. the only operands are address operands. They would all need to be loop invariant to hoist so the whole instruction is hoistable.

For VMOVAPDZ128rmk there is a mask operand and address operands. Without TB_NO_REVERSE we would try to unfold it if the address was loop invariant but the mask wasn’t. This would be incorrect. The mask may be masking off elements that will fault.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142084/new/

https://reviews.llvm.org/D142084



More information about the llvm-commits mailing list