[PATCH] D62890: [DAGCombiner] Improve tryStoreMergeOfExtracts by using double sized vector type before type legalized

Mon Aug 8 13:27:12 PDT 2022

nemanjai added a comment.

This code seems unnecessarily complex. I can achieve essentially the same results with something like this:

  diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
  index 36973f5bddb0..984e84ba6fdc 100644
  --- a/llvm/include/llvm/CodeGen/TargetLowering.h
  +++ b/llvm/include/llvm/CodeGen/TargetLowering.h
  @@ -939,6 +939,9 @@ public:
              (unsigned)VT.getSimpleVT().SimpleTy < array_lengthof(RegClassForVT));
       return VT.isSimple() && RegClassForVT[VT.getSimpleVT().SimpleTy] != nullptr;
     }
  +  virtual bool isTypeLegalForMemAccess(EVT VT) const {
  +    return isTypeLegal(VT);
  +  }

     class ValueTypeActionImpl {
       /// ValueTypeActions - For each value type, keep a LegalizeTypeAction enum
  diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  index 5e77317572af..6acde2a5ae91 100644
  --- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  +++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  @@ -18486,7 +18486,7 @@ bool DAGCombiner::tryStoreMergeOfExtracts(
         if (Ty.getSizeInBits() > MaximumLegalStoreInBits)
           break;

  -      if (TLI.isTypeLegal(Ty) &&
  +      if (TLI.isTypeLegalForMemAccess(Ty) &&
             TLI.canMergeStoresTo(FirstStoreAS, Ty, DAG.getMachineFunction()) &&
             TLI.allowsMemoryAccess(Context, DL, Ty,
                                    *FirstInChain->getMemOperand(), &IsFast) &&
  diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  index 862d2ebc75a6..ceed4c1ffc91 100644
  --- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  +++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
  @@ -17569,7 +17569,8 @@ bool PPCTargetLowering::allowsMisalignedMemoryAccesses(EVT VT,
         return true;
       if (Subtarget.hasVSX()) {
         if (VT != MVT::v2f64 && VT != MVT::v2i64 &&
  -          VT != MVT::v4f32 && VT != MVT::v4i32)
  +          VT != MVT::v4f32 && VT != MVT::v4i32 &&
  +          VT != MVT::v2f32 && VT != MVT::v2i32)
           return false;
       } else {
         return false;
  diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.h b/llvm/lib/Target/PowerPC/PPCISelLowering.h
  index 2fa6d45bfe1a..1f0051f8d273 100644
  --- a/llvm/lib/Target/PowerPC/PPCISelLowering.h
  +++ b/llvm/lib/Target/PowerPC/PPCISelLowering.h
  @@ -1101,6 +1101,11 @@ namespace llvm {
       EVT getOptimalMemOpType(const MemOp &Op,
                               const AttributeList &FuncAttributes) const override;

  +    bool isTypeLegalForMemAccess(EVT VT) const override {
  +      return TargetLoweringBase::isTypeLegalForMemAccess(VT) ||
  +             VT == MVT::v2i32 || VT == MVT::v2f32;
  +    }
  +
       /// Is unaligned memory access allowed for the given type, and is it fast
       /// relative to software emulation.
       bool allowsMisalignedMemoryAccesses(

Sure, it produces some `vperm` instructions with this test case, but I don't see an issue with that. Each `vperm` needs its permute control vector loaded from the constant pool, and in most cases that matter those loads aren't likely to cause many cache misses.
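
To make the shape of the proposed hook easier to see outside of LLVM, here is a minimal standalone sketch. It is only an illustration under stated assumptions: `ToyVT`, `ToyLoweringBase`, `ToyPPCLowering` and `canUseAsMergedStoreType` are hypothetical stand-ins and not LLVM's classes, and the set of "legal" types in the toy base class is assumed for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a handful of LLVM value types (not the real MVT).
enum class ToyVT { v2i32, v2f32, v4i32, v4f32, v2i64, v2f64 };

// Toy model of a target-lowering base class carrying the proposed hook.
struct ToyLoweringBase {
  virtual ~ToyLoweringBase() = default;

  // Assumption for this sketch: only the 128-bit vector types are legal.
  virtual bool isTypeLegal(ToyVT T) const {
    return T == ToyVT::v4i32 || T == ToyVT::v4f32 ||
           T == ToyVT::v2i64 || T == ToyVT::v2f64;
  }

  // Default behaviour of the hook: defer to ordinary type legality.
  virtual bool isTypeLegalForMemAccess(ToyVT T) const {
    return isTypeLegal(T);
  }
};

// Toy target that additionally accepts the 64-bit vector types for memory
// accesses, even though they are not legal types in this model.
struct ToyPPCLowering : ToyLoweringBase {
  bool isTypeLegalForMemAccess(ToyVT T) const override {
    return ToyLoweringBase::isTypeLegalForMemAccess(T) ||
           T == ToyVT::v2i32 || T == ToyVT::v2f32;
  }
};

// Mirrors only the first condition of the store-merging check in the diff:
// the combiner asks the memory-access hook instead of isTypeLegal().
bool canUseAsMergedStoreType(const ToyLoweringBase &TLI, ToyVT T) {
  return TLI.isTypeLegalForMemAccess(T);
}

// Small self-check of the toy model.
void checkToyModel() {
  ToyLoweringBase Generic;
  ToyPPCLowering PPC;
  assert(!canUseAsMergedStoreType(Generic, ToyVT::v2i32));
  assert(canUseAsMergedStoreType(PPC, ToyVT::v2i32));
  assert(canUseAsMergedStoreType(PPC, ToyVT::v4i32));
}
```

The point of the split is that targets which do not override the hook keep their current behaviour, while a target can opt individual types into store merging without declaring them legal everywhere.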

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D62890