[PATCH] D113489: [AArch64][SVE] Instcombine SVE LD1/ST1 to stock LLVM IR

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 9 09:46:50 PST 2021


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:707-709
+  auto MaskedLoad = Builder.CreateMaskedLoad(VecOp, VScalePointer,
+                                             PointerOp->getPointerAlignment(DL),
+                                             II.getOperand(0));
----------------
I also just noticed you're not specifying the `Passthru` parameter for `CreateMaskedLoad`.  This is very important because the `ld1` intrinsics are defined to zero the inactive lanes and so you need to pass in an explicit zero vector here. You'll see similar in `instCombineLD1GatherIndex`.


================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:718
+             m_Intrinsic<Intrinsic::aarch64_sve_ptrue>(m_Value(Predicate))))
+    return None;
+  if (!match(Predicate, m_ConstantInt<AArch64SVEPredPattern::all>()))
----------------
MattDevereau wrote:
> paulwalker-arm wrote:
> > Can we use a masked load for this scenario also? As in masked_load doesn't care how the mask was constructed.
> > 
> > My feeling is `instCombineSVEMaskedLD1` and `instCombineSVELD1` are called in the wrong order.  It seems easier to always call  `instCombineSVEMaskedLD1` and then call `instCombineSVELD1` for the case when
> > ```
> > match(II.getOperand(0),
> >              m_Intrinsic<Intrinsic::aarch64_sve_ptrue>(m_ConstantInt<AArch64SVEPredPattern::all>()))
> > ```
> > is true?
> Using the match you suggested and always calling `instCombineSVEMaskedLD1` would end up instCombining and erasing `aarch64.sve.convert.from.svbool`, predicates I think? Is this ok?
I don't understand your reasoning here.

`aarch64.sve.convert.from.svbool` is just an intrinsic that produces a predicate.  If that is the predicate operand to the `ld1` then it should end up being the mask operand of the masked_load.

For the case where the predicate operand is specifically `ptrue 31` we can drop the predicate operand and call `Builder.CreateLoad` with all other cases passing all the operands to `Builder.CreateMaskedLoad`.  Perhaps that means we don't even need instCombineSVELD1.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113489/new/

https://reviews.llvm.org/D113489



More information about the llvm-commits mailing list