[PATCH] D109065: [X86] combineX86ShufflesRecursively(): call SimplifyMultipleUseDemandedVectorElts() on after finishing recursing

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 8 03:07:54 PDT 2021


lebedev.ri added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:37921
+    // Which elements of Op do we demand?
+    SmallVector<int, 8> OpDemandedIdentityMask(Mask.size(), -1);
+    for (int MaskElt : Mask) {
----------------
lebedev.ri wrote:
> RKSimon wrote:
> > This seems really bulky for what it's actually doing. I don't think we need to create this shuffle mask, for instance; we should be able to create a demanded elts mask directly and then trunc/scale it for the input's size.
> > 
> > I keep meaning to create a scaleDemandedMask() common helper method as we have several places that would use it (e.g. SelectionDAG.computeKnownBits bitcast handling and other parts of value tracking).
> That is what I initially came up with, and it's *much* uglier than this code :)
> I can do that again, but I'm not sure that will be better.
Ok, how about this?
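
For readers following the thread, here is a rough standalone sketch of the "build a demanded elts mask directly, then scale it" idea discussed above. It is not the code from the patch and does not use LLVM types: `demandedFromMask` and `scaleDemandedMask` are hypothetical names, `std::vector<bool>` stands in for a demanded-elements bitmask, and the mask convention (negative value = undef/zero, value `M` selects element `M % NumElts` of input `M / NumElts`) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): mark which elements of input
// number `OpIdx` are read by a shuffle mask. Assumed convention: a
// negative mask value reads nothing; a value M >= 0 reads element
// M % NumElts of input M / NumElts, where NumElts == Mask.size().
std::vector<bool> demandedFromMask(const std::vector<int> &Mask, int OpIdx) {
  const int NumElts = static_cast<int>(Mask.size());
  std::vector<bool> Demanded(Mask.size(), false);
  for (int M : Mask)
    if (M >= 0 && M / NumElts == OpIdx)
      Demanded[M % NumElts] = true;
  return Demanded;
}

// Hypothetical helper (not an LLVM API): rescale a demanded-elements mask
// to a different element count covering the same total bit width.
// Widening the count splits each element into several narrower ones that
// inherit its demanded state; narrowing the count merges neighbours, and
// a merged element is demanded if any element it covers was demanded.
std::vector<bool> scaleDemandedMask(const std::vector<bool> &Demanded,
                                    std::size_t NewSize) {
  const std::size_t OldSize = Demanded.size();
  assert(OldSize != 0 && NewSize != 0 && "empty masks are not supported");
  std::vector<bool> Scaled(NewSize, false);
  if (NewSize >= OldSize) {
    assert(NewSize % OldSize == 0 && "sizes must divide evenly");
    const std::size_t Scale = NewSize / OldSize;
    for (std::size_t I = 0; I != NewSize; ++I)
      Scaled[I] = Demanded[I / Scale];
  } else {
    assert(OldSize % NewSize == 0 && "sizes must divide evenly");
    const std::size_t Scale = OldSize / NewSize;
    for (std::size_t I = 0; I != OldSize; ++I)
      if (Demanded[I])
        Scaled[I / Scale] = true;
  }
  return Scaled;
}
```

With a 4-element mask `{0, 5, -1, 2}`, input 0 has elements 0 and 2 demanded and input 1 has only element 1 demanded; scaling either result to the input's own element count then avoids materializing an intermediate identity shuffle mask.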


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:37911
+
+    unsigned NumOpElts = Op.getValueType().getVectorNumElements();
+
----------------
RKSimon wrote:
> lebedev.ri wrote:
> > RKSimon wrote:
> > > Op might be a different width to the Root - see the "Widen any subvector shuffle inputs we've collected." code below.
> > I keep hitting the same pitfall.
> We still need to do this before the widenSubVector() code - otherwise we'll never be able to simplify any input that doesn't match RootSizeInBits, which are likely to be the most interesting cases imo.
I agree, but is this a correctness concern for this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109065/new/

https://reviews.llvm.org/D109065


