[PATCH] D77881: [VectorUtils] add IR-level analysis for widening of shuffle mask

Sat Apr 11 03:10:45 PDT 2020

lebedev.ri added inline comments.

================
Comment at: llvm/lib/Analysis/VectorUtils.cpp:432-466
+  // Step through the input mask by splitting into Scale-sized subsections.
+  ScaledMask.clear();
+  for (int i = 0, e = NumElts; i != e; i += Scale) {
+    // Check if the elements that map to a widened mask element are consecutive.
+    int OutputElt;
+    for (int j = 0; j != Scale; ++j) {
+      int MaskElt = Mask[i + j];
----------------
Ok, given that we are not accepting subsections that only partially consist of sentinel values, what are your thoughts on the following:
```
  // Step through the input mask by splitting into Scale-sized subsections.
  ScaledMask.clear();
  ScaledMask.reserve(NumElts / Scale);

  for (ArrayRef<int> MaskSlice = Mask.take_front(Scale),
                     RemainingMaskElts = Mask.take_back(Mask.size() - Scale);
       !MaskSlice.empty(); MaskSlice = RemainingMaskElts.take_front(Scale),
                     RemainingMaskElts = RemainingMaskElts.take_back(
                         RemainingMaskElts.size() - Scale)) {
    assert((int)MaskSlice.size() == Scale && "Expected Scale-sized slice.");

    // The slice must be homogeneous.
    int OutputElt;

    if (MaskSlice.front() < 0) {
      // Negative values (undef or other "sentinel" values) must be equal across
      // the entire subsection.
      if (!is_splat(MaskSlice))
        return false;
      OutputElt = MaskSlice.front();
    } else {
      // A non-negative mask element must be cleanly divisible by Scale.
      if (MaskSlice.front() % Scale != 0)
        return false;
      // The elements of the subsection must be consecutive.
      auto ExpectedSlice =
          llvm::seq(MaskSlice.front(), MaskSlice.front() + Scale);
      assert(llvm::size(ExpectedSlice) == Scale && "Got wrong sequence.");
      if (!std::equal(adl_begin(MaskSlice), adl_end(MaskSlice),
                      adl_begin(ExpectedSlice)))
        return false;
      OutputElt = MaskSlice.front() / Scale;
    }

    // All narrow elements in this subsection map to the same wider element.
    ScaledMask.push_back(OutputElt);
  }
```
https://godbolt.org/z/yoLNmG

This results in roughly Scale times fewer divisions, since only the first element of each subsection is checked with `%` and divided.
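
For anyone who wants to poke at the logic outside the LLVM tree, here is a minimal standalone sketch of the same checks on `std::vector<int>`. It is only an illustration under a few assumptions: `widenMaskSketch` is a hypothetical name, not the function in this patch; it indexes by offset instead of slicing an `ArrayRef`; and it rejects masks whose size is not a multiple of `Scale`, which the snippet above takes for granted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone analogue of the slice-based loop (not the patch's
// function). Returns false if Mask cannot be widened by Scale; in that case
// ScaledMask may be left partially filled.
bool widenMaskSketch(int Scale, const std::vector<int> &Mask,
                     std::vector<int> &ScaledMask) {
  assert(Scale > 0 && "Unexpected scaling factor");
  const std::size_t Step = static_cast<std::size_t>(Scale);

  // Assumption of this sketch: the mask must split into whole slices.
  if (Mask.size() % Step != 0)
    return false;

  ScaledMask.clear();
  ScaledMask.reserve(Mask.size() / Step);

  for (std::size_t I = 0; I != Mask.size(); I += Step) {
    auto SliceBegin = Mask.begin() + I;
    auto SliceEnd = SliceBegin + Step;
    const int Front = *SliceBegin;

    if (Front < 0) {
      // Negative (undef or other sentinel) values must repeat across the
      // whole slice.
      if (!std::all_of(SliceBegin, SliceEnd,
                       [Front](int M) { return M == Front; }))
        return false;
      ScaledMask.push_back(Front);
      continue;
    }

    // Only the first element of the slice is tested for divisibility.
    if (Front % Scale != 0)
      return false;

    // The remaining elements must count up consecutively from Front.
    int Expected = Front;
    for (auto It = SliceBegin; It != SliceEnd; ++It, ++Expected)
      if (*It != Expected)
        return false;

    ScaledMask.push_back(Front / Scale);
  }
  return true;
}
```

With `Scale = 2`, the mask `{0, 1, 6, 7}` should widen to `{0, 3}` and `{-1, -1, 2, 3}` to `{-1, 1}`, while `{-1, 0, 2, 3}` (partially sentinel) and `{1, 2, 4, 5}` (first element not divisible by `Scale`) are rejected.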

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77881/new/

https://reviews.llvm.org/D77881