[Mlir-commits] [mlir] [MLIR] Fix VectorEmulateNarrowType constant op mask bug (PR #116064)

Wed Nov 13 12:00:11 PST 2024

================
@@ -129,6 +132,45 @@ static FailureOr<Operation *> getCompressedMaskOp(OpBuilder &rewriter,
       auto denseAttr = DenseElementsAttr::get(newMaskType, newMaskValues);
       newMask = rewriter.create<arith::ConstantOp>(loc, newMaskType, denseAttr);
     }
+  } else if (constantOp) {
+    assert(shape.size() == 1 && "expected 1-D mask");
+    // Rearrange the original mask values to cover the whole potential loading
+    // region. For example, in the case of using byte-size for emulation, given
+    // the following mask:
+    //
+    //   %mask = vector.constant_mask [0, 1, 0, 1, 0, 0] : vector<6xi2>
+    //
+    // with front offset of 1, the mask will be padded with zeros in the front
+    // and back so that its length is a multiple of `scale` (and the total
+    // coverage size is a multiple of bytes):
+    //   %new_mask = vector.constant_mask [0, 0, 1, 0, 1, 0, 0, 0] :
+    //   vector<8xi2>
+    //
+    // The %new_mask is now aligned with the effective loading area and can now
+    // be compressed.
+    SmallVector<bool> maskValues(intraDataOffset, false);
+    if (auto denseAttr =
+            mlir::dyn_cast<DenseIntElementsAttr>(constantOp.getValue())) {
+      for (auto value : denseAttr.getValues<bool>()) {
+        maskValues.push_back(value);
+      }
+      while (maskValues.size() < numElements * scale) {
+        maskValues.push_back(false);
+      }
+    } else {
+      return failure();
+    }
+    // Compressing by combining every `scale` elements:
----------------
hanhanW wrote:

I'd suggest adding a blank line right before this comment, so it is clear that the code below is not part of the big comment block above. That comment does not explain any compression. IMO, we should either expand the comment or put a clear separation between the two pieces of code. The latter looks okay to me because compression is what the method aims to do (and it's already documented in the function comment).
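
For reference, here is a minimal standalone sketch of the two steps in question (padding, then compression), written without any MLIR dependency. The helper name `compressMask` is hypothetical, and the sketch assumes that "combining every `scale` elements" means OR-ing them, which the hunk above does not show; it also derives the padded length from the input instead of taking `numElements` as a parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the patch: pads a 1-D mask and then
// compresses it by a factor of `scale`.
std::vector<bool> compressMask(const std::vector<bool> &mask,
                               std::size_t intraDataOffset,
                               std::size_t scale) {
  assert(scale > 0 && "scale must be positive");

  // Step 1: pad the front with `intraDataOffset` zeros, then append the
  // original mask values.
  std::vector<bool> padded(intraDataOffset, false);
  padded.insert(padded.end(), mask.begin(), mask.end());

  // Step 2: pad the back with zeros so the length is a multiple of `scale`.
  std::size_t numElements = (padded.size() + scale - 1) / scale;
  padded.resize(numElements * scale, false);

  // Step 3 (assumption): compress by OR-ing every `scale` elements, so a
  // compressed element is set if any narrow element it covers is set.
  std::vector<bool> compressed;
  compressed.reserve(numElements);
  for (std::size_t i = 0; i < padded.size(); i += scale) {
    bool any = false;
    for (std::size_t j = 0; j < scale; ++j)
      any = any || padded[i + j];
    compressed.push_back(any);
  }
  return compressed;
}
```

With the mask from the code comment, `[0, 1, 0, 1, 0, 0]`, a front offset of 1 and `scale` of 4 (i2 elements emulated with bytes), the padded mask is `[0, 0, 1, 0, 1, 0, 0, 0]` and the sketch returns a two-element mask with both elements set. Keeping the padding and the compression visually separate, as in the sketch, is the kind of separation I have in mind.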

https://github.com/llvm/llvm-project/pull/116064