[Mlir-commits] [mlir] [mlir][vector] Add mask elimination transform (PR #99314)

Fri Aug 2 14:11:09 PDT 2024

MacDue wrote:

> Does it make more sense now or am I missing something?

I think you are misunderstanding how this patch works. There are no lightweight canonicalizations added here (or any patterns, really). Such lightweight canonicalizations already exist, and this code does not re-implement them; I was simply saying that this code follows a similar idea.

Let's look at the actual code here:
```c++
  // Check for any dims that could be (partially) false before doing the more
  // expensive value bounds computations.
  SmallVector<UnknownMaskDim> unknownDims;
  for (auto [i, dimSize] : llvm::enumerate(createMaskOp.getOperands())) {
    if (auto intSize = getConstantIntValue(dimSize)) {
      // Mask not all-true for this dim.
      if (maskTypeDimScalableFlags[i] || intSize < maskTypeDimSizes[i])
        return failure();
    } else if (auto vscaleMultiplier = getConstantVscaleMultiplier(dimSize)) {
      // Mask not all-true for this dim.
      if (vscaleMultiplier < maskTypeDimSizes[i])
        return failure();
    } else {
      // Unknown (without further analysis).
      unknownDims.push_back(UnknownMaskDim{i, dimSize});
    }
  }
```
This is the part likened to canonicalization, but it's not a canonicalization. It looks at all the operands of the `create_mask` and finds the unknown dimensions, i.e. the dimensions that need to be solved via value-bounds analysis. The trick here is that if any dimension is constant and not all-true, we can exit early, as that means the mask won't be all-true.

This prevents us from pointlessly doing value-bounds analysis in cases like:

```mlir
%mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1>
```

From looking at `%c2` we can tell this is not going to be an all-true mask (2 < 4 in the second dimension), so we don't need to run the value-bounds analysis for `%dynamicValue` (and will exit the transform early). Note that this `create_mask` still is not a `constant_mask` or an all-true/false mask, so no canonicalization will remove it before the "heavyweight" mask removal.

After that we go over the unknown dimensions and solve for them using value-bounds:
```c++
  for (auto [i, dimSize] : unknownDims) {
    // Compute the lower bound for the unknown dimension (i.e. the smallest
    // value it could be).
    FailureOr<ConstantOrScalableBound> dimLowerBound =
        vector::ScalableValueBoundsConstraintSet::computeScalableBound(
            dimSize, {}, vscaleRange.vscaleMin, vscaleRange.vscaleMax,
            presburger::BoundType::LB);
    if (failed(dimLowerBound))
      return failure();
    auto dimLowerBoundSize = dimLowerBound->getSize();
    if (failed(dimLowerBoundSize))
      return failure();
    if (dimLowerBoundSize->scalable) {
      // If the lower bound is scalable and < the mask dim size then this dim is
      // not all-true.
      if (dimLowerBoundSize->baseSize < maskTypeDimSizes[i])
        return failure();
    } else {
      // If the lower bound is a constant:
      // - If the mask dim size is scalable then this dim is not all-true.
      if (maskTypeDimScalableFlags[i])
        return failure();
      // - If the lower bound is < the _fixed-size_ mask dim size then this dim
      // is not all-true.
      if (dimLowerBoundSize->baseSize < maskTypeDimSizes[i])
        return failure();
    }
  }
```
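
To make the control flow easier to see in isolation, here is a rough standalone sketch of the same two-phase idea with no MLIR dependencies. It is not the code from the patch: `DimOperand`, `expensiveLowerBound`, `analysisCalls`, and `isAllTrueMask` are hypothetical names made up for illustration, the "analysis" is just a stored number, and scalable dimensions are ignored entirely (fixed-size masks only).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one create_mask operand.
struct DimOperand {
  // Set when the operand is a known constant (the cheap case).
  std::optional<int64_t> constantSize;
  // What a (pretend) value-bounds analysis would report as the lower bound.
  int64_t lowerBound;
};

// Counts how often the "expensive" analysis runs, so the early exit is visible.
static int analysisCalls = 0;

// Stand-in for the value-bounds computation; here it only returns a stored value.
int64_t expensiveLowerBound(const DimOperand &operand) {
  ++analysisCalls;
  return operand.lowerBound;
}

// Returns true only if every dimension is provably all-true.
bool isAllTrueMask(const std::vector<DimOperand> &operands,
                   const std::vector<int64_t> &maskDimSizes) {
  assert(operands.size() == maskDimSizes.size());

  // Phase 1: cheap scan. A constant operand smaller than the mask dim means
  // the mask cannot be all-true, so bail out before any analysis.
  std::vector<std::size_t> unknownDims;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    if (operands[i].constantSize) {
      if (*operands[i].constantSize < maskDimSizes[i])
        return false;
    } else {
      unknownDims.push_back(i);
    }
  }

  // Phase 2: run the expensive analysis only for the dims left unknown.
  for (std::size_t i : unknownDims) {
    if (expensiveLowerBound(operands[i]) < maskDimSizes[i])
      return false;
  }
  return true;
}
```

With operands modelled on the `vector<8x4xi1>` example above (one unknown dim, one constant 2), the function returns `false` from phase 1 and `analysisCalls` stays at 0, which is the whole point of the pre-check.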

The TL;DR is that this is just the "heavyweight" mask removal with a cheap up-front check to avoid doing pointless work.

https://github.com/llvm/llvm-project/pull/99314