[PATCH] D154953: [InstCombine] Remove the reminder loop if we know the mask is always true

Tue Jul 11 10:53:41 PDT 2023

nikic added a comment.

In D154953#4489861 <https://reviews.llvm.org/D154953#4489861>, @paulwalker-arm wrote:

> @nikic What's the rational for not being allow to use TTI during instcombine? TTI is used for target specific combines.

The general rationale is that it's a target-independent canonicalization pass. This isn't really relevant for this particular case (because the query is about legality rather than profitability).

The issue that is more relevant here is layering. The proper way to perform this fold is to teach isKnownToBeAPowerOfTwo() from ValueTracking about vscale. If you do that, you most likely don't even need any special code in InstCombine. However, we definitely don't want a TTI dependency in ValueTracking (which is used from literally everywhere, so we'd either have TTI dependencies everywhere, or get reduced functionality in most places). The information needs to be available in a target-independent way.

> I don't like the idea of having to decorate every function with information that is essentially constant for the target.

DataLayout would provide a way to avoid decorating individual functions.

> Changing the LangRef seems like a backward step given LLVM already supports non-power-of-two values of `vscale`, which will be much harder to re-add once lost.  That said, if no target supports non-power-of-two values of `vscale` then I'll not fight to keep such support if that's the consensus.

We shouldn't keep dead code around due to sunk costs. The dead code has ongoing costs (like, we wouldn't even have this conversation without it). Of course, is there are (current or foreseeable future) targets that have non-pow2 vscale then we should certainly keep it, but if not, then imho we should get rid of this at the root.

> As a halfway house, what if we changed the definition of `vscale_range` to imply vscale is power-of-two.  Sure that's still a loss of functionality but it's smaller and critically maintains the path for supporting arbitrary vscale values whilst ensuring existing targets can encode the power-of-two-ness within the IR without needing any code changes.

Sounds reasonable to me.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154953/new/

https://reviews.llvm.org/D154953