[llvm] [InstCombine] Combine or-disjoint (and->mul), (and->mul) to and->mul (PR #136013)

Thu Jun 5 07:15:09 PDT 2025

jrbyrnes wrote:

> Could you please remind me what the larger context for these patches was?

AI workloads are bringing in a new feature called linear layout: https://arxiv.org/html/2505.23819v1

The effect of this feature is to rework address calculations so that `xor` and `and` are used in places where we would more typically use `add` and `mul`. Important clients of this feature (e.g. Triton) are MLIR-based and produce IR directly instead of going through clang.

The problem is that `separateConstOffsetFromGEP` doesn't extract constant offsets from `xor`. Moreover, since many of these `xor` are equivalent to `or disjoint`, we would like to convert them to `or disjoint`. Doing so provides a very significant performance uplift, as it significantly reduces register pressure and avoids spilling in some cases. However, most of these address computation chains are longer than the knownBits recursion depth limit, so the conversion does not happen.
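As a rough illustration of the two points above (not taken from the patch), the sketch below first checks that `xor`, `or` and `add` agree when the operands share no set bits, and then models a depth-limited known-bits walk over a chain of `and` instructions. `AndNode`, `knownZero`, `canProveDisjoint` and `makeChain` are hypothetical names invented for this example, and the toy analysis is far simpler than LLVM's `computeKnownBits`. A limit of 6 is assumed in the usage note to mirror LLVM's `MaxAnalysisRecursionDepth`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// When a and b share no set bits, xor, or and add all give the same result.
// This is the property that makes `xor` -> `or disjoint` legal.
bool xorOrAddAgree(uint32_t a, uint32_t b) {
  return (a ^ b) == (a | b) && (a | b) == (a + b);
}

// Hypothetical toy IR node modelling `and operand, mask`.
// operand == nullptr means the operand is an opaque input with no known bits.
struct AndNode {
  uint32_t mask = 0xFFFFFFFFu;
  const AndNode *operand = nullptr;
};

// Returns the bits proven to be zero, giving up (returning "nothing known")
// once maxDepth nodes have been visited along the chain.
uint32_t knownZero(const AndNode *n, unsigned depth, unsigned maxDepth) {
  if (n == nullptr || depth >= maxDepth)
    return 0;
  return ~n->mask | knownZero(n->operand, depth + 1, maxDepth);
}

// `xor value, constant` can become `or disjoint` only if every set bit of
// the constant is proven zero in value.
bool canProveDisjoint(const AndNode *value, uint32_t constant,
                      unsigned maxDepth) {
  return (constant & ~knownZero(value, 0, maxDepth)) == 0;
}

// Builds a chain of `length` nodes (length >= 1). Only the bottom node clears
// the low 4 bits; every node above it clears just the top bit.
std::vector<AndNode> makeChain(std::size_t length) {
  std::vector<AndNode> chain(length);
  chain[0].mask = ~0xFu;
  for (std::size_t i = 1; i < length; ++i) {
    chain[i].mask = 0x7FFFFFFFu;
    chain[i].operand = &chain[i - 1];
  }
  return chain;
}
```

With a depth limit of 6, `canProveDisjoint` succeeds for the constant `0x3` on the top of a 3-node chain but fails on the top of an 8-node chain, because the walk stops before reaching the node that clears the low bits. Raising the limit or shortening the chain both make the proof go through, which corresponds to the two approaches discussed in this thread.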

The test in https://github.com/llvm/llvm-project/pull/137721 has a reduced example of this, and I've also included some IR in https://discourse.llvm.org/t/rfc-computeknownbits-recursion-depth/85962 . The problem is more general, and there are different common variants of address formulations that aren't included in these examples, but this gives a basic idea.

I think that from a solution perspective, it would be best to change the way the recursion depth works so that we always do these conversions. However, I realize there may be some concerns with that approach, so I'm also working on an approach that optimizes the address calculation chains so that they fit within the recursion depth.

That is where this stack of instcombine patches comes in: they clean up the intermediate code in the address calculations so that we can convert these `xor` to `or disjoint` within the depth limit. This solution is a bit less stable, but it will resolve the current performance issues that occur when adopting this technology.

https://github.com/llvm/llvm-project/pull/136013