[llvm] [AMDGPU] Add intrinsic-based optimization for rotate and funnel shift patterns (PR #153406)

Tue Sep 30 17:56:00 PDT 2025

nhaehnle wrote:

> This intrinsic was explicitly removed in [82de129](https://github.com/llvm/llvm-project/commit/82de129ab8f723ba94d0026b54d76b11b2a9e4f9). This should not be reintroduced. It has identical semantics as llvm.fshr. We should not be introducing a new intrinsic for a bespoke handling of SALU vs. VALU optimization

There was a conversation about this before, and I don't know whether there has been more discussion in the last week, but the current version seems like the most pragmatic approach to me.

IIRC, the most natural alternative was to have separate ISel patterns, but then certain optimizations would be missed because the pattern selection happens too late. And re-implementing those optimizations just out of some fairly arbitrary notion of purity seems wrong-headed to me.

In general, the fact that we have SelectionDAG + G-MIR + MIR sucks, and I'd say that the more codegen tasks can be done in LLVM IR, the better. If there is a problem with this strategy, it's that we do not want frontends to generate this intrinsic. Admittedly, the mere existence of the intrinsic makes it tempting for frontends to make this mistake, and probably we should do something about that. At a minimum, the .td file should have a comment to that effect on the intrinsic definition. More fundamentally, LLVM isn't set up to distinguish "codegen-only" intrinsics, although we have actually had them forever (llvm.amdgcn.if and friends). How about we make that distinction clearer somehow?

https://github.com/llvm/llvm-project/pull/153406