[llvm] [X86] Use RORX over SHR imm (PR #77964)

Wed Feb 7 09:26:30 PST 2024

Bryce-MW wrote:

As it is, this optimization fires very rarely. Outside of the function I was optimizing, I didn't see any other instances in my codebase at work. It looks like there [aren't any changes on other tests either](https://llvm-compile-time-tracker.com/compare.php?from=275729ae06d568e9589392c142a416fb8c2bb1a8&to=8cb180bc7b1f5e19358635d6aad756c0c090fb9e&stat=size-text). The only case I can really think of where this transformation happens is ones' complement folding (i.e. Internet checksum calculation, which is my use case), and all the implementations of that I have seen use either inline assembly or a different, slightly less efficient implementation.

I feel like it is still worthwhile to include since there is no other way (other than inline assembly) to convince the compiler to make this transformation.
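
For anyone not familiar with the folding step mentioned above, here is a rough sketch of the two forms in C. The function names are made up for illustration and this is not the code from the PR or from my codebase; it only shows the shape of the computation (a 32-bit partial sum folded to 16 bits with end-around carry), and I'm not making claims here about the exact instructions any given compiler emits for it.

```c
#include <stdint.h>

/* Hypothetical helper: the common mask-and-add form.
 * Two rounds are needed because the first addition can itself carry
 * into bit 16. */
static uint16_t fold_masked(uint32_t sum)
{
    sum = (sum & 0xFFFFu) + (sum >> 16);
    sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical helper: the rotate-based form.
 * Adding the value to itself rotated by 16 computes hi + lo in both
 * halves at once; the carry out of the low half flows into the high
 * half, which is exactly the end-around carry. Only the high half of
 * the result is kept. */
static uint16_t fold_rotate(uint32_t sum)
{
    uint32_t rotated = (sum << 16) | (sum >> 16);
    return (uint16_t)((sum + rotated) >> 16);
}
```

Both functions return the same value for every 32-bit input; the second one needs a single addition and ends in a shift whose result is immediately truncated to 16 bits.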

https://github.com/llvm/llvm-project/pull/77964