[PATCH] D113291: [AggressiveInstCombine] Lower Table Based CTTZ and enable it for AARCH64 in -O3

Tue Feb 1 14:55:47 PST 2022

craig.topper added a comment.

In D113291#3286784 <https://reviews.llvm.org/D113291#3286784>, @xbolva00 wrote:

> I am wondering about the general direction.
>
> Is it worth it? Are we on the way to becoming a compiler just for benchmarks?
>
> Spending compile time just to optimize one very specific pattern from SPEC is a bad thing, imho.
>
> Can you show us some other real world “hits”? I assume any sane project already uses a builtin to compute this value efficiently.

The same algorithm is documented here https://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup

The same algorithm also appears in primesieve, near the code from the recent tzcnt discussion: around line 143 of Erat.hpp, if you expand the context here https://github.com/kimwalisch/primesieve/pull/109/files Granted, that code knows when to use the tzcnt builtin instead of the table lookup. I'm only mentioning it to show that this is a known way to implement tzcnt that is used in more than just SPEC.
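
For readers who have not seen the pattern, here is a minimal sketch of the table-based count-trailing-zeros idiom for 32-bit values, following the multiply-and-lookup technique described on the bithacks page linked above. The function and table names (cttz32_table, kDeBruijnPos) are made up for illustration. This is not the code from SPEC or primesieve, and it is not necessarily the exact shape the patch matches.

```c
#include <stdint.h>

/* Hypothetical name: maps the top 5 bits of the de Bruijn product
   to the index of the lowest set bit. */
static const int kDeBruijnPos[32] = {
    0,  1,  28, 2,  29, 14, 24, 3,  30, 22, 20, 15, 25, 17, 4,  8,
    31, 27, 13, 23, 21, 19, 16, 7,  26, 12, 18, 6,  11, 5,  10, 9
};

/* Hypothetical name: count trailing zeros of a 32-bit value via table lookup.
   For v == 0 this returns 0, unlike a hardware tzcnt, so callers that care
   must handle zero themselves. */
static int cttz32_table(uint32_t v)
{
    /* Isolate the lowest set bit; unsigned arithmetic wraps, so this is well defined. */
    uint32_t lowest = v & (0u - v);
    /* Multiplying by the de Bruijn constant shifts a unique 5-bit pattern
       into the top bits, which then indexes the table. */
    return kDeBruijnPos[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

For example, cttz32_table(8) yields 3 and cttz32_table(0x80000000u) yields 31. For nonzero inputs, a compiler that recognizes this pattern could in principle replace the multiply, shift and load with a single cttz operation.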

https://reviews.llvm.org/D113291