[PATCH] D141464: [X86]: Match (xor TSize - 1, ctlz) to `bsr` instead of `lzcnt` + `xor`

Noah Goldstein via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 01:48:35 PST 2023


goldstein.w.n added a comment.

In D141464#4042695 <https://reviews.llvm.org/D141464#4042695>, @craig.topper wrote:

> The bsr implementation in hardware also reads the destination register to return it if the other input is zero. This can create unintentional dependencies on older instructions.
>
> bsr is also a multiple uop instruction on some AMD CPUs such as Zen1, 2, and 3. According to uops.info its improved on Zen4.

Would be fair to not do this on zen1/2/3. Could also make it last use only (where dst==src is almost guranteed) if the dependency
on dst is a major concern, although for many intel x86 impls `lzcnt` also has a false-dep on dst so its mostly equivilent.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141464/new/

https://reviews.llvm.org/D141464



More information about the llvm-commits mailing list