[PATCH] D141464: [X86]: Match (xor TSize - 1, ctlz) to `bsr` instead of `lzcnt` + `xor`

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 16:02:45 PST 2023


craig.topper added a comment.

In D141464#4042773 <https://reviews.llvm.org/D141464#4042773>, @goldstein.w.n wrote:

> In D141464#4042695 <https://reviews.llvm.org/D141464#4042695>, @craig.topper wrote:
>
>> The bsr implementation in hardware also reads the destination register to return it if the other input is zero. This can create unintentional dependencies on older instructions.
>>
>> bsr is also a multiple uop instruction on some AMD CPUs such as Zen1, 2, and 3. According to uops.info its improved on Zen4.
>
> Would be fair to not do this on zen1/2/3. Could also make it last use only (where dst==src is almost guranteed) if the dependency
> on dst is a major concern, although for many intel x86 impls `lzcnt` also has a false-dep on dst so its mostly equivilent.

The false dependency was fixed on Skylake which released in 2015. Is it safe to assume the majority of users won’t be using a CPU older than that?



================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:51923
+
+  Op = DAG.getNode(X86ISD::BSR, DL, OpVT, Op);
+  if (VT == MVT::i8) {
----------------
This needs to be

```
  SDVTList VTs = DAG.getVTList(OpVT, MVT::i32);                                  
  Op = DAG.getNode(X86ISD::BSR, dl, VTs, Op);
```

We have X86ISD::BSR plumbed to also returns flags so it has a second i32 out put for the flags.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141464/new/

https://reviews.llvm.org/D141464



More information about the llvm-commits mailing list