[llvm] [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (PR #88512)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed May 8 01:51:27 PDT 2024
================
@@ -4145,6 +4156,24 @@ bool AMDGPULegalizerInfo::legalizeCTLZ_CTTZ(MachineInstr &MI,
return true;
}
+bool AMDGPULegalizerInfo::legalizeCTLZ_ZERO_UNDEF(MachineInstr &MI,
+ MachineRegisterInfo &MRI,
+ MachineIRBuilder &B) const {
+ Register Dst = MI.getOperand(0).getReg();
+ Register Src = MI.getOperand(1).getReg();
+ LLT SrcTy = MRI.getType(Src);
+ TypeSize NumBits = SrcTy.getSizeInBits();
+
+ assert(NumBits < 32u);
+
+ auto ShiftAmt = B.buildConstant(S32, 32u - NumBits);
+ auto Extend = B.buildAnyExt(S32, {Src}).getReg(0u);
+ auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt).getReg(0u);
----------------
arsenm wrote:
```suggestion
auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt);
```
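
For context, the left-shift lowering named in the PR title can be modelled outside of LLVM. The sketch below is a standalone illustration, not code from the patch: `clz32` and `ctlzZeroUndefNarrow` are hypothetical helper names, and `anyExt` stands in for the any-extended source, whose upper bits are arbitrary. Shifting left by `32 - NumBits` moves the narrow value into the top of the 32-bit register and discards those arbitrary bits, so a 32-bit leading-zero count yields the narrow result directly. It assumes the narrow input is non-zero, which is the `zero_undef` precondition.

```cpp
#include <cassert>
#include <cstdint>

// Reference 32-bit count-leading-zeros. Precondition: x != 0.
static unsigned clz32(uint32_t x) {
  assert(x != 0 && "result is undefined for zero input");
  unsigned n = 0;
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

// Hypothetical model of the lowering for a numBits-wide source (numBits < 32).
// anyExt holds the source in its low numBits bits; the upper bits may be
// garbage, as they would be after an any-extend.
static unsigned ctlzZeroUndefNarrow(uint32_t anyExt, unsigned numBits) {
  assert(numBits > 0 && numBits < 32);
  // Move the narrow value into the top bits; the garbage bits fall off the end
  // and the vacated low bits are zero.
  uint32_t shifted = anyExt << (32u - numBits);
  return clz32(shifted);
}
```

Because the shift clears everything except the original narrow bits, no masking or subtraction of `32 - NumBits` is needed after the count.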
https://github.com/llvm/llvm-project/pull/88512