[llvm] [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (PR #88512)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon May 20 02:45:03 PDT 2024
================
@@ -4145,6 +4156,25 @@ bool AMDGPULegalizerInfo::legalizeCTLZ_CTTZ(MachineInstr &MI,
   return true;
 }
+bool AMDGPULegalizerInfo::legalizeCTLZ_ZERO_UNDEF(MachineInstr &MI,
+                                                  MachineRegisterInfo &MRI,
+                                                  MachineIRBuilder &B) const {
+  Register Dst = MI.getOperand(0).getReg();
+  Register Src = MI.getOperand(1).getReg();
+  LLT SrcTy = MRI.getType(Src);
+  TypeSize NumBits = SrcTy.getSizeInBits();
+
+  assert(NumBits < 32u);
+
+  auto ShiftAmt = B.buildConstant(S32, 32u - NumBits);
+  auto Extend = B.buildAnyExt(S32, {Src}).getReg(0u);
+  auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt);
----------------
jayfoad wrote:
No, this should be `buildShl`!
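
To spell out the reasoning, here is a rough standalone sketch of the arithmetic, not the legalizer code itself. After an any-extend to 32 bits the high bits are unspecified, so a logical shift right by `32 - NumBits` would discard the source bits and keep only the unspecified ones. A shift left moves the source bits to the top of the register and discards the unspecified bits, so a 32-bit count of leading zeros then equals the narrow count. The helper names `ctlz32` and `ctlzNarrowViaShl` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Portable 32-bit count-leading-zeros. Like ctlz_zero_undef, the
// result is only meaningful for a non-zero input.
inline unsigned ctlz32(uint32_t X) {
  assert(X != 0 && "zero input is undefined for this operation");
  unsigned N = 0;
  while ((X & 0x80000000u) == 0) {
    X <<= 1;
    ++N;
  }
  return N;
}

// Hypothetical model of the lowering: AnyExtended holds the narrow
// value in its low NumBits bits; the remaining high bits stand in for
// the unspecified bits an any-extend may produce.
inline unsigned ctlzNarrowViaShl(uint32_t AnyExtended, unsigned NumBits) {
  assert(NumBits > 0 && NumBits < 32);
  // Shift left so the narrow value's MSB lands in bit 31; the
  // unspecified high bits fall off the top.
  uint32_t Shifted = AnyExtended << (32u - NumBits);
  return ctlz32(Shifted);
}
```

For example, `ctlzNarrowViaShl(0xDEADBE10u, 8)` ignores the upper 24 bits and counts the leading zeros of the i8 value `0x10`, which is 3.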
https://github.com/llvm/llvm-project/pull/88512