[llvm] [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (PR #88512)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu May 2 12:45:07 PDT 2024
================
@@ -1270,13 +1270,22 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
.custom();
// The 64-bit versions produce 32-bit results, but only on the SALU.
- getActionDefinitionsBuilder({G_CTLZ_ZERO_UNDEF, G_CTTZ_ZERO_UNDEF})
- .legalFor({{S32, S32}, {S32, S64}})
- .clampScalar(0, S32, S32)
- .clampScalar(1, S32, S64)
- .scalarize(0)
- .widenScalarToNextPow2(0, 32)
- .widenScalarToNextPow2(1, 32);
+ getActionDefinitionsBuilder(G_CTLZ_ZERO_UNDEF)
+ .legalFor({{S32, S32}, {S32, S64}})
+ .customFor({{S32, S8}, {S32, S16}})
----------------
arsenm wrote:
It doesn't matter how many bits it was originally. You're shifting away the high bits, so it doesn't need to be restricted to 8 or 16.
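
To illustrate the point, here is a minimal scalar sketch in plain C++. It is not the legalizer code from the PR. The helper names (ctlz32, ctlzNarrowViaShift) are hypothetical, and it assumes the narrow value sits in the low bits of a 32-bit register whose high bits are unspecified, as after an any-extend.

```cpp
#include <cassert>
#include <cstdint>

// Reference count of leading zeros for a 32-bit value.
// Returns 32 for zero only so the loop terminates; the *_ZERO_UNDEF
// opcodes leave that case undefined.
static unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  while (N < 32 && !(X & 0x80000000u)) {
    X <<= 1;
    ++N;
  }
  return N;
}

// Hypothetical model of the lowering: count the leading zeros of a
// Bits-wide value stored in the low bits of a 32-bit register.
// The left shift moves the narrow value's MSB to bit 31 and discards
// whatever was in the high bits, so no masking is needed and the
// same code works for any Bits in [1, 32], not just 8 or 16.
static unsigned ctlzNarrowViaShift(uint32_t AnyExtended, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "width must fit in 32 bits");
  uint32_t Shifted = AnyExtended << (32 - Bits);
  return ctlz32(Shifted);
}
```

Under these assumptions, a 13-bit input is handled exactly like an 8-bit or 16-bit one, which is why the custom action would not need to be limited to S8 and S16 sources.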
https://github.com/llvm/llvm-project/pull/88512