[llvm] [AArch64][GlobalISel] Widen non-power2 element sizes for ctlz. (PR #189371)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 30 06:00:15 PDT 2026
llvmbot wrote:
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-aarch64
Author: David Green (davemgreen)
<details>
<summary>Changes</summary>
This addresses an illegal mutation that caused GlobalISel to hit an assert. It widens vector elements whose size is not a power of 2, or is smaller than i8, to a power-of-2 size.
A fix to handle vector types correctly was also needed in LegalizerHelper.
Fixes #185411
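As a rough illustration of why the per-element (scalar) width matters in `widenScalar`, here is a minimal standalone sketch that models a single `i4` lane widened to `i8`. It is not LLVM code: every function name below is hypothetical, and the lane is modelled with a plain `uint8_t`. For `<8 x s4>` widened to `<8 x s8>`, the total sizes are 32 and 64 bits, so a size difference or top-bit position taken from the whole vector would not match the 4-bit and 8-bit lanes that the count actually operates on.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative lane widths: an i4 element widened to i8.
constexpr unsigned NarrowBits = 4;
constexpr unsigned WideBits = 8;

// Count leading zeros of an 8-bit value; returns 8 for zero.
unsigned clz8(uint8_t X) {
  unsigned N = 0;
  for (uint8_t Mask = 0x80; Mask != 0 && (X & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Count trailing zeros of an 8-bit value; returns 8 for zero.
unsigned ctz8(uint8_t X) {
  unsigned N = 0;
  for (unsigned Bit = 0; Bit < 8 && ((X >> Bit) & 1) == 0; ++Bit)
    ++N;
  return N;
}

// Reference results computed directly on the low 4 bits.
unsigned clz4Reference(uint8_t X) {
  unsigned N = 0;
  for (int Bit = 3; Bit >= 0 && ((X >> Bit) & 1) == 0; --Bit)
    ++N;
  return N;
}

unsigned ctz4Reference(uint8_t X) {
  unsigned N = 0;
  for (int Bit = 0; Bit < 4 && ((X >> Bit) & 1) == 0; ++Bit)
    ++N;
  return N;
}

// ctlz on the wide lane: zero-extend, count, then subtract the
// per-element size difference (8 - 4 = 4).
unsigned clz4ViaWide(uint8_t X) {
  unsigned SizeDiff = WideBits - NarrowBits;
  return clz8(X & 0x0F) - SizeDiff;
}

// cttz on the wide lane: set the bit just above the narrow type
// (bit 4, value 0x10) so that a zero input yields 4 rather than 8.
unsigned ctz4ViaWide(uint8_t X) {
  uint8_t TopBit = uint8_t(1u << NarrowBits);
  return ctz8(uint8_t(X | TopBit));
}

// Exhaustively compare the widened forms against the i4 reference.
void checkAllNibbles() {
  for (unsigned V = 0; V < 16; ++V) {
    assert(clz4ViaWide(uint8_t(V)) == clz4Reference(uint8_t(V)));
    assert(ctz4ViaWide(uint8_t(V)) == ctz4Reference(uint8_t(V)));
  }
}
```

The `clz4ViaWide` shape (mask with 15, count on the 8-bit lane, subtract 4) corresponds to the `and` / `clz` / `sub` sequence in the new `v8i4` test.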
---
Full diff: https://github.com/llvm/llvm-project/pull/189371.diff
3 Files Affected:
- (modified) llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp (+4-3)
- (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+1)
- (modified) llvm/test/CodeGen/AArch64/ctlz.ll (+13)
``````````diff
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 72ca4380a630b..81ff99969e252 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -2833,15 +2833,16 @@ LegalizerHelper::widenScalar(MachineInstr &MI, unsigned TypeIdx, LLT WideTy) {
// The count is the same in the larger type except if the original
// value was zero. This can be handled by setting the bit just off
// the top of the original type.
- auto TopBit =
- APInt::getOneBitSet(WideTy.getSizeInBits(), CurTy.getSizeInBits());
+ auto TopBit = APInt::getOneBitSet(WideTy.getScalarSizeInBits(),
+ CurTy.getScalarSizeInBits());
MIBSrc = MIRBuilder.buildOr(
WideTy, MIBSrc, MIRBuilder.buildConstant(WideTy, TopBit));
// Now we know the operand is non-zero, use the more relaxed opcode.
NewOpc = TargetOpcode::G_CTTZ_ZERO_UNDEF;
}
- unsigned SizeDiff = WideTy.getSizeInBits() - CurTy.getSizeInBits();
+ unsigned SizeDiff =
+ WideTy.getScalarSizeInBits() - CurTy.getScalarSizeInBits();
if (Opcode == TargetOpcode::G_CTLZ_ZERO_UNDEF) {
// An optimization where the result is the CTLZ after the left shift by
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 61ba8960d526b..74b73687d9ad5 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -350,6 +350,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
{v4s32, v4s32}})
.widenScalarToNextPow2(1, /*Min=*/32)
.clampScalar(1, s32, s64)
+ .widenScalarOrEltToNextPow2OrMinSize(1, /*Min=*/8)
.clampNumElements(0, v8s8, v16s8)
.clampNumElements(0, v4s16, v8s16)
.clampNumElements(0, v2s32, v4s32)
diff --git a/llvm/test/CodeGen/AArch64/ctlz.ll b/llvm/test/CodeGen/AArch64/ctlz.ll
index b1b869ec9e1ff..b45b48d06071b 100644
--- a/llvm/test/CodeGen/AArch64/ctlz.ll
+++ b/llvm/test/CodeGen/AArch64/ctlz.ll
@@ -594,3 +594,16 @@ entry:
%s = call <4 x i128> @llvm.ctlz(<4 x i128> %d, i1 false)
ret <4 x i128> %s
}
+
+define <8 x i4> @v8i4(<8 x i4> %a) {
+; CHECK-LABEL: v8i4:
+; CHECK: // %bb.0:
+; CHECK-NEXT: movi v1.8b, #15
+; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
+; CHECK-NEXT: movi v1.8b, #4
+; CHECK-NEXT: clz v0.8b, v0.8b
+; CHECK-NEXT: sub v0.8b, v0.8b, v1.8b
+; CHECK-NEXT: ret
+ %r = call <8 x i4> @llvm.ctlz(<8 x i4> %a, i1 false)
+ ret <8 x i4> %r
+}
``````````
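The element width that the new `widenScalarOrEltToNextPow2OrMinSize(1, /*Min=*/8)` rule is expected to pick can be modelled with the arithmetic below. This is only a sketch under the assumption, taken from the rule's name and the description above, that the result is the next power of 2 of the element size, raised to the minimum if smaller. The helper names are hypothetical and are not part of LLVM.

```cpp
#include <algorithm>

// Smallest power of two that is >= Bits (expects Bits > 0).
unsigned nextPow2(unsigned Bits) {
  unsigned P = 1;
  while (P < Bits)
    P <<= 1;
  return P;
}

// Hypothetical model of the widened element width: round the element
// size up to a power of two, then apply the minimum size.
unsigned widenedEltBits(unsigned EltBits, unsigned MinBits) {
  return std::max(nextPow2(EltBits), MinBits);
}
```

Under this model an `s4` element becomes `s8`, which matches the `v8i4` test operating on `v0.8b`, while an element that is already a power of 2 and at least 8 bits wide is left unchanged.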
</details>
https://github.com/llvm/llvm-project/pull/189371