[llvm] [AArch64][GISel] Don't pointlessly lower G_TRUNC (PR #81479)

Nikita Popov via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 12 05:38:43 PST 2024


https://github.com/nikic created https://github.com/llvm/llvm-project/pull/81479

If we have something like G_TRUNC from v2s32 to v2s16, then lowering this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC from v2s16 to v2s8 does not bring us any closer to legality. In fact, the first part of that is a G_BUILD_VECTOR whose legalization will produce a new G_TRUNC from v2s32 to v2s16, and both G_TRUNCs will then get combined to the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the original vector is >128 bits, which is I believe the only case where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the alwaysLegal is a lie, as before), but it will cause a proper globalisel abort instead of an infinite legalization loop.

Fixes https://github.com/llvm/llvm-project/issues/81244.

>From bfc07cdeb570905d29f413f8c2c76dc21fb1677c Mon Sep 17 00:00:00 2001
From: Nikita Popov <npopov at redhat.com>
Date: Mon, 12 Feb 2024 14:31:24 +0100
Subject: [PATCH] [AArch64][GISel] Don't pointlessly lower G_TRUNC

If we have something like G_TRUNC from v2s32 to v2s16, then lowering
this to a concat of two G_TRUNC s32 to s16 followed by G_TRUNC
from v2s16 to v2s8 does not bring us any closer to legality. In
fact, the first part of that will get legalized back into G_TRUNC
from v2s32 to v2s16, and both G_TRUNC will then get combinde to
the original, causing a legalization cycle.

Make the lowering condition more precise, by requiring that the
original vector is >128 bits, which is I believe the only case
where this specific splitting approach is useful.

Note that this doesn't actually produce a legal result (the
alwaysLegal is a lie, as before), but it will cause a proper
globalisel abort instead of an infinite legalization loop.
---
 .../AArch64/GISel/AArch64LegalizerInfo.cpp    |  5 ++--
 .../AArch64/GlobalISel/legalize-xtn.mir       | 24 +++++++++++++++++++
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index cbf5655706e694..2ce1a7bda608e2 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -615,9 +615,8 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
       .lowerIf([=](const LegalityQuery &Query) {
         LLT DstTy = Query.Types[0];
         LLT SrcTy = Query.Types[1];
-        return DstTy.isVector() && (SrcTy.getSizeInBits() > 128 ||
-                                    (DstTy.getScalarSizeInBits() * 2 <
-                                     SrcTy.getScalarSizeInBits()));
+        return DstTy.isVector() && SrcTy.getSizeInBits() > 128 &&
+               DstTy.getScalarSizeInBits() * 2 <= SrcTy.getScalarSizeInBits();
       })
 
       .alwaysLegal();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
index 16b780a8397347..661265173ae82b 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-xtn.mir
@@ -529,3 +529,27 @@ body:             |
     RET_ReallyLR implicit $q0
 
 ...
+
+---
+name:            pr81244
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $d0
+    ; CHECK-LABEL: name: pr81244
+    ; CHECK: liveins: $d0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s32>) = COPY $d0
+    ; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(<2 x s8>) = G_TRUNC [[COPY]](<2 x s32>)
+    ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<2 x s8>), [[TRUNC]](<2 x s8>)
+    ; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[CONCAT_VECTORS]](<4 x s8>)
+    ; CHECK-NEXT: $d0 = COPY [[ANYEXT]](<4 x s16>)
+    ; CHECK-NEXT: RET_ReallyLR implicit $d0
+    %0:_(<2 x s32>) = COPY $d0
+    %1:_(<2 x s8>) = G_TRUNC %0(<2 x s32>)
+    %2:_(<4 x s8>) = G_CONCAT_VECTORS %1(<2 x s8>), %1(<2 x s8>)
+    %3:_(<4 x s16>) = G_ANYEXT %2(<4 x s8>)
+    $d0 = COPY %3(<4 x s16>)
+    RET_ReallyLR implicit $d0
+
+...



More information about the llvm-commits mailing list