[llvm] [InterleavedAccessPass] Get round the unsupported large scalarize vectors (PR #88643)
via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 13 22:34:15 PDT 2024
https://github.com/vfdff created https://github.com/llvm/llvm-project/pull/88643
When build with option -msve-vector-bits=512, the return vaule of Subtarget->getMinSVEVectorSizeInBits() is 512;
While the MinElts is still 4 for <vscale x 4 x double> in getNumInterleavedAccesses, so it creates invalid
llvm.aarch64.sve.ld2.sret.nxv4f64, which need be splited. unlikely, the related custom spilting is not supported now.
Fix https://github.com/llvm/llvm-project/issues/88247
>From 6416c6d25b9a86819a14ff196a59d0592e49b001 Mon Sep 17 00:00:00 2001
From: zhongyunde 00443407 <zhongyunde at huawei.com>
Date: Fri, 12 Apr 2024 01:55:52 -0400
Subject: [PATCH] [InterleavedAccessPass] Get round the unsupported large
scalarize vectors
When build with option -msve-vector-bits=512, the return vaule of
Subtarget->getMinSVEVectorSizeInBits() is 512;
While the MinElts is still 4 for <vscale x 4 x double> in
getNumInterleavedAccesses, so it creates invalid
llvm.aarch64.sve.ld2.sret.nxv4f64, which need be splited.
unlikely, the related custom spilting is not supported now.
---
.../Target/AArch64/AArch64ISelLowering.cpp | 4 ++-
.../AArch64/sve-interleaved-accesses.ll | 25 +++++++++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index f552f91929201c..24ce3a8121500d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -16433,7 +16433,9 @@ bool AArch64TargetLowering::lowerDeinterleaveIntrinsicToLoad(
if (UseScalable && !VTy->isScalableTy())
return false;
- unsigned NumLoads = getNumInterleavedAccesses(VTy, DL, UseScalable);
+ // Get around for the missing of split large size scalarize vectors.
+ // TODO: Support the custom split of large size scalarize vectors
+ unsigned NumLoads = getNumInterleavedAccesses(VTy, DL, false);
VectorType *LdTy =
VectorType::get(VTy->getElementType(),
diff --git a/llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll b/llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
index feb22aa1a37635..d0f773a6174d24 100644
--- a/llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
+++ b/llvm/test/Transforms/InterleavedAccess/AArch64/sve-interleaved-accesses.ll
@@ -491,6 +491,31 @@ define void @store_bfloat_factor2(ptr %ptr, <16 x bfloat> %v0, <16 x bfloat> %v1
ret void
}
+; Use 4 vliad llvm.vector.insert.nxv4f64.nxv2f64
+define { <vscale x 4 x double>, <vscale x 4 x double> } @deinterleave_nxptr_factor2(ptr nocapture readonly %ptr) #2 {
+; CHECK-LABEL: define { <vscale x 4 x double>, <vscale x 4 x double> } @deinterleave_nxptr_factor2(
+; CHECK-SAME: ptr nocapture readonly [[PTR:%.*]]) #[[ATTR2]] {
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr <vscale x 2 x double>, ptr [[PTR]], i64 0
+; CHECK-NEXT: [[LDN1:%.*]] = call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sve.ld2.sret.nxv2f64(<vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 true, i64 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), ptr [[TMP1]])
+; CHECK-NEXT: [[TMP2:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[LDN1]], 0
+; CHECK-NEXT: [[TMP3:%.*]] = call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP2]], i64 0)
+; CHECK-NEXT: [[TMP4:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[LDN1]], 1
+; CHECK-NEXT: [[TMP5:%.*]] = call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> poison, <vscale x 2 x double> [[TMP4]], i64 0)
+; CHECK-NEXT: [[TMP6:%.*]] = getelementptr <vscale x 2 x double>, ptr [[PTR]], i64 2
+; CHECK-NEXT: [[LDN2:%.*]] = call { <vscale x 2 x double>, <vscale x 2 x double> } @llvm.aarch64.sve.ld2.sret.nxv2f64(<vscale x 2 x i1> shufflevector (<vscale x 2 x i1> insertelement (<vscale x 2 x i1> poison, i1 true, i64 0), <vscale x 2 x i1> poison, <vscale x 2 x i32> zeroinitializer), ptr [[TMP6]])
+; CHECK-NEXT: [[TMP7:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[LDN2]], 0
+; CHECK-NEXT: [[TMP8:%.*]] = call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP3]], <vscale x 2 x double> [[TMP7]], i64 2)
+; CHECK-NEXT: [[TMP9:%.*]] = extractvalue { <vscale x 2 x double>, <vscale x 2 x double> } [[LDN2]], 1
+; CHECK-NEXT: [[TMP10:%.*]] = call <vscale x 4 x double> @llvm.vector.insert.nxv4f64.nxv2f64(<vscale x 4 x double> [[TMP5]], <vscale x 2 x double> [[TMP9]], i64 2)
+; CHECK-NEXT: [[TMP11:%.*]] = insertvalue { <vscale x 4 x double>, <vscale x 4 x double> } poison, <vscale x 4 x double> [[TMP8]], 0
+; CHECK-NEXT: [[TMP12:%.*]] = insertvalue { <vscale x 4 x double>, <vscale x 4 x double> } [[TMP11]], <vscale x 4 x double> [[TMP10]], 1
+; CHECK-NEXT: ret { <vscale x 4 x double>, <vscale x 4 x double> } [[TMP12]]
+;
+ %wide.vec = load <vscale x 8 x double>, ptr %ptr, align 8
+ %ldN = tail call { <vscale x 4 x double>, <vscale x 4 x double> } @llvm.experimental.vector.deinterleave2.nxv8f64(<vscale x 8 x double> %wide.vec)
+ ret { <vscale x 4 x double>, <vscale x 4 x double> } %ldN
+}
+
attributes #0 = { vscale_range(2,2) "target-features"="+sve" }
attributes #1 = { vscale_range(2,4) "target-features"="+sve" }
attributes #2 = { vscale_range(4,4) "target-features"="+sve" }
More information about the llvm-commits
mailing list