[PATCH] D45821: [AArch64] improve code generation of vectors smaller than 64 bit
Sebastian Pop via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 19 08:58:07 PDT 2018
sebpop created this revision.
sebpop added reviewers: eli.friedman, kristof.beyls, javed.absar, evandro.
Herald added subscribers: hiraditya, rengolin.
This changes the legalization of the small vectors v2i8, v4i8, and v2i16 from
integer promotion (e.g., v4i8 -> v4i16) to vector widening (e.g., v4i8 -> v8i8).
This allows the AArch64 backend to select larger vector instructions
for middle-end vectors with fewer lanes.
In the example below, AArch64 has no add instruction for v4i8;
after widening, the backend can match the operation with the add for v8i8.
The widened lanes are not used in the final result, and the backend
knows how to keep those lanes undef.
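The difference between the two legalization strategies can be sketched with a small model. This is only an illustration: `VecTy`, `promote`, and `widen` are hypothetical names, not LLVM's EVT/MVT API, and the model assumes a 64-bit target register and power-of-two doubling, which is a simplification of what the type legalizer does.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a small integer vector type such as v4i8.
struct VecTy {
  unsigned NumElts; // number of lanes
  unsigned EltBits; // bits per lane
  unsigned bits() const { return NumElts * EltBits; }
  std::string name() const {
    return "v" + std::to_string(NumElts) + "i" + std::to_string(EltBits);
  }
};

// Integer promotion: keep the lane count and grow each element until the
// vector fills an (assumed) 64-bit register.
VecTy promote(VecTy T) {
  assert(T.bits() <= 64 && "model only covers vectors up to 64 bits");
  while (T.bits() < 64)
    T.EltBits *= 2;
  return T;
}

// Vector widening: keep the element width and add lanes until the vector
// fills an (assumed) 64-bit register. The extra lanes carry no data.
VecTy widen(VecTy T) {
  assert(T.bits() <= 64 && "model only covers vectors up to 64 bits");
  while (T.bits() < 64)
    T.NumElts *= 2;
  return T;
}
```

In this model, `promote(VecTy{4, 8})` yields v4i16 while `widen(VecTy{4, 8})` yields v8i8, so only widening keeps the byte-sized element type that the v8i8 add operates on.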
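With this change we can lower the minimum vector width for SLP and loop
vectorization from 64 bits to 16 bits, and drop the extra cost previously
charged for loads and stores of i8 vectors with fewer than 8 elements.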
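To get a feel for how large the old penalty was, the arithmetic of the block removed from AArch64TargetTransformInfo.cpp in the diff below can be reproduced in isolation. The function name `oldSmallI8MemOpCost` is hypothetical; only the two multiplications are taken from the removed code.

```cpp
#include <cassert>

// Hypothetical standalone model of the removed cost penalty.
// NumVecElts is the lane count of an i8 vector with fewer than 8 lanes.
unsigned oldSmallI8MemOpCost(unsigned NumVecElts) {
  assert(NumVecElts < 8 && "the penalty only applied below 8 lanes");
  unsigned NumVectorizableInstsToAmortize = NumVecElts * 2;
  // The removed code assumed 2 instructions per vector element.
  return NumVectorizableInstsToAmortize * NumVecElts * 2;
}
```

Under this model a v4i8 load or store was charged 64 and a v2i8 one was charged 16. With the block removed, those types fall through to the ordinary `return LT.first;` path.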
Here is an example of SLP vectorization:
void fun(char *restrict out, char *restrict in) {
  *out++ = *in++ + 1;
  *out++ = *in++ + 2;
  *out++ = *in++ + 3;
  *out++ = *in++ + 4;
}
With this patch we now generate vector code:
fun:
ldr s0, [x1]
adrp x8, .LCPI0_0
ldr d1, [x8, :lo12:.LCPI0_0]
add v0.8b, v0.8b, v1.8b
st1 { v0.s }[0], [x0]
ret
Previously we generated scalar code:
fun:
ldrb w8, [x1]
add w8, w8, #1
strb w8, [x0]
ldrb w8, [x1, #2]
add w8, w8, #3
ldrb w9, [x1, #1]
add w9, w9, #2
strb w9, [x0, #1]
strb w8, [x0, #2]
ldrb w8, [x1, #3]
add w8, w8, #4
strb w8, [x0, #3]
ret
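As a sanity check on why the unused upper lanes are harmless, the vector sequence can be emulated in portable C++ and compared with a scalar reference. This is a sketch under assumptions: `fun_widened` and `fun_scalar` are hypothetical names, and the upper four lanes of the constant at .LCPI0_0 are assumed to be zero here (the post does not show the constant pool, and those lanes may in fact be undef).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Emulates the widened sequence: a 4-byte load into the low half of an
// 8-lane byte vector, an 8-lane add, and a 4-byte store of the low lanes.
void fun_widened(char *out, const char *in) {
  std::array<std::uint8_t, 8> v0{};      // upper four lanes start as zero
  std::memcpy(v0.data(), in, 4);         // models: ldr s0, [x1]
  // Lanes 0..3 come from the C source; lanes 4..7 are an assumption.
  const std::array<std::uint8_t, 8> v1 = {1, 2, 3, 4, 0, 0, 0, 0};
  for (int i = 0; i < 8; ++i)            // models: add v0.8b, v0.8b, v1.8b
    v0[i] = static_cast<std::uint8_t>(v0[i] + v1[i]);
  std::memcpy(out, v0.data(), 4);        // models: st1 { v0.s }[0], [x0]
}

// Scalar reference matching the four statements of fun().
void fun_scalar(char *out, const char *in) {
  for (int i = 0; i < 4; ++i)
    out[i] = static_cast<char>(in[i] + (i + 1));
}
```

Whatever ends up in lanes 4 to 7 after the add is never stored, so both functions write the same four bytes.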
https://reviews.llvm.org/D45821
Files:
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64Subtarget.h
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
Index: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -612,16 +612,6 @@
return LT.first * 2 * AmortizationCost;
}
- if (Ty->isVectorTy() && Ty->getVectorElementType()->isIntegerTy(8) &&
- Ty->getVectorNumElements() < 8) {
- // We scalarize the loads/stores because there is not v.4b register and we
- // have to promote the elements to v.4h.
- unsigned NumVecElts = Ty->getVectorNumElements();
- unsigned NumVectorizableInstsToAmortize = NumVecElts * 2;
- // We generate 2 instructions per vector element.
- return NumVectorizableInstsToAmortize * NumVecElts * 2;
- }
-
return LT.first;
}
Index: llvm/lib/Target/AArch64/AArch64Subtarget.h
===================================================================
--- llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -97,7 +97,7 @@
bool NegativeImmediates = true;
// Enable 64-bit vectorization in SLP.
- unsigned MinVectorRegisterBitWidth = 64;
+ unsigned MinVectorRegisterBitWidth = 16;
bool UseAA = false;
bool PredictableSelectIsExpensive = false;
Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -11015,10 +11015,12 @@
TargetLoweringBase::LegalizeTypeAction
AArch64TargetLowering::getPreferredVectorAction(EVT VT) const {
MVT SVT = VT.getSimpleVT();
- // During type legalization, we prefer to widen v1i8, v1i16, v1i32 to v8i8,
- // v4i16, v2i32 instead of to promote.
- if (SVT == MVT::v1i8 || SVT == MVT::v1i16 || SVT == MVT::v1i32
- || SVT == MVT::v1f32)
+ // During type legalization, we prefer to widen v1i8, v2i8, v4i8, v1i16,
+ // v2i16, v1i32, v1f32 to v8i8, v4i16, v2i32, v2f32 instead of to promote.
+ if (SVT == MVT::v1i8 || SVT == MVT::v1i16 || SVT == MVT::v1i32 ||
+ SVT == MVT::v1f32
+ || SVT == MVT::v2i8 || SVT == MVT::v4i8 || SVT == MVT::v2i16
+ )
return TypeWidenVector;
return TargetLoweringBase::getPreferredVectorAction(VT);