[PATCH] D158613: [AArch64] Mark known zero for high 16-bits of uaddlv intrinsic output with v8i8
JinGu Kang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 23 06:35:26 PDT 2023
jaykang10 created this revision.
jaykang10 added reviewers: dmgreen, efriedma, t.p.northover.
Herald added subscribers: hiraditya, kristof.beyls.
Herald added a project: All.
jaykang10 requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
If the `llvm.aarch64.neon.uaddlv` intrinsic has a `v8i8` input, it returns a 16-bit value.
Clang generates `llvm.aarch64.neon.uaddlv.i32.v8i8` followed by a `trunc` to `i16` for the `vaddlv_u8` NEON intrinsic. This results in an additional `and 0xffff` instruction, as the attached example shows below.
foo: // @foo
uaddlv h0, v0.8b
fmov w8, s0
and w0, w8, #0xffff
ret
If we mark the high 16 bits of the uaddlv intrinsic output as known zero for a `v8i8` input, we can avoid the additional `and 0xffff`.
foo: // @foo
uaddlv h0, v0.8b
fmov w0, s0
ret
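As a rough illustration of why the mask is safe, here is a minimal standalone C++ sketch. It is not LLVM code: `uaddlvModel` is a hypothetical scalar model of the intrinsic for eight unsigned 8-bit lanes, and `highBitsSet` is a hypothetical stand-in for `APInt::getHighBitsSet`, restricted to widths of at most 32 bits. The largest possible sum is 8 * 255 = 2040, which fits comfortably in 16 bits, so the upper 16 bits of the i32 result are always zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for APInt::getHighBitsSet, limited to widths <= 32.
// Returns a value whose top `hiBits` bits (within `bitWidth`) are set.
uint32_t highBitsSet(unsigned bitWidth, unsigned hiBits) {
  assert(bitWidth >= 1 && bitWidth <= 32 && hiBits <= bitWidth);
  if (hiBits == 0)
    return 0;
  uint32_t widthMask = bitWidth == 32 ? 0xFFFFFFFFu : ((1u << bitWidth) - 1u);
  uint32_t lowMask =
      hiBits == bitWidth ? 0u : ((1u << (bitWidth - hiBits)) - 1u);
  return widthMask & ~lowMask;
}

// Hypothetical scalar model of uaddlv on a v8i8 input:
// zero-extend each lane and add all lanes together.
uint32_t uaddlvModel(const uint8_t (&lanes)[8]) {
  uint32_t sum = 0;
  for (uint8_t lane : lanes)
    sum += lane;
  return sum;
}
```

With all lanes set to 255, `uaddlvModel` yields 2040, and ANDing that with `highBitsSet(32, 16)` gives zero, which is the property the patch records in `Known.Zero`.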
https://reviews.llvm.org/D158613
Files:
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/neon-addlv.ll
Index: llvm/test/CodeGen/AArch64/neon-addlv.ll
===================================================================
--- llvm/test/CodeGen/AArch64/neon-addlv.ll
+++ llvm/test/CodeGen/AArch64/neon-addlv.ll
@@ -150,3 +150,16 @@
%tmp5 = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> %tmp3)
ret i32 %tmp5
}
+
+declare i32 @llvm.aarch64.neon.uaddlv.i32.v8i8(<8 x i8>) nounwind readnone
+
+define i32 @uaddlv_known_bits(<8 x i8> %a) {
+; CHECK-LABEL: uaddlv_known_bits:
+; CHECK: // %bb.0:
+; CHECK-NEXT: uaddlv h0, v0.8b
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+ %tmp1 = tail call i32 @llvm.aarch64.neon.uaddlv.i32.v8i8(<8 x i8> %a)
+ %tmp2 = and i32 %tmp1, 65535
+ ret i32 %tmp2
+}
Index: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
===================================================================
--- llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2162,6 +2162,16 @@
switch (IntNo) {
default:
break;
+ case Intrinsic::aarch64_neon_uaddlv: {
+ MVT VT = Op.getOperand(1).getValueType().getSimpleVT();
+ unsigned BitWidth = Known.getBitWidth();
+ if (VT == MVT::v8i8) {
+ assert(BitWidth >= 16 && "Unexpected width!");
+ APInt Mask = APInt::getHighBitsSet(BitWidth, BitWidth - 16);
+ Known.Zero |= Mask;
+ }
+ break;
+ }
case Intrinsic::aarch64_neon_umaxv:
case Intrinsic::aarch64_neon_uminv: {
// Figure out the datatype of the vector operand. The UMINV instruction
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D158613.552691.patch
Type: text/x-patch
Size: 1543 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230823/063cd908/attachment.bin>