[PATCH] D157984: [llvm][ARM][Neon][big-endian] Fix incorrect indexing of lanes
Amilendra Kodithuwakku via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 15 07:32:36 PDT 2023
amilendra created this revision.
amilendra added a reviewer: john.brawn.
Herald added subscribers: arphaman, hiraditya, kristof.beyls.
Herald added a project: All.
amilendra requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
Fixes #19762 (https://bugs.llvm.org/show_bug.cgi?id=1976)
vrev64.32 reverses the order of the 32-bit elements within each doubleword.
As a result, lane i of the operand register ends up in lane j of the
destination register as follows:
i=0 -> j=1
i=1 -> j=0
i=2 -> j=3
i=3 -> j=2
Take this lane swap into account when selecting the D subregister for a
32-bit lane in ARM Neon big-endian code generation.
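For reference, a minimal sketch of the lane arithmetic described above, written as standalone C++ rather than the TableGen/SelectionDAG code in the patch. The helper names vrev64_32_dest_lane and dsubreg_offset_i32 are hypothetical and exist only for illustration. The second helper assumes a single 128-bit Q register (lanes 0 to 3) and only mirrors the choice between dsub_1 - lane/2 and dsub_0 + lane/2 made in DSubReg_i32_reg; it does not model which lane is picked inside the D register.

```cpp
#include <cassert>

// Hypothetical helper: destination lane of source lane i after vrev64.32.
// Each 64-bit doubleword holds two 32-bit lanes and the instruction swaps
// them, so only the low bit of the lane index flips.
constexpr unsigned vrev64_32_dest_lane(unsigned i) { return i ^ 1u; }

// Hypothetical helper: offset from dsub_0 of the D subregister that holds
// a given 32-bit lane of a Q register. Assumes lane is in the range 0..3.
//   little-endian: dsub_0 + lane/2
//   big-endian:    dsub_1 - lane/2   (dsub_1 == dsub_0 + 1)
constexpr unsigned dsubreg_offset_i32(unsigned lane, bool bigEndian) {
  return bigEndian ? 1u - lane / 2u : lane / 2u;
}

// Compile-time checks of the mapping listed in the description.
static_assert(vrev64_32_dest_lane(0) == 1, "i=0 -> j=1");
static_assert(vrev64_32_dest_lane(1) == 0, "i=1 -> j=0");
static_assert(vrev64_32_dest_lane(2) == 3, "i=2 -> j=3");
static_assert(vrev64_32_dest_lane(3) == 2, "i=3 -> j=2");
```

Under these assumptions, dsubreg_offset_i32(3, true) gives 0 and dsubreg_offset_i32(0, true) gives 1, which is consistent with the d16 and d17 choices that the new vget_lane-be.ll test expects for q8.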
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D157984
Files:
llvm/lib/Target/ARM/ARMInstrInfo.td
llvm/test/CodeGen/ARM/legalize-bitcast.ll
llvm/test/CodeGen/ARM/vget_lane-be.ll
Index: llvm/test/CodeGen/ARM/vget_lane-be.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/ARM/vget_lane-be.ll
@@ -0,0 +1,51 @@
+; RUN: llc < %s -mattr=+neon | FileCheck %s
+target triple = "armebv8a-arm-none-eabihf"
+
+define i32 @ele0(<4 x i32> %a) {
+entry:
+;CHECK-LABEL: ele0:
+;CHECK-NEXT: .fnstart
+;CHECK-NEXT: @ %bb.0:
+;CHECK-NEXT: vrev64.32 q8, q0
+;CHECK-NEXT: vmov.32 r0, d16[1]
+;CHECK-NEXT: bx lr
+
+ %vget_lane = extractelement <4 x i32> %a, i64 3
+ ret i32 %vget_lane
+}
+
+define i32 @ele1(<4 x i32> %a) {
+entry:
+;CHECK-LABEL: ele1:
+;CHECK-NEXT: .fnstart
+;CHECK-NEXT: @ %bb.0:
+;CHECK-NEXT: vrev64.32 q8, q0
+;CHECK-NEXT: vmov.32 r0, d16[0]
+;CHECK-NEXT: bx lr
+ %vget_lane = extractelement <4 x i32> %a, i64 2
+ ret i32 %vget_lane
+}
+
+define i32 @ele2(<4 x i32> %a) {
+entry:
+;CHECK-LABEL: ele2:
+;CHECK-NEXT: .fnstart
+;CHECK-NEXT: @ %bb.0:
+;CHECK-NEXT: vrev64.32 q8, q0
+;CHECK-NEXT: vmov.32 r0, d17[1]
+;CHECK-NEXT: bx lr
+ %vget_lane = extractelement <4 x i32> %a, i64 1
+ ret i32 %vget_lane
+}
+
+define i32 @ele3(<4 x i32> %a) {
+entry:
+;CHECK-LABEL: ele3:
+;CHECK-NEXT: .fnstart
+;CHECK-NEXT: @ %bb.0:
+;CHECK-NEXT: vrev64.32 q8, q0
+;CHECK-NEXT: vmov.32 r0, d17[0]
+;CHECK-NEXT: bx lr
+ %vget_lane = extractelement <4 x i32> %a, i64 0
+ ret i32 %vget_lane
+}
Index: llvm/test/CodeGen/ARM/legalize-bitcast.ll
===================================================================
--- llvm/test/CodeGen/ARM/legalize-bitcast.ll
+++ llvm/test/CodeGen/ARM/legalize-bitcast.ll
@@ -23,7 +23,7 @@
; CHECK-NEXT: .LBB0_1: @ %bb.1
; CHECK-NEXT: vldmia sp, {d16, d17} @ 16-byte Reload
; CHECK-NEXT: vrev32.16 q8, q8
-; CHECK-NEXT: vmov.f64 d16, d17
+; CHECK-NEXT: @ kill: def $d16 killed $d16 killed $q8
; CHECK-NEXT: vmov.32 r0, d16[0]
; CHECK-NEXT: add sp, sp, #28
; CHECK-NEXT: pop {r4}
Index: llvm/lib/Target/ARM/ARMInstrInfo.td
===================================================================
--- llvm/lib/Target/ARM/ARMInstrInfo.td
+++ llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -487,8 +487,12 @@
}]>;
def DSubReg_i32_reg : SDNodeXForm<imm, [{
assert(ARM::dsub_7 == ARM::dsub_0+7 && "Unexpected subreg numbering");
- return CurDAG->getTargetConstant(ARM::dsub_0 + N->getZExtValue()/2, SDLoc(N),
- MVT::i32);
+ if (CurDAG->getDataLayout().isBigEndian())
+ return CurDAG->getTargetConstant(ARM::dsub_1 - N->getZExtValue()/2, SDLoc(N),
+ MVT::i32);
+ else
+ return CurDAG->getTargetConstant(ARM::dsub_0 + N->getZExtValue()/2, SDLoc(N),
+ MVT::i32);
}]>;
def DSubReg_f64_reg : SDNodeXForm<imm, [{
assert(ARM::dsub_7 == ARM::dsub_0+7 && "Unexpected subreg numbering");
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D157984.550329.patch
Type: text/x-patch
Size: 2843 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230815/52c62dfc/attachment.bin>