[PATCH] AARCH64_BE load/store rules fix for ARM ABI
James Molloy
james.molloy at arm.com
Wed Feb 26 10:24:18 PST 2014
================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:4859
@@ +4858,3 @@
+// NEON-BE: allow all neon vectors as well, since ld1/st1 must be disabled
+// LD1 & ST1 are not ABI conforming in big endian: wrong arg memory layout
+// http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
----------------
As the AAPCS is also not too clear on this, could you please add in the comment the *reason* ld1/st1 can't be used? More specifically than "wrong arg memory layout", it is because the LD1 performs lane-by-lane byte swapping, and LDR swaps the entire D/Q register.
================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:4878
@@ +4877,3 @@
+ dag Base, dag Offset, dag address> {
+ // float is always there
+ // shouldn't this be guarded by HasFPARMv8 ???
----------------
What does this mean? is it a paste from the comment below? is fp16 always available in a64? Also capitalization, as Tim said.
================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:4879
@@ +4878,3 @@
+ // float is always there
+ // shouldn't this be guarded by HasFPARMv8 ???
+ defm : ls_neutral_pats<LOAD, STORE, Base, Offset, address, f16>;
----------------
Please add a FIXME and remove trailing "??"
================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:4897
@@ +4896,3 @@
+ defm : ls_neutral_pats<LOAD, STORE, Base, Offset, address, v1i32>;
+// defm : ls_neutral_pats<LOAD, STORE, Base, Offset, address, v1f32>; does not exist - v1f64 DOES -- WHY ?
+ }
----------------
v1f64 is there because some instructions that act on f64's have both NEON and VFP variants, and in order that NEON intrinsics written by the user select the instruction that the user requested, there must be a way to distinguish between the two at the type level.
That is my understanding anyway - Hao or Jiangning will know more about this.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3383
@@ +3382,3 @@
+
+// LD1 disallowed in BE, when LDR and STR are used exclusively to save on 12bit offset adds.
+// reason: LDR/STR use different memory/register layout (no element swaps).
----------------
This line of the comment doesn't make sense. LD1 is disallowed, but it has nothing to do with 12bit offset adds! You give the reasoning below.
Also in LLVM we have a tendency to write comments in a fairly prose-like form to make it easier to read. For example, instead of "LD1 disallowed in BE", "LD1 is disallowed in BE mode". Also instead of "reason: ", "This is because..."
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3390
@@ +3389,3 @@
+
+// this should work in BE - single operand: no element swaps
+let Predicates = [IsLE] in {
----------------
The comment says this should work in BE mode... then the predicate stops it working in BE mode. Why?
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3461
@@ -3445,1 +3460,3 @@
+
// Store multiple 1-element structures from N consecutive registers (N = 2,3,4)
+ defm ST1x2 : STVList_BHSD<0b1010, "VPair", "st1">;
----------------
Failed to reindent this line?
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3888
@@ -3866,2 +3887,3 @@
+let Predicates = [IsLE] in {
// Load single 1-element structure to all lanes of 1 register
----------------
Shouldn't a splatting LD1 still work in BE mode?
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3908
@@ -3883,1 +3907,3 @@
+let Predicates = [IsLE] in {
+// element swap on bytes == byte swap
def : LD1R_pattern<v8i8, i32, extloadi8, LD1R_8B>;
----------------
This comment is cryptic. Also, why should we care about byte swapping here, we're under the IsLE predicate.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3931
@@ -3904,3 +3930,3 @@
-def : LD1R_pattern_v1<v1i64, i64, load, LD1R_1D>;
-def : LD1R_pattern_v1<v1f64, f64, load, LD1R_1D>;
+let Predicates = [IsLE] in {
+ def : LD1R_pattern_v1<v1i64, i64, load, LD1R_1D>;
----------------
ld1.64 should be fine too, right? Because ld1.64 acts the same as LDR (byte swapping on a 64-bit value).
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3996
@@ -3966,3 +3995,3 @@
-// Load single 1-element structure to one lane of 1 register.
-defm LD1LN : LDN_Lane_BHSD<0b0, 0b0, "VOne", "ld1">;
+let Predicates = [IsLE] in {
+ // Load single 1-element structure to one lane of 1 register.
----------------
Will 1-element to 1-lane also work in BE mode?
http://llvm-reviews.chandlerc.com/D2884
More information about the llvm-commits
mailing list