[PATCH] Implement aarch64 neon instruction class AdvSIMD (by element) - LLVM

Jiangning Liu liujiangning1 at gmail.com
Fri Sep 27 03:00:35 PDT 2013


  Hi Tim,

  Overall, I accept almost all of your comments. Particularly, I removed the patterns for generating fma when -ffp-contract=fast is unavailable. I also removed Neon_DUP from my patch.

  For register definition 'Re', I didn't make change. Re is different from Rm and the Rm in documentation for this instruction class is different from the Rm defined for A64Inst. Rm is being defined in llvm as Inst{20-16}, while Re is not, because Inst{20} is being shared for both Rm and bit 'M'. This is also why the index is restricted to V0-V15 for vector with H element, I think.

  Thanks,
  -Jiangning


================
Comment at: lib/Target/AArch64/AArch64ISelLowering.h:144-145
@@ +143,4 @@
+
+    // Vector dup
+    NEON_VDUP,
+
----------------
Tim Northover wrote:
> I think this has semantics identical to the existing NEON_DUPIMM doesn't it? A large part of the reason I pushed for that solution was that it could be shared in these instructions.
> 
> NEON_DUP is probably the best name here. "NEON" already implies that a duplicate will involve vectors and AArch64 instructions don't have a 'V' prefix.
OK. I will remove NEON_VDUP, and Kevin will add it back in his new patch.

================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:1017-1018
@@ +1016,4 @@
+                   list<dag> patterns, InstrItinClass itin>
+  : A64InstRdnm<outs, ins, asmstr, patterns, itin>
+{
+  let Inst{31} = 0b0;
----------------
Tim Northover wrote:
> Braces normally on the same line.
OK. I will reformat this one as well as similar ones.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:1569-1572
@@ -1564,3 +1569,5 @@
 def Neon_top4S : PatFrag<(ops node:$in),
                          (extract_subvector (v4i32 node:$in), (iPTR 2))>;
 
+def Neon_low8H : PatFrag<(ops node:$in),
+                         (v4i16 (extract_subvector (v8i16 node:$in),
----------------
Tim Northover wrote:
> The antonym of "top" is "bottom". I'm not overly struck on either, but we should probably try to be consistent.
OK. I will not use bottom, because it's longer. I will use High instead.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3275
@@ +3274,3 @@
+                 (outs ResVPR:$Rd), (ins ResVPR:$src, OpVPR:$Rn,
+                                         EleOpVPR:$Re, OpImm:$Index),
+                 asmop # "\t$Rd." # ResS # ", $Rn." # OpS #
----------------
Tim Northover wrote:
> The documentation still uses "Rm" for this register, doesn't it? Inventing "Re" seems confusing.
Re is different from Rm and the Rm in documentation for this instruction class is different from the Rm defined for A64Inst. Rm is being defined in llvm as Inst{20-16}, while Re is not, because Inst{20} is being shared for both Rm and bit 'M'. This is also why the index is restricted to V0-V15 for vector with H element, I think. 

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3286
@@ +3285,3 @@
+
+multiclass NI_2VE_v1<bit u, bits<4> opcode, string asmop>
+{
----------------
Tim Northover wrote:
> (Not a comment on this patch, I don't have a good solution. But we really need think about what to do with these instruction format names. Perhaps pass them through MD5 to make them more readable or something.)
I don't have solution either. I'm trying to make differences for different patterns by describing 1) argument types 2) encoding variants. The encoding variants are purely caused by encoding limitation, so it's really hard to give an even meaningful name.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3443-3444
@@ +3442,4 @@
+
+// Pattern for lane with q
+class NI_2VE_mul_laneq<Instruction INST, Operand OpImm, SDPatternOperator op,
+                       RegisterOperand OpVPR, RegisterOperand EleOpVPR,
----------------
Tim Northover wrote:
> 'q' means nothing on AArch64 NEON. It's like writing "pattern for lane with ymm".
OK. I will change "without q' to 'in 64-bit vector', and change 'with q' to '128-bit vector'.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3662
@@ +3661,3 @@
+// The followings are patterns using fma
+// -ffp-contract=fast enforces to generate fma
+
----------------
Tim Northover wrote:
> Perhaps "-ffp-contract=fast generates fma". "enforces to generate" doesn't work in English.
OK. I thought I wrote 'force' rather than 'enforce'. :-) Anyway, I will change to the wording you gave.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3986
@@ +3985,3 @@
+
+multiclass NI_qdfma<SDPatternOperator op>
+{
----------------
Tim Northover wrote:
> What's this got to do with an fma? The "f" stands for either "float" or "fused", neither of which applies here.
OK. I will remove letter 'f'.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:4000
@@ +3999,3 @@
+
+multiclass NI_2VEL_v3_qdfma_pat<string subop, string op>
+{
----------------
Tim Northover wrote:
> Likewise
Likewise

================
Comment at: test/CodeGen/AArch64/neon-2velem.ll:354-356
@@ +353,5 @@
+entry:
+  %shuffle = shufflevector <2 x float> %v, <2 x float> undef, <2 x i32> <i32 1, i32 1>
+  %mul = fmul <2 x float> %shuffle, %b
+  %add = fadd <2 x float> %mul, %a
+  ret <2 x float> %add
----------------
Tim Northover wrote:
> None of these should be forming an fmla without fp-contract=fast, should they?
Yes. I will remove the test without -ffp-contract=fast.


http://llvm-reviews.chandlerc.com/D1753



More information about the cfe-commits mailing list