[PATCH] Implement aarch64 neon instruction class AdvSIMD (by element) - LLVM
Tim Northover
t.p.northover at gmail.com
Thu Sep 26 09:00:22 PDT 2013
Hi Jiangning,
Quite a lot of comments, I'm afraid, but mostly just me moaning about names (which probably makes it worse). I think the important ones are:
1. Why are NEON_DUPIMM and NEON_VDUP separate? Actually, why is NEON_VDUP even in this patch since you don't use it?
2. I don't think we should be forming fma without fp-contract=fast but you seem to be checking it in your tests. That's worrying.
Cheers.
Tim.
================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:1017-1018
@@ +1016,4 @@
+ list<dag> patterns, InstrItinClass itin>
+ : A64InstRdnm<outs, ins, asmstr, patterns, itin>
+{
+ let Inst{31} = 0b0;
----------------
Braces normally on the same line.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:1569-1572
@@ -1564,3 +1569,5 @@
def Neon_top4S : PatFrag<(ops node:$in),
(extract_subvector (v4i32 node:$in), (iPTR 2))>;
+def Neon_low8H : PatFrag<(ops node:$in),
+ (v4i16 (extract_subvector (v8i16 node:$in),
----------------
The antonym of "top" is "bottom". I'm not overly struck on either, but we should probably try to be consistent.
================
Comment at: lib/Target/AArch64/AArch64ISelLowering.h:144-145
@@ +143,4 @@
+
+ // Vector dup
+ NEON_VDUP,
+
----------------
I think this has semantics identical to the existing NEON_DUPIMM doesn't it? A large part of the reason I pushed for that solution was that it could be shared in these instructions.
NEON_DUP is probably the best name here. "NEON" already implies that a duplicate will involve vectors and AArch64 instructions don't have a 'V' prefix.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3286
@@ +3285,3 @@
+
+multiclass NI_2VE_v1<bit u, bits<4> opcode, string asmop>
+{
----------------
(Not a comment on this patch, I don't have a good solution. But we really need think about what to do with these instruction format names. Perhaps pass them through MD5 to make them more readable or something.)
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3443-3444
@@ +3442,4 @@
+
+// Pattern for lane with q
+class NI_2VE_mul_laneq<Instruction INST, Operand OpImm, SDPatternOperator op,
+ RegisterOperand OpVPR, RegisterOperand EleOpVPR,
----------------
'q' means nothing on AArch64 NEON. It's like writing "pattern for lane with ymm".
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3662
@@ +3661,3 @@
+// The followings are patterns using fma
+// -ffp-contract=fast enforces to generate fma
+
----------------
Perhaps "-ffp-contract=fast generates fma". "enforces to generate" doesn't work in English.
================
Comment at: test/CodeGen/AArch64/neon-2velem.ll:354-356
@@ +353,5 @@
+entry:
+ %shuffle = shufflevector <2 x float> %v, <2 x float> undef, <2 x i32> <i32 1, i32 1>
+ %mul = fmul <2 x float> %shuffle, %b
+ %add = fadd <2 x float> %mul, %a
+ ret <2 x float> %add
----------------
None of these should be forming an fmla without fp-contract=fast, should they?
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3275
@@ +3274,3 @@
+ (outs ResVPR:$Rd), (ins ResVPR:$src, OpVPR:$Rn,
+ EleOpVPR:$Re, OpImm:$Index),
+ asmop # "\t$Rd." # ResS # ", $Rn." # OpS #
----------------
The documentation still uses "Rm" for this register, doesn't it? Inventing "Re" seems confusing.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:3986
@@ +3985,3 @@
+
+multiclass NI_qdfma<SDPatternOperator op>
+{
----------------
What's this got to do with an fma? The "f" stands for either "float" or "fused", neither of which applies here.
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:4000
@@ +3999,3 @@
+
+multiclass NI_2VEL_v3_qdfma_pat<string subop, string op>
+{
----------------
Likewise
http://llvm-reviews.chandlerc.com/D1753
More information about the cfe-commits
mailing list