[PATCH][AArch64] implement 3 aarch64 neon instrunctions (umov smov ins) in llvm

Mon Sep 2 08:26:13 PDT 2013

Hi Kevin,

Some of these choices may have good reasons, but looking through the
code I've got some questions:

> To avoid legalizing signed-extend and zero-extend to 'and' and
> 'sign_extend_inreg', we combined them with extractelement in early combine
> stage, presenting as 'Neon_vector_extract' node.

I'm not sure I understand this bit. Why do we need to combine these
instead of directly writing patterns like:

def : Pat<(sext_inreg (extract_elt v8i8), i8), (SMOV

+  if(AArch64::FPR64RegClass.contains(SReg)) {
+    return SReg + AArch64::V0 - AArch64::D0;
+  }

We shouldn't be relying on these enum orders unless we have to. In
this case MCRegisterInfo::getMatchingSuperReg is probably what you
want.

+      if(getSubTarget().hasNEON())
+        [...]
+      else

What's the advantage of emitting  a different instruction if the core
has NEON here? The normal FP one should work regardless (and may
actually be faster since the lane doesn't have to be decoded)

+def Neon_vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT", SDT_Neon_mov_lane>;
+def Neon_vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT", SDTypeProfile<1, 3,
+                           [SDTCisVec<0>, SDTCisSameAs<0, 1>,
+                           SDTCisInt<2>, SDTCisVT<3, i32>]>>;

What's wrong with the existing extractelt and insertelt?

+//def neon_uimm3_asmoperand : AsmOperandClass

Commented out code.

+def neon_uimm0_chomp_hash : Operand<i32>

Not convinced on the name here. "Chomp" implies an active removal,
which isn't what's happening here: it should never be there in the
first place.

I think I used the adjective "bare" for the single place I came across
it before ("fmov xD, vN.d[1]" isn't actually a NEON instruction). But
I'm not wedded to that one either.

+               let Inst{14-11} = {Immn{3}, Immn{2}, Immn{1}, Immn{0}};

That's just Immn isn't it? Though I do quite like the representation
in the Immd case; nice idea!

+               let Inst{14-12} = {Immn{2}, Immn{1}, Immn{0}};

What's bit 11? Same for later ones. If it's set elsewhere, a comment
would be useful. If it's not then setting it is probably essential.

+    def _16B16 : NeonI_insert<0b1, 0b1,
+    [...]
+    def _16B8 : NeonI_insert<0b1, 0b1,

Why are these separate instructions? It seems to be some attempt to
model the VPR64/VPR128 distinction, but I don't think it actually
gains us anything. It's not like the high part of VPR128 can be used
independently really.

+             (outs VPR128:$Rd), (ins VPR128:$Rn, VPR128:$src,
+              neon_uimm4_chomp_hash:$Immd, neon_uimm4_chomp_hash:$Immn),

I think this would be more natural if the operands followed the order
in the assembly syntax.

+             [(set (v16i8 VPR128:$Rd), (v16i8 (Neon_vector_insert
+              (v16i8 VPR128:$src), (i32 (Neon_vector_extract
+              (v16i8 VPR128:$Rn), (neon_uimm4_chomp_hash:$Immn))),

The formatting makes that rather difficult to parse.

Cheers.

Tim.