[PATCH] Implement aarch64 neon instruction class SIMD copy - LLVM

Tim Northover t.p.northover at gmail.com
Tue Oct 8 06:33:10 PDT 2013


  Hi Kevin,

  Mostly I agree with James's comments. Just one extra comment and one justification.


================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:869-870
@@ -868,4 +879,2 @@
     return "AArch64ISD::NEON_TST";
-  case AArch64ISD::NEON_DUPIMM:
-    return "AArch64ISD::NEON_DUPIMM";
   case AArch64ISD::NEON_QSHLs:
----------------
You should add a case here to print out the NEON_VDUP node.

================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:4512
@@ -4456,7 +4511,3 @@
 
-defm SMOVxb_pattern : Neon_SMOVx_pattern<v16i8, v8i8, i8, neon_uimm4_bare,
-                                          neon_uimm3_bare, SMOVxb>;
-defm SMOVxh_pattern : Neon_SMOVx_pattern<v8i16, v4i16, i16, neon_uimm3_bare,
-                                          neon_uimm2_bare, SMOVxh>;
-defm SMOVxs_pattern : Neon_SMOVx_pattern<v4i32, v2i32, i32, neon_uimm2_bare,
-                                          neon_uimm1_bare, SMOVxs>;
+defm : Neon_SMOVx_pattern<v16i8, v8i8, i8, neon_uimm4_bare,
+                          neon_uimm3_bare, SMOVxb>;
----------------
James Molloy wrote:
> These pattern match SMOV nodes - would it not be better to pattern match (sext (COPY ) ) instead? That way the copy coalescer could be more effective. Same for UMOV nodes (zext (COPY ) )
They appear to be copies from lanes, and I don't think you can write a COPY node that does that.

(Well, unless we added sub8_0, ..., sub8_15 subregister indices. Even if that could be made to work, I'd worry that LLVM would start tracking v0.16b[4] separately from v0.16b[5], causing nasty partial register dependencies.)
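
To make "copies from lanes" concrete, here is a rough C++ model of what SMOV and UMOV compute for byte lanes. The type and function names (V16i8, smovB, umovB) are made up for this sketch; it only models the semantics and says nothing about how the patterns should be written.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit vector register viewed as 16 byte lanes.
using V16i8 = std::array<std::int8_t, 16>;

// SMOV Wd, Vn.B[lane]: read one lane and sign-extend it to 32 bits.
std::int32_t smovB(const V16i8 &v, unsigned lane) {
  return static_cast<std::int32_t>(v[lane]);
}

// UMOV Wd, Vn.B[lane]: read one lane and zero-extend it to 32 bits.
std::uint32_t umovB(const V16i8 &v, unsigned lane) {
  return static_cast<std::uint8_t>(v[lane]);
}
```

The lane index is an operand of the operation itself, which is why a plain whole-register COPY cannot express it without something like per-lane subregister indices.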


http://llvm-reviews.chandlerc.com/D1854


