[PATCH] Implement aarch64 neon instruction class SIMD copy - LLVM
Silviu Baranga
silviu.baranga at gmail.com
Tue Oct 8 06:57:56 PDT 2013
================
Comment at: lib/Target/AArch64/AArch64InstrNEON.td:4512
@@ -4456,7 +4511,3 @@
-defm SMOVxb_pattern : Neon_SMOVx_pattern<v16i8, v8i8, i8, neon_uimm4_bare,
- neon_uimm3_bare, SMOVxb>;
-defm SMOVxh_pattern : Neon_SMOVx_pattern<v8i16, v4i16, i16, neon_uimm3_bare,
- neon_uimm2_bare, SMOVxh>;
-defm SMOVxs_pattern : Neon_SMOVx_pattern<v4i32, v2i32, i32, neon_uimm2_bare,
- neon_uimm1_bare, SMOVxs>;
+defm : Neon_SMOVx_pattern<v16i8, v8i8, i8, neon_uimm4_bare,
+ neon_uimm3_bare, SMOVxb>;
----------------
Tim Northover wrote:
> James Molloy wrote:
> > These patterns match SMOV nodes. Would it not be better to pattern-match (sext (COPY)) instead? That way the copy coalescer could be more effective. The same goes for UMOV nodes: (zext (COPY)).
> They appear to be copies from lanes, and I don't think you can write a COPY node that does that.
>
> (Well, unless we added sub8_0, ..., sub8_15 indices. Even if you could make that work, I'd worry that LLVM would start tracking v0.16b[4] separately from v0.16b[5], causing nasty partial register dependencies.)
Also, the zext node can't appear in the pattern output, and I can't see a good way to express this in terms of copies. I think the current implementation is the best solution for now.
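
For reference, a rough C++ model of what the two instructions compute for byte lanes. This is only a sketch of the instruction semantics, not of anything in the patch: the type V16B and the functions smov_b and umov_b are hypothetical names made up for illustration. It shows that the lane index is part of the operation itself, which is the part a plain register-to-register COPY has no way to carry.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit vector register viewed as 16 byte lanes
// (the v0.16b arrangement mentioned above).
using V16B = std::array<std::uint8_t, 16>;

// Models SMOV Wd, Vn.B[lane]: read one byte lane and sign-extend it to 32 bits.
std::int32_t smov_b(const V16B &v, unsigned lane) {
  assert(lane < v.size());
  return static_cast<std::int32_t>(static_cast<std::int8_t>(v[lane]));
}

// Models UMOV Wd, Vn.B[lane]: read one byte lane and zero-extend it to 32 bits.
std::uint32_t umov_b(const V16B &v, unsigned lane) {
  assert(lane < v.size());
  return static_cast<std::uint32_t>(v[lane]);
}
```

Both functions take the lane as an explicit argument and combine the lane read with the extension in one step, which is roughly the shape the existing patterns match.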
http://llvm-reviews.chandlerc.com/D1854