[PATCH] D32993: DAGCombine: Extend createBuildVecShuffle for case len(in_vec) = 4*len(result_vec)
Zvi Rackover via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 9 01:05:38 PDT 2017
zvi added inline comments.
================
Comment at: test/CodeGen/ARM/vpadd.ll:376
; CHECK-NEXT: vmovl.u8 q8, d16
-; CHECK-NEXT: vpadd.i16 d16, d16, d17
+; CHECK-NEXT: vuzp.16 q8, q9
+; CHECK-NEXT: vadd.i16 d16, d16, d18
----------------
This appears to be a regression for ARM codegen: a single vpadd.i16 is replaced by a vuzp.16 followed by a vadd.i16. Assuming it is, what are the options for fixing it? IMHO these are the options, ordered by preference:
1. Can we improve the ARM backend to handle this case?
2. Add a TLI hook for deciding when insert-extract sequences are better than a composed shuffle?
3. Do this only in the X86 lowering.
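For reference, here is a minimal sketch of why the old and new sequences produce the same result lanes, so the change costs an extra instruction rather than correctness. It models only q8 (d16:d17) as an 8-lane array of 16-bit values. The names pairwiseAdd and unzipThenAdd are made up for illustration and are not LLVM APIs, and the contents of q9 before the vuzp are not shown in the test excerpt, so they are not modeled.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec8 = std::array<std::uint16_t, 8>; // models q8 = d16:d17
using Vec4 = std::array<std::uint16_t, 4>; // models one d register

// Models "vpadd.i16 d16, d16, d17": add each adjacent pair of lanes.
Vec4 pairwiseAdd(const Vec8 &q) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = static_cast<std::uint16_t>(q[2 * i] + q[2 * i + 1]);
  return r;
}

// Models "vuzp.16 q8, q9" followed by "vadd.i16 d16, d16, d18":
// the unzip leaves the even lanes of the old q8 in d16 and the odd
// lanes in d18, and the add then sums them lane by lane.
Vec4 unzipThenAdd(const Vec8 &q) {
  Vec4 even{}, odd{}, r{};
  for (std::size_t i = 0; i < 4; ++i) {
    even[i] = q[2 * i];     // what ends up in d16
    odd[i] = q[2 * i + 1];  // what ends up in d18
  }
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = static_cast<std::uint16_t>(even[i] + odd[i]);
  return r;
}
```

Both helpers return the same four sums for any input, which is why the new output is two instructions doing the work of one.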
https://reviews.llvm.org/D32993