[PATCH] [X86, AVX] improve insertion into zero element of 256-bit vector
Sanjay Patel
spatel at rotateright.com
Wed Mar 25 10:06:37 PDT 2015
Hi andreadb, qcolombet, RKSimon,
This patch allows AVX blend instructions to handle insertion into the low element of a 256-bit vector for the appropriate data types.
For f32, instead of:

  vblendps $1, %xmm1, %xmm0, %xmm1    ## xmm1 = xmm1[0],xmm0[1,2,3]
  vblendps $15, %ymm1, %ymm0, %ymm0   ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

  vblendps $1, %ymm1, %ymm0, %ymm0    ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

  vmovsd %xmm1, %xmm0, %xmm1          ## xmm1 = xmm1[0],xmm0[1]
  vblendpd $3, %ymm1, %ymm0, %ymm0    ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

  vblendpd $1, %ymm1, %ymm0, %ymm0    ## ymm0 = ymm1[0],ymm0[1,2,3]
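
To see why the single 256-bit blend is equivalent to each two-instruction sequence above, here is a rough lane-level model in Python. It is only a sketch and not part of the patch: the helper names (blend, movsd, widen) and the symbolic lane labels are made up for illustration, and the model assumes the usual behavior that a VEX-encoded 128-bit operation zeroes the upper half of the destination ymm register.

```python
def blend(imm, src1, src2):
    """Model of vblendps/vblendpd: lane i comes from src2 if bit i of imm is set, else from src1.

    In the AT&T syntax used above (imm, src2, src1, dest), src2 is the first register operand.
    """
    return [b if (imm >> i) & 1 else a
            for i, (a, b) in enumerate(zip(src1, src2))]


def movsd(src1, src2):
    """Model of the register form of vmovsd: low lane from src2, high lane from src1."""
    return [src2[0], src1[1]]


def widen(xmm, lanes):
    """Assumption: a VEX-encoded 128-bit op zeroes the upper lanes of the ymm register."""
    return xmm + [0] * (lanes - len(xmm))


# f32: ymm0 is the vector being inserted into, ymm1 holds the scalar "s" in lane 0.
ymm0 = [f"v{i}" for i in range(8)]
ymm1 = ["s"] + [f"u{i}" for i in range(1, 8)]

# Old: 128-bit blend into xmm1, then a 256-bit blend of the low four lanes.
xmm1 = blend(0b1, ymm0[:4], ymm1[:4])
old_f32 = blend(0b1111, ymm0, widen(xmm1, 8))
# New: a single 256-bit blend of lane 0 only.
new_f32 = blend(0b1, ymm0, ymm1)

# f64: same idea with four 64-bit lanes.
ymm0d = [f"v{i}" for i in range(4)]
ymm1d = ["s", "u1", "u2", "u3"]

# Old: vmovsd into xmm1, then a 256-bit blend of the low two lanes.
xmm1d = movsd(ymm0d[:2], ymm1d[:2])
old_f64 = blend(0b11, ymm0d, widen(xmm1d, 4))
# New: a single 256-bit blend of lane 0 only.
new_f64 = blend(0b1, ymm0d, ymm1d)

print(old_f32 == new_f32, old_f64 == new_f64)
```

In both cases the old and new sequences produce the same lanes: the scalar lands in element 0 and every other element of ymm0 is preserved.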
For the integer data types, which the hardware neglects (AVX provides no 256-bit integer blend instructions), I left a TODO comment in the code and added regression tests for a follow-on patch.
http://reviews.llvm.org/D8609
Files:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx-insertelt.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D8609.22655.patch
Type: text/x-patch
Size: 4335 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150325/e41e74f4/attachment.bin>