[PATCH] generate extract_subvector node to avoid disastrous shuffle vector codegen
Sanjay Patel
spatel at rotateright.com
Thu Dec 11 11:27:06 PST 2014
Hi mkuper, chandlerc, andreadb, hfinkel,
This is a partial fix for PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ). When we extract multiple consecutive elements from a vector to create a build_vector, we should try to form an extract_subvector instead of relying solely on getVectorShuffle().
The difference in output for the simplest v4f64 test case looks like this:
vextractf128 $1, %ymm0, %xmm0
vpermilpd $1, %xmm0, %xmm1 ## xmm1 = xmm0[1,0]
vunpcklpd %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm1[0]
vmovapd %xmm0, (%rdi)
vzeroupper
retq
Becomes:
vextractf128 $1, %ymm0, (%rdi)
vzeroupper
retq
We should still fix the shuffle problem in the x86 backend, but I thought it was best to solve the higher-level problem first. There's also a bug in the x86 backend when lowering an EXTRACT_SUBVECTOR node with an arbitrary index, so I've limited this patch to firing on the (most common?) case of half-vector extractions. This pattern shows up on Sandy Bridge in particular, because it cracks 32-byte memops in half, which causes mismatches in vector sizes.
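To make the half-vector restriction concrete, here is a minimal standalone sketch of the kind of check described above. It is not the code from the patch and does not use any SelectionDAG API: the ExtractedElt struct, the matchHalfVectorExtract name, and the -1 "no match" convention are all hypothetical and exist only for illustration. It assumes each build_vector operand has already been recognized as an extract of one lane from some source vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one build_vector operand: an element extract
// that reads lane `Index` of the source vector identified by `SourceId`.
struct ExtractedElt {
  int SourceId;
  int Index;
};

// Returns the starting lane (0 or NumSrcElts / 2) if the operands read
// exactly half of one source vector, in order, starting at a half-vector
// boundary. Returns -1 if the pattern does not match.
int matchHalfVectorExtract(const std::vector<ExtractedElt> &Ops,
                           int NumSrcElts) {
  assert(NumSrcElts > 0 && NumSrcElts % 2 == 0 &&
         "source vector must have a positive, even lane count");
  const int Half = NumSrcElts / 2;
  if (static_cast<int>(Ops.size()) != Half)
    return -1;

  // Only the low half (lane 0) or the high half (lane Half) qualifies;
  // arbitrary starting lanes are deliberately rejected.
  const int Start = Ops.front().Index;
  if (Start != 0 && Start != Half)
    return -1;

  // Every operand must come from the same source, in consecutive lanes.
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    if (Ops[I].SourceId != Ops.front().SourceId ||
        Ops[I].Index != Start + static_cast<int>(I))
      return -1;
  }
  return Start;
}
```

For the v4f64 case above, operands reading lanes 2 and 3 of the same source would match with a starting lane of 2, which corresponds to the upper 128-bit half that vextractf128 $1 selects. Operands reading lanes 1 and 2 are consecutive but would be rejected, since they do not start on a half-vector boundary.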
http://reviews.llvm.org/D6622
Files:
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
test/CodeGen/X86/vec_extract-avx.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D6622.17183.patch
Type: text/x-patch
Size: 6783 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141211/3d1d4f7c/attachment.bin>