[llvm-commits] [PATCH] AVX vmovaps + vxorps + vinsertf128 DAG combine to vmovaps
Chad Rosier
mcrosier at apple.com
Wed Dec 21 18:12:38 PST 2011
This patch adds an AVX-specific DAG combine optimization.
The following code:
#include <immintrin.h>
__m256 foo(float *f) {
  /* f must be 16-byte aligned for _mm_load_ps */
  return _mm256_castps128_ps256(_mm_load_ps(f));
}
generates this assembly:
vmovaps (%rdi), %xmm0
vxorps %ymm1, %ymm1, %ymm1
vinsertf128 $0, %xmm0, %ymm1, %ymm0
On AVX-enabled processors, the VEX-encoded 128-bit vmovaps zeroes the upper bits (255:128) of the corresponding YMM register, so the vxorps and vinsertf128 instructions are unnecessary.
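To make the equivalence concrete, here is a minimal scalar sketch in plain C. It is not part of the patch and uses no intrinsics. It models a YMM register as eight float lanes and compares the three-instruction sequence against the single load. The type ymm_model and all the model_* and sequences_agree names are hypothetical, made up for this illustration only, and the model assumes the VEX-encoded forms of the instructions.

```c
#include <string.h>

/* Hypothetical scalar model of a 256-bit YMM register:
   lanes 0-3 are the low XMM half (bits 127:0),
   lanes 4-7 are the upper half (bits 255:128). */
typedef struct { float lane[8]; } ymm_model;

/* Models VEX-encoded `vmovaps (mem), %xmm`: loads 128 bits into the low
   half and zeroes bits 255:128 of the full register. */
static void model_vmovaps_xmm_load(ymm_model *dst, const float *mem) {
    memcpy(dst->lane, mem, 4 * sizeof(float));
    memset(dst->lane + 4, 0, 4 * sizeof(float));
}

/* Models `vxorps %ymm, %ymm, %ymm` with identical sources: all bits zero. */
static void model_vxorps_self(ymm_model *dst) {
    memset(dst, 0, sizeof *dst);
}

/* Models `vinsertf128 $0, %xmm_src, %ymm_base, %ymm_dst`: the result is
   base with its low 128 bits replaced by the low 128 bits of src. */
static void model_vinsertf128_lo(ymm_model *dst, const ymm_model *src,
                                 const ymm_model *base) {
    ymm_model tmp = *base;                 /* copy first: dst may alias src */
    memcpy(tmp.lane, src->lane, 4 * sizeof(float));
    *dst = tmp;
}

/* Returns 1 when the three-instruction sequence and the single vmovaps
   leave identical contents in the modeled %ymm0, 0 otherwise. */
int sequences_agree(const float *mem) {
    ymm_model ymm0, ymm1, single;

    /* Fill with garbage so the result cannot depend on prior state. */
    memset(&ymm0, 0xAB, sizeof ymm0);
    memset(&ymm1, 0xAB, sizeof ymm1);
    memset(&single, 0xCD, sizeof single);

    /* Current output: vmovaps + vxorps + vinsertf128. */
    model_vmovaps_xmm_load(&ymm0, mem);
    model_vxorps_self(&ymm1);
    model_vinsertf128_lo(&ymm0, &ymm0, &ymm1);

    /* Output after the combine: vmovaps alone. */
    model_vmovaps_xmm_load(&single, mem);

    return memcmp(&ymm0, &single, sizeof single) == 0;
}
```

Calling sequences_agree on any four-float buffer should return 1 under this model, since the zeroing that vxorps and vinsertf128 provide is already done by the load.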
This patch implements a DAG combine that removes the unnecessary vxorps and vinsertf128 instructions. For now it only works as an enhancement to one of Bruno's DAG combines (r135727), but I plan to make it more general in the future.
Chad
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vzext_load128.patch
Type: application/octet-stream
Size: 4552 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111221/da6900e8/attachment.obj>