[llvm-commits] [PATCH] AVX vmovaps + vxorps + vinsertf128 DAG combine to vmovaps

Chad Rosier mcrosier at apple.com
Wed Dec 21 18:12:38 PST 2011


This patch adds an AVX-specific DAG combine optimization.

The following code:

__m256 foo(float *f) {
    return _mm256_castps128_ps256 (_mm_load_ps(f));
}

generates this assembly:

        vmovaps (%rdi), %xmm0
        vxorps  %ymm1, %ymm1, %ymm1
        vinsertf128     $0, %xmm0, %ymm1, %ymm0

On AVX-enabled processors, the VEX-encoded 128-bit vmovaps zeroes the upper bits (255:128) of the corresponding YMM register.  Therefore, the vxorps and vinsertf128 instructions are not necessary, and the single vmovaps load is sufficient.
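
To make the equivalence concrete, here is a minimal scalar sketch in plain C. It is only a model of the register semantics described above, not real AVX code and not part of the patch; the type ymm_t and all function names are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical scalar model of a 256-bit YMM register: lanes 0-3 are the
   low 128 bits (the XMM alias), lanes 4-7 are bits 255:128. */
typedef struct { float lane[8]; } ymm_t;

/* Models VEX-encoded `vmovaps (mem), %xmm`: loads four floats into the low
   half and zeroes the upper half of the containing YMM register. */
static ymm_t model_vmovaps_xmm(const float *mem) {
    ymm_t r;
    memcpy(r.lane, mem, 4 * sizeof(float));
    memset(r.lane + 4, 0, 4 * sizeof(float));
    return r;
}

/* Models `vxorps %ymm, %ymm, %ymm` with identical operands: all zero. */
static ymm_t model_vxorps_self(void) {
    ymm_t r;
    memset(&r, 0, sizeof r);
    return r;
}

/* Models `vinsertf128 $0, %xmm, %ymm, %ymm`: copies the YMM source and
   replaces its low 128 bits with the XMM source. */
static ymm_t model_vinsertf128_lo(ymm_t xmm_src, ymm_t ymm_src) {
    ymm_t r = ymm_src;
    memcpy(r.lane, xmm_src.lane, 4 * sizeof(float));
    return r;
}

/* The three-instruction sequence currently generated. */
static ymm_t three_insn_sequence(const float *f) {
    ymm_t xmm0 = model_vmovaps_xmm(f);
    ymm_t ymm1 = model_vxorps_self();
    return model_vinsertf128_lo(xmm0, ymm1);
}

/* The single load that remains after the combine. */
static ymm_t single_load(const float *f) {
    return model_vmovaps_xmm(f);
}

/* Asserts that both sequences leave the same bits in the register. */
static void check_equivalent(const float *f) {
    ymm_t a = three_insn_sequence(f);
    ymm_t b = single_load(f);
    assert(memcmp(&a, &b, sizeof a) == 0);
}
```

Calling check_equivalent on any four-float array should pass under this model, because the upper four lanes are zero in both cases and the lower four hold the loaded values.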

This patch implements a DAG combine that removes the unnecessary vxorps and vinsertf128 instructions.  Currently, it works only as an enhancement to one of Bruno's DAG combines (r135727), but I plan to make it more general in the future.

 Chad

-------------- next part --------------
A non-text attachment was scrubbed...
Name: vzext_load128.patch
Type: application/octet-stream
Size: 4552 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111221/da6900e8/attachment.obj>

