[PATCH] D24681: Optimize patterns of vectorized interleaved memory accesses for X86.

Thu Oct 6 07:38:52 PDT 2016

DavidKreitzer added a comment.

Hi Farhana,

Aside from one minor issue in the test, this looks great.

-Dave

> x86-interleaved-access.ll:25
> +; AVX-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm1[2,3],ymm3[2,3]
> +; AVX-NEXT: vunpcklpd {{.*#+}} ymm2 = ymm4[0],ymm5[0],ymm4[2],ymm5[2]
> +define <4 x double> @load_factorf64_2(<16 x double>* %ptr) {

There should be a vunpckhpd here too, right? Is there a reason you are not checking for it?
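
To illustrate, here is a rough sketch of the kind of check line I would expect after the vunpcklpd. The register numbers are not taken from the actual llc output, so they are left as FileCheck patterns; only the [1]/[3] element selection follows from what vunpckhpd does on ymm operands. Treat it as an assumption to be replaced by the regenerated check lines.

```llvm
; Hypothetical check line: registers are wildcarded, not copied from real output.
; AVX-NEXT: vunpckhpd {{.*#+}} ymm{{[0-9]+}} = ymm{{[0-9]+}}[1],ymm{{[0-9]+}}[1],ymm{{[0-9]+}}[3],ymm{{[0-9]+}}[3]
```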

https://reviews.llvm.org/D24681