[PATCH] D24681: Optimize patterns of vectorized interleaved memory accesses for X86.
David Kreitzer via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 6 07:38:52 PDT 2016
DavidKreitzer added a comment.
Hi Farhana,
Aside from one minor issue in the test, this looks great.
-Dave
> x86-interleaved-access.ll:25
> +; AVX-NEXT: vperm2f128 {{.*#+}} ymm1 = ymm1[2,3],ymm3[2,3]
> +; AVX-NEXT: vunpcklpd {{.*#+}} ymm2 = ymm4[0],ymm5[0],ymm4[2],ymm5[2]
> +define <4 x double> @load_factorf64_2(<16 x double>* %ptr) {
There should be a vunpckhpd here too, right? Is there a reason you are not checking for it?
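For reference, here is a rough lane-level model of why a vunpckhpd would normally appear next to the vunpcklpd in a stride-4 deinterleave of 16 doubles. It is only a sketch: the Python helper names are made up for illustration, the shuffle sequence is one plausible lowering rather than necessarily the one this patch emits, and whether this particular test needs the high unpack depends on which strided vectors the function uses.

```python
# Lane-level models of the 256-bit AVX shuffles, on lists of 4 doubles.
# Helper names are hypothetical; the lane behaviour follows the ISA definitions.

def vunpcklpd(a, b):
    # Takes the low double of each 128-bit lane from a and b.
    return [a[0], b[0], a[2], b[2]]

def vunpckhpd(a, b):
    # Takes the high double of each 128-bit lane from a and b.
    return [a[1], b[1], a[3], b[3]]

def vperm2f128_low(a, b):
    # Equivalent to "a[0,1],b[0,1]": low 128-bit lanes of both sources.
    return [a[0], a[1], b[0], b[1]]

def vperm2f128_high(a, b):
    # Equivalent to "a[2,3],b[2,3]": high 128-bit lanes of both sources.
    return [a[2], a[3], b[2], b[3]]

def deinterleave4(mem):
    """Split 16 interleaved doubles into 4 stride-4 vectors (assumed lowering)."""
    r0, r1, r2, r3 = (mem[i:i + 4] for i in range(0, 16, 4))
    lo02 = vperm2f128_low(r0, r2)
    lo13 = vperm2f128_low(r1, r3)
    hi02 = vperm2f128_high(r0, r2)
    hi13 = vperm2f128_high(r1, r3)
    return [
        vunpcklpd(lo02, lo13),  # elements 0, 4, 8, 12
        vunpckhpd(lo02, lo13),  # elements 1, 5, 9, 13
        vunpcklpd(hi02, hi13),  # elements 2, 6, 10, 14
        vunpckhpd(hi02, hi13),  # elements 3, 7, 11, 15
    ]
```

In this model the even-numbered strided vectors come from vunpcklpd and the odd-numbered ones from vunpckhpd, so any function that uses an odd-numbered vector should produce a vunpckhpd for the test to check.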
https://reviews.llvm.org/D24681