[PATCH] D29370: [X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 11 05:51:28 PST 2017


RKSimon added inline comments.


================
Comment at: test/CodeGen/X86/x86-interleaved-access.ll:78
+; AVX2-NEXT:    vmulpd %ymm0, %ymm0, %ymm0
+; AVX2-NEXT:    retq
   %wide.vec = load <16 x double>, <16 x double>* %ptr, align 16
----------------
craig.topper wrote:
> RKSimon wrote:
> > Annoying but not a real problem.
> We should really look into shrinking those loads
Agreed, I'm all for splitting ymm (+ zmm?) loads if the only thing that happens to them is that some or all of their subvectors get extracted. It's a definite win for Jaguar and probably for all AVX1 targets. I don't think it's even necessary, for performance, to ensure that the split loads fold? Element extraction is a bit more tricky: there are cases where the wide load is useful.
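
To make the pattern concrete, here is a minimal sketch of the kind of IR in question. It is not taken from x86-interleaved-access.ll: the function and value names are hypothetical, and it uses the same typed-pointer syntax as the test above. The first function does one 256-bit load and only extracts its two 128-bit halves; the second is what the split form would roughly look like.

```llvm
; Hypothetical example: a ymm-sized load whose only uses are
; extractions of its two 128-bit subvectors.
define <2 x double> @add_halves_wide(<4 x double>* %ptr) {
  %wide = load <4 x double>, <4 x double>* %ptr, align 16
  %lo = shufflevector <4 x double> %wide, <4 x double> undef, <2 x i32> <i32 0, i32 1>
  %hi = shufflevector <4 x double> %wide, <4 x double> undef, <2 x i32> <i32 2, i32 3>
  %sum = fadd <2 x double> %lo, %hi
  ret <2 x double> %sum
}

; Hypothetical split form: two xmm-sized loads, no subvector extraction.
define <2 x double> @add_halves_split(<4 x double>* %ptr) {
  %p.lo = bitcast <4 x double>* %ptr to <2 x double>*
  ; The upper half starts 16 bytes (one <2 x double>) past the base.
  %p.hi = getelementptr inbounds <2 x double>, <2 x double>* %p.lo, i64 1
  %lo = load <2 x double>, <2 x double>* %p.lo, align 16
  %hi = load <2 x double>, <2 x double>* %p.hi, align 16
  %sum = fadd <2 x double> %lo, %hi
  ret <2 x double> %sum
}
```

Whether the backend should rewrite the first form into the second, and under which conditions, is exactly the open question here; the sketch only shows the two shapes being compared.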


https://reviews.llvm.org/D29370




