[PATCH] D63364: [x86] split 256-bit vector selects if operands are vector concats
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Jun 15 05:54:06 PDT 2019
RKSimon accepted this revision.
RKSimon added a comment.
This revision is now accepted and ready to land.
LGTM but there are a couple of cases that are bordering on regression that need investigating (llvm-mca comparisons, TODO comments, bug report, whatever).
@lebedev.ri The TTI costs try to include the extra costs of 256-bit integer vector ops for AVX1 but it's often tricky to completely account for it - because the costs work on an individual instruction level, many of the 'holistic' effects aren't considered at all. This is something that has made it difficult to make D46276 <https://reviews.llvm.org/D46276> actually useful - slightly better costs for individual instructions didn't help improve cost/codegen decisions for the entire sequence.
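As a toy illustration of the point about per-instruction costs (not LLVM's actual TTI interface): the sketch below sums hypothetical per-op costs and then compares that with a sequence-level estimate that also charges for each switch between split 128-bit work and full 256-bit values. The `Op` struct, both functions, the op names and every number are assumptions made up for this example; none of them come from the X86 cost tables.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one vector op. The costs are invented for
// illustration and are NOT taken from LLVM's X86 cost tables.
struct Op {
  std::string Name;
  int Cost;          // cost as a per-instruction model would report it
  bool SplitOnAVX1;  // true if AVX1 has to split this 256-bit op into 128-bit halves
};

// Per-instruction view: the total is simply the sum of the individual costs.
int perInstructionCost(const std::vector<Op> &Seq) {
  int Total = 0;
  for (const Op &O : Seq)
    Total += O.Cost;
  return Total;
}

// Sequence-level view: additionally charge a (hypothetical) penalty whenever
// neighbouring ops disagree on split vs. unsplit, standing in for the
// xmm insert/extract traffic that a per-instruction model does not see.
int sequenceCost(const std::vector<Op> &Seq, int TransitionPenalty) {
  int Total = perInstructionCost(Seq);
  for (std::size_t I = 1; I < Seq.size(); ++I)
    if (Seq[I].SplitOnAVX1 != Seq[I - 1].SplitOnAVX1)
      Total += TransitionPenalty;
  return Total;
}
```

With a sequence such as `{cmp256 (split), select (not split), add256 (split)}`, the two estimates differ by two transition penalties, even though no individual instruction cost changed.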
================
Comment at: llvm/test/CodeGen/X86/cast-vsel.ll:494
+; AVX1-NEXT: vmovaps %xmm4, dj+4112(%rax)
+; AVX1-NEXT: vmovaps %xmm5, dj+4096(%rax)
; AVX1-NEXT: addq $32, %rax
----------------
This is annoying - even though many AVX1 targets have 128-bit ALUs, we were previously avoiding xmm insertion/extraction completely, which was the better option.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63364/new/
https://reviews.llvm.org/D63364