[all-commits] [llvm/llvm-project] 36e3c6: [X86][AVX] Truncate vectors with PACKSS/PACKUS on ...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Thu Mar 25 03:35:16 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 36e3c6c841eb7afa417fea4f3357c48cd1bf0583
https://github.com/llvm/llvm-project/commit/36e3c6c841eb7afa417fea4f3357c48cd1bf0583
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2021-03-25 (Thu, 25 Mar 2021)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/masked_store_trunc.ll
M llvm/test/CodeGen/X86/psubus.ll
M llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
M llvm/test/CodeGen/X86/vector-reduce-or-bool.ll
M llvm/test/CodeGen/X86/vector-reduce-xor-bool.ll
M llvm/test/CodeGen/X86/vector-trunc-math.ll
M llvm/test/CodeGen/X86/vector-trunc.ll
Log Message:
-----------
[X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets
Until AVX512 we don't have any vector truncation instructions, and always lower using shuffles instead.
combineVectorTruncation performs this earlier than lowering, as this makes it easier to recognise when the bits being truncated away are sign/zero-extended, in which case PACKSS/PACKUS can perform the shuffle.
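To illustrate why sign/zero-extended upper bits matter, here is a minimal scalar sketch of a single 32-bit to 16-bit element as handled by PACKUSDW and PACKSSDW. The helper names (packus_elem, packss_elem, trunc_elem) are hypothetical and are not taken from X86ISelLowering.cpp; the code only models the saturating behaviour of the instructions, not the DAG combine itself.

```cpp
#include <cstdint>

// Scalar model of one PACKUSDW element: the input is treated as a signed
// 32-bit value and saturated to the unsigned 16-bit range [0, 65535].
inline uint16_t packus_elem(int32_t v) {
  if (v < 0) return 0;
  if (v > 0xFFFF) return 0xFFFF;
  return static_cast<uint16_t>(v);
}

// Scalar model of one PACKSSDW element: the input is saturated to the
// signed 16-bit range [-32768, 32767].
inline int16_t packss_elem(int32_t v) {
  if (v < INT16_MIN) return INT16_MIN;
  if (v > INT16_MAX) return INT16_MAX;
  return static_cast<int16_t>(v);
}

// Plain truncation: keep only the low 16 bits.
inline uint16_t trunc_elem(uint32_t v) {
  return static_cast<uint16_t>(v & 0xFFFFu);
}
```

When the upper 16 bits are known to be zero, packus_elem never saturates and matches trunc_elem; when the upper bits are copies of the sign bit, packss_elem matches it bit for bit. Otherwise the pack saturates and is not a valid replacement for a truncation.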
We currently don't attempt to use combineVectorTruncation on AVX2 targets because, in the past, 256-bit PACKSS/PACKUS tended to cause 128-bit lane shuffle regressions. These should now all be resolved by combineHorizOpWithShuffle, and in all cases we now reduce the amount of cross-lane shuffling and variable shuffle mask usage.
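The 128-bit lane issue mentioned above can be sketched with a scalar model of the 256-bit VPACKUSDW, which packs each 128-bit lane independently and so interleaves the two sources. The type and function names below (V8i32, V16u16, packus256, permq_0213) are hypothetical, and the [0,2,1,3] permute of 64-bit chunks is shown only as one plausible way to restore element order; it is not claimed to be the exact sequence this commit emits.

```cpp
#include <array>
#include <cstdint>

using V8i32 = std::array<int32_t, 8>;
using V16u16 = std::array<uint16_t, 16>;

// Unsigned saturation of a signed 32-bit value to 16 bits.
inline uint16_t sat_u16(int32_t v) {
  if (v < 0) return 0;
  if (v > 0xFFFF) return 0xFFFF;
  return static_cast<uint16_t>(v);
}

// Scalar model of 256-bit VPACKUSDW: within each 128-bit lane, four
// elements of a are followed by four elements of b.
inline V16u16 packus256(const V8i32 &a, const V8i32 &b) {
  V16u16 out{};
  for (int lane = 0; lane < 2; ++lane) {
    for (int i = 0; i < 4; ++i) {
      out[lane * 8 + i] = sat_u16(a[lane * 4 + i]);
      out[lane * 8 + 4 + i] = sat_u16(b[lane * 4 + i]);
    }
  }
  return out;
}

// Permute the four 64-bit chunks of the result in the order [0,2,1,3],
// which undoes the per-lane interleaving of packus256.
inline V16u16 permq_0213(const V16u16 &v) {
  static const int order[4] = {0, 2, 1, 3};
  V16u16 out{};
  for (int q = 0; q < 4; ++q)
    for (int i = 0; i < 4; ++i)
      out[q * 4 + i] = v[order[q] * 4 + i];
  return out;
}
```

In this model, packing the low and high halves of a 16 x i32 vector yields the chunks in the order a[0..3], b[0..3], a[4..7], b[4..7], so a single fixed cross-lane permute is enough to recover the in-order truncation.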
Differential Revision: https://reviews.llvm.org/D96609