[all-commits] [llvm/llvm-project] bdbbed: [X86][CostModel] Update costs for vector truncate ...
topperc via All-commits
all-commits at lists.llvm.org
Mon Apr 27 12:01:04 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: bdbbed115f87fd2700bf10249c6a63625f59a809
https://github.com/llvm/llvm-project/commit/bdbbed115f87fd2700bf10249c6a63625f59a809
Author: Craig Topper <craig.topper at intel.com>
Date: 2020-04-27 (Mon, 27 Apr 2020)
Changed paths:
M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
M llvm/test/Analysis/CostModel/X86/arith-fix.ll
M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
M llvm/test/Analysis/CostModel/X86/cast.ll
M llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll
M llvm/test/Analysis/CostModel/X86/trunc.ll
Log Message:
-----------
[X86][CostModel] Update costs for vector truncate with avx512f/avx512bw.
All avx512 truncate instructions except vXi64->vXi32 are 2 uops
on port 5, so raise their costs to 2, except where an earlier,
faster sequence such as pshufb already covers 128-bit input vectors.
Add a lower cost of 3 for v16i16->v16i8 with avx512f, where we can
extend to v16i32 and then truncate, and a cost of 2 with avx512bw,
with or without avx512vl, where we can use vpmovwb with either a ymm
or zmm input. Both of these beat masking, splitting, and using
packuswb, which is our avx/avx2 codegen.
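
As a rough illustration of the two v16i16->v16i8 costs described above, here is
a minimal, self-contained sketch of a feature-ordered cost lookup. It is not the
code from X86TargetTransformInfo.cpp: the names Isa, VecTy, TruncCostEntry,
TruncCosts, NoEntry and lookupTruncCost are hypothetical, the feature levels are
simplified to a linear order, and only the two entries the log message states
explicitly are included.

```cpp
// Hypothetical feature levels, ordered from least to most capable.
// Assumption: a linear order is a simplification of real X86 feature sets.
enum class Isa { AVX2, AVX512F, AVX512BW };

// Hypothetical vector type description: element count and element width in bits.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

inline bool sameType(VecTy A, VecTy B) {
  return A.NumElts == B.NumElts && A.EltBits == B.EltBits;
}

// One row of an illustrative truncate cost table.
struct TruncCostEntry {
  Isa MinIsa;    // least capable feature level at which this row applies
  VecTy Dst;     // destination vector type
  VecTy Src;     // source vector type
  unsigned Cost; // cost value taken from the log message
};

// Rows are listed from most to least capable feature level, so the first
// matching row is the best sequence available.
constexpr TruncCostEntry TruncCosts[] = {
    {Isa::AVX512BW, {16, 8}, {16, 16}, 2}, // vpmovwb with a ymm or zmm input
    {Isa::AVX512F, {16, 8}, {16, 16}, 3},  // extend to v16i32, then truncate
};

// Sentinel meaning "no row here; fall back to other tables" (not modeled).
constexpr int NoEntry = -1;

// Return the cost of truncating Src to Dst at the given feature level,
// or NoEntry when this sketch has no matching row.
inline int lookupTruncCost(Isa Available, VecTy Dst, VecTy Src) {
  for (const TruncCostEntry &E : TruncCosts) {
    if (Available >= E.MinIsa && sameType(E.Dst, Dst) && sameType(E.Src, Src))
      return static_cast<int>(E.Cost);
  }
  return NoEntry;
}
```

In this sketch, lookupTruncCost(Isa::AVX512BW, {16, 8}, {16, 16}) picks the
vpmovwb row, while the same query at Isa::AVX512F falls through to the
extend-then-truncate row. The avx/avx2 packuswb sequence is deliberately left
out, since the log message does not give its cost.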