[all-commits] [llvm/llvm-project] cff668: [X86] Lower the cost of v4i64->v4i32 and v8i64->v8...
topperc via All-commits
all-commits at lists.llvm.org
Wed Apr 29 13:23:18 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: cff66865322e9e990808eb5a7ed7cdacefb699d7
https://github.com/llvm/llvm-project/commit/cff66865322e9e990808eb5a7ed7cdacefb699d7
Author: Craig Topper <craig.topper at intel.com>
Date: 2020-04-29 (Wed, 29 Apr 2020)
Changed paths:
M llvm/lib/Target/X86/X86TargetTransformInfo.cpp
M llvm/test/Analysis/CostModel/X86/arith-fix.ll
M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
M llvm/test/Analysis/CostModel/X86/cast.ll
M llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll
M llvm/test/Analysis/CostModel/X86/trunc.ll
Log Message:
-----------
[X86] Lower the cost of v4i64->v4i32 and v8i64->v8i32 truncate with AVX
We generate much better code for these truncates than we used to, and we use the same sequence for both AVX1 and AVX2.
For v4i64->v4i32 we generate:
  vextractf128 xmm1, ymm0, 1
  vshufps xmm0, xmm0, xmm1, 136 # xmm0 = xmm0[0,2],xmm1[0,2]
And for v8i64->v8i32 we generate:
  vperm2f128 ymm2, ymm0, ymm1, 49 # ymm2 = ymm0[2,3],ymm1[2,3]
  vinsertf128 ymm0, ymm0, xmm1, 1
  vshufps ymm0, ymm0, ymm2, 136 # ymm0 = ymm0[0,2],ymm2[0,2],ymm0[4,6],ymm2[4,6]
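
To see why these shuffles amount to a truncate, the two sequences can be modeled at the level of 32-bit lanes. The following is a minimal Python sketch, not code from the patch: every function name is hypothetical, registers are modeled as lists of dwords, and the zeroing bits of the vperm2f128 immediate are ignored because the immediate 49 does not set them. It assumes the usual little-endian layout, where the low half of each i64 sits in the even-numbered dword.

```python
MASK32 = 0xFFFFFFFF

def to_dwords(qwords):
    """Split 64-bit elements into 32-bit lanes, low half first (little-endian)."""
    lanes = []
    for q in qwords:
        lanes += [q & MASK32, (q >> 32) & MASK32]
    return lanes

def shufps_lane(a, b, imm):
    """One 128-bit lane of vshufps: two dwords picked from a, then two from b."""
    return [a[imm & 3], a[(imm >> 2) & 3], b[(imm >> 4) & 3], b[(imm >> 6) & 3]]

def vshufps_ymm(a, b, imm):
    """256-bit vshufps applies the same immediate to each 128-bit lane."""
    return shufps_lane(a[:4], b[:4], imm) + shufps_lane(a[4:], b[4:], imm)

def vperm2f128(a, b, imm):
    """Pick one 128-bit half per result half (zeroing bits 3 and 7 not modeled)."""
    halves = [a[:4], a[4:], b[:4], b[4:]]
    return halves[imm & 3] + halves[(imm >> 4) & 3]

def trunc_v4i64(v):
    ymm0 = to_dwords(v)
    xmm1 = ymm0[4:]                          # vextractf128 xmm1, ymm0, 1
    return shufps_lane(ymm0[:4], xmm1, 136)  # vshufps xmm0, xmm0, xmm1, 136

def trunc_v8i64(v):
    ymm0, ymm1 = to_dwords(v[:4]), to_dwords(v[4:])
    ymm2 = vperm2f128(ymm0, ymm1, 49)        # ymm2 = ymm0[2,3],ymm1[2,3]
    ymm0 = ymm0[:4] + ymm1[:4]               # vinsertf128 ymm0, ymm0, xmm1, 1
    return vshufps_ymm(ymm0, ymm2, 136)      # vshufps ymm0, ymm0, ymm2, 136

# Each result lane should be the low 32 bits of the matching i64 input.
values = [(0xABCD0000 + i) << 32 | (0x1000 + i) for i in range(8)]
expected = [q & MASK32 for q in values]
assert trunc_v4i64(values[:4]) == expected[:4]
assert trunc_v8i64(values) == expected
```

The immediate 136 (0b10001000) selects dwords 0 and 2 from each source within a 128-bit lane, which are exactly the low halves of the two i64 elements held there. In the v8i64 case, vperm2f128 and vinsertf128 first rearrange the 128-bit halves so that this per-lane shuffle leaves the eight results in order.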
Differential Revision: https://reviews.llvm.org/D79109