[all-commits] [llvm/llvm-project] 3c050a: [CostModel] fix cost calc bug for sadd/ssub with o...
RotateRight via All-commits
all-commits at lists.llvm.org
Tue Nov 3 08:05:12 PST 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 3c050a597c599cea035537b8875774dcc48922c3
https://github.com/llvm/llvm-project/commit/3c050a597c599cea035537b8875774dcc48922c3
Author: Sanjay Patel <spatel at rotateright.com>
Date: 2020-11-03 (Tue, 03 Nov 2020)
Changed paths:
M llvm/include/llvm/CodeGen/BasicTTIImpl.h
M llvm/test/Analysis/CostModel/X86/arith-overflow.ll
M llvm/test/Analysis/CostModel/X86/arith-ssat.ll
Log Message:
-----------
[CostModel] fix cost calc bug for sadd/ssub with overflow
As noted in D90554, there's an opcode typo in a call to an easily
misused cost model API: getCmpSelInstrCost(). Beyond that, the
assumed sequence of ops is questionable, but that would be
another patch.
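
To illustrate why an API of this shape is easy to misuse, here is a minimal toy sketch. It is not LLVM code: Opcode, ToyTarget, cmpSelCost and expansionCost are hypothetical names, and the op counts and the direction of the mix-up (a select priced as a compare) are assumptions made for the example, not details taken from the patch.

```cpp
// Hypothetical stand-ins for illustration only; these are not LLVM's types.
enum class Opcode { ICmp, Select };

// Toy per-target costs for one compare and one select.
struct ToyTarget {
  unsigned ICmpCost;
  unsigned SelectCost;
};

// One entry point prices both compares and selects, keyed only by the opcode.
// Both opcodes are valid arguments, so a mix-up compiles cleanly.
unsigned cmpSelCost(const ToyTarget &T, Opcode Op) {
  return Op == Opcode::ICmp ? T.ICmpCost : T.SelectCost;
}

// Price an assumed expansion made of NumCmps compares and NumSels selects.
// When Typo is true, the selects are priced with the compare opcode instead.
unsigned expansionCost(const ToyTarget &T, unsigned NumCmps, unsigned NumSels,
                       bool Typo) {
  unsigned Cost = NumCmps * cmpSelCost(T, Opcode::ICmp);
  Cost += NumSels * cmpSelCost(T, Typo ? Opcode::ICmp : Opcode::Select);
  return Cost;
}
```

In this toy model the typo is invisible whenever a target prices compares and selects the same, and only shows up as a different total when the two costs diverge.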
My guess is that the x86 test diffs show that we are probably
wrong both before and after this change, so there will be no
practical difference.
As an example, I tried this test which shows a cost of '7'
either way:
define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) {
  %V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb)
  %ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1
  %r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0
  %z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r
  ret <4 x i32> %z
}
$ llc -o - sadd.ll -mattr=avx
vpaddd %xmm1, %xmm0, %xmm2
vpcmpgtd %xmm2, %xmm0, %xmm0
vpxor %xmm0, %xmm1, %xmm0
vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0
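
For readers following the four-instruction sequence above, this is a scalar sketch of what one 32-bit lane appears to compute: overflow is detected as (b < 0) xor (a > sum), and the result is blended with the constant 42. The names SAddResult, saddWithOverflow and saddOr42 are made up for this example, and the mapping from each instruction to a line of C++ is my reading of the assembly, not something stated in the patch.

```cpp
#include <cstdint>

// Hypothetical result type: the wrapped sum plus the overflow bit.
struct SAddResult {
  int32_t Sum;
  bool Overflow;
};

// Scalar model of one lane of the AVX sequence shown above.
SAddResult saddWithOverflow(int32_t A, int32_t B) {
  // vpaddd: wrapping 32-bit add. The add is done on unsigned values so that
  // wraparound is not undefined behavior; the cast back is modular in C++20
  // and on mainstream compilers before that.
  int32_t Sum = static_cast<int32_t>(static_cast<uint32_t>(A) +
                                     static_cast<uint32_t>(B));
  // vpcmpgtd: signed compare A > Sum.
  bool AGtSum = A > Sum;
  // vpxor: the sign bit of B xor the compare mask flags signed overflow.
  bool Overflow = (B < 0) != AGtSum;
  return {Sum, Overflow};
}

// vblendvps: pick the constant where overflow occurred, otherwise the sum.
int32_t saddOr42(int32_t A, int32_t B) {
  SAddResult R = saddWithOverflow(A, B);
  return R.Overflow ? 42 : R.Sum;
}
```

The check works because adding a non-negative b must not make the sum smaller than a, and adding a negative b must make it smaller; overflow is exactly the case where that expectation fails.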
Differential Revision: https://reviews.llvm.org/D90681