<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Philip,</div><div>Thanks for letting me know. For reference, here are possibly relevant changes to CGP that came after this commit:</div><div><a href="https://reviews.llvm.org/D58995" target="_blank">https://reviews.llvm.org/D58995</a> (rL355512)</div><div><a href="https://reviews.llvm.org/D59139" target="_blank">https://reviews.llvm.org/D59139</a> (rL355751)<br></div><div><a href="https://reviews.llvm.org/D59696" target="_blank">https://reviews.llvm.org/D59696</a> (rL356937)</div><div><a href="https://reviews.llvm.org/D59889" target="_blank">https://reviews.llvm.org/D59889</a> (rL357111)<br></div><div><br></div></div></div></div></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 23, 2019 at 11:13 AM Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Sanjay,</p>
<p>We are also seeing fallout from this change. We have a
relatively widely felt compile-time regression that appears to be
triggered by it. The operating theory I've heard is that
the use of the dom tree is forcing many more rebuilds of a previously
invalidated tree. Yevgeny (CCd) can provide more information;
he's worked around the problem in our downstream tree and can
share his analysis. <br>
</p>
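<p>To make the operating theory concrete, here is a minimal, self-contained sketch of why invalidating a cached analysis after every transform multiplies the number of rebuilds. It is only an illustration of the theory: it is not LLVM code and not the downstream workaround, and <code>FakeDomTree</code>, <code>LazyDomTree</code>, and <code>countBuilds</code> are hypothetical names that stand in for the real dominator tree and its caller.</p>

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for an expensive analysis such as a dominator tree.
// It only counts how many times it has been constructed.
struct FakeDomTree {
  static std::size_t Builds;
  FakeDomTree() { ++Builds; }
};
std::size_t FakeDomTree::Builds = 0;

// Builds the analysis on first use and drops it when a transform reports
// that the cached result may be stale.
class LazyDomTree {
  std::unique_ptr<FakeDomTree> DT;

public:
  FakeDomTree &get() {
    if (!DT)
      DT = std::make_unique<FakeDomTree>();
    return *DT;
  }
  void invalidate() { DT.reset(); }
};

// Simulates NumQueries dominance queries. When InvalidateEachTime is true,
// every query is followed by an invalidation, so the next query rebuilds.
std::size_t countBuilds(std::size_t NumQueries, bool InvalidateEachTime) {
  FakeDomTree::Builds = 0;
  LazyDomTree Cache;
  for (std::size_t I = 0; I < NumQueries; ++I) {
    Cache.get();
    if (InvalidateEachTime)
      Cache.invalidate();
  }
  return FakeDomTree::Builds;
}
```

<p>In this model, <code>countBuilds(100, false)</code> constructs the tree once, while <code>countBuilds(100, true)</code> constructs it once per query. Whether the real regression follows this shape is exactly what the analysis would need to confirm.</p>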
<p>Philip<br>
</p>
<div class="gmail-m_-8983031789758981807gmail-m_8370979458647442368moz-cite-prefix">On 3/13/19 9:54 PM, Teresa Johnson via
llvm-commits wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">Hi Sanjay,
<div><br>
</div>
<div>Unfortunately, we are having some additional problems with
this patch. One is a compiler assertion. It goes away after
r355823, but since that patch only added a heuristic guard on
the transformation, the assertion is likely just hidden. I
filed <a href="https://bugs.llvm.org/show_bug.cgi?id=41064" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=41064</a>
for that one.</div>
<div><br>
</div>
<div>The other is a performance slowdown. Carrot, who is copied
here, can send you more info about that.</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Teresa</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Feb 18, 2019 at 3:32
PM Sanjay Patel via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Author:
spatel<br>
Date: Mon Feb 18 15:33:05 2019<br>
New Revision: 354298<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=354298&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=354298&view=rev</a><br>
Log:<br>
[CGP] form usub with overflow from sub+icmp<br>
<br>
The motivating x86 cases for forming the intrinsic are shown
in PR31754 and PR40487:<br>
<a href="https://bugs.llvm.org/show_bug.cgi?id=31754" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=31754</a><br>
<a href="https://bugs.llvm.org/show_bug.cgi?id=40487" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=40487</a><br>
..and those are shown in the IR test file and x86 codegen
file.<br>
<br>
Matching the usubo pattern is harder than uaddo because we
have 2 independent values rather than a def-use.<br>
<br>
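<p>As an aside on the arithmetic behind the two matched forms, the following is a minimal sketch that checks them exhaustively on 8-bit values. It is not LLVM code: <code>usubo8</code> is a hypothetical reference model of the intrinsic, and <code>checkUSubOPatterns</code> is a hypothetical helper written only for this check.</p>

```cpp
#include <cstdint>

// Hypothetical reference model (not an LLVM API) of usub.with.overflow on i8:
// subtract in a wider signed type, then report the wrapped result and whether
// the true difference was negative (a borrow occurred).
struct USubResult {
  std::uint8_t Math;
  bool Overflow;
};

USubResult usubo8(std::uint8_t A, std::uint8_t B) {
  int Wide = static_cast<int>(A) - static_cast<int>(B);
  return {static_cast<std::uint8_t>(Wide), Wide < 0};
}

// Checks every pair of 8-bit inputs against the model.
bool checkUSubOPatterns() {
  for (unsigned I = 0; I < 256; ++I) {
    std::uint8_t A = static_cast<std::uint8_t>(I);
    for (unsigned J = 0; J < 256; ++J) {
      std::uint8_t B = static_cast<std::uint8_t>(J);
      USubResult R = usubo8(A, B);

      // Form 1: (sub A, B) paired with (icmp ult A, B).
      std::uint8_t Sub = static_cast<std::uint8_t>(A - B);
      if (Sub != R.Math || (A < B) != R.Overflow)
        return false;

      // Form 2: (add A, -B) paired with (icmp ult A, B), the canonicalized
      // shape of a subtract by a constant. Only the math result differs in
      // how it is computed; the compare is the same as in form 1.
      std::uint8_t NegB = static_cast<std::uint8_t>(0u - B);
      std::uint8_t Add = static_cast<std::uint8_t>(A + NegB);
      if (Add != R.Math)
        return false;
    }
    // Special case: (A == 0) is the same overflow condition as (A u< 1).
    if ((A == 0) != usubo8(A, 1).Overflow)
      return false;
  }
  return true;
}
```

<p>Calling <code>checkUSubOPatterns()</code> returns true when every input pair agrees with the model. The check says nothing about profitability or dominance, which the patch handles separately.</p>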
This adds a TLI hook that should preserve the existing
behavior for uaddo formation, but disables usubo<br>
formation by default. Only x86 overrides that setting for now
although other targets will likely benefit<br>
by forming usubo too.<br>
<br>
Differential Revision: <a href="https://reviews.llvm.org/D57789" rel="noreferrer" target="_blank">https://reviews.llvm.org/D57789</a><br>
<br>
Modified:<br>
llvm/trunk/include/llvm/CodeGen/TargetLowering.h<br>
llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp<br>
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
llvm/trunk/lib/Target/X86/X86ISelLowering.h<br>
llvm/trunk/test/CodeGen/X86/cgp-usubo.ll<br>
llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll<br>
llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll<br>
<br>
Modified: llvm/trunk/include/llvm/CodeGen/TargetLowering.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/TargetLowering.h?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/TargetLowering.h?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/include/llvm/CodeGen/TargetLowering.h
(original)<br>
+++ llvm/trunk/include/llvm/CodeGen/TargetLowering.h Mon Feb
18 15:33:05 2019<br>
@@ -2439,6 +2439,23 @@ public:<br>
return false;<br>
}<br>
<br>
+ /// Try to convert math with an overflow comparison into
the corresponding DAG<br>
+ /// node operation. Targets may want to override this
independently of whether<br>
+ /// the operation is legal/custom for the given type
because it may obscure<br>
+ /// matching of other patterns.<br>
+ virtual bool shouldFormOverflowOp(unsigned Opcode, EVT VT)
const {<br>
+ // TODO: The default logic is inherited from code in
CodeGenPrepare.<br>
+ // The opcode should not make a difference by default?<br>
+ if (Opcode != ISD::UADDO)<br>
+ return false;<br>
+<br>
+ // Allow the transform as long as we have an integer type
that is not<br>
+ // obviously illegal and unsupported.<br>
+ if (VT.isVector())<br>
+ return false;<br>
+ return VT.isSimple() || !isOperationExpand(Opcode, VT);<br>
+ }<br>
+<br>
// Return true if it is profitable to use a scalar input to
a BUILD_VECTOR<br>
// even if the vector itself has multiple uses.<br>
virtual bool aggressivelyPreferBuildVectorSources(EVT
VecVT) const {<br>
<br>
Modified: llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp (original)<br>
+++ llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp Mon Feb 18
15:33:05 2019<br>
@@ -1162,9 +1162,18 @@ static bool
OptimizeNoopCopyExpression(C<br>
static void replaceMathCmpWithIntrinsic(BinaryOperator *BO,
CmpInst *Cmp,<br>
Instruction
*InsertPt,<br>
Intrinsic::ID IID) {<br>
+ Value *Arg0 = BO->getOperand(0);<br>
+ Value *Arg1 = BO->getOperand(1);<br>
+<br>
+ // We allow matching the canonical IR (add X, C) back to
(usubo X, -C).<br>
+ if (BO->getOpcode() == Instruction::Add &&<br>
+ IID == Intrinsic::usub_with_overflow) {<br>
+ assert(isa<Constant>(Arg1) && "Unexpected
input for usubo");<br>
+ Arg1 = ConstantExpr::getNeg(cast<Constant>(Arg1));<br>
+ }<br>
+<br>
IRBuilder<> Builder(InsertPt);<br>
- Value *MathOV = Builder.CreateBinaryIntrinsic(IID,
BO->getOperand(0),<br>
-
BO->getOperand(1));<br>
+ Value *MathOV = Builder.CreateBinaryIntrinsic(IID, Arg0,
Arg1);<br>
Value *Math = Builder.CreateExtractValue(MathOV, 0,
"math");<br>
Value *OV = Builder.CreateExtractValue(MathOV, 1, "ov");<br>
BO->replaceAllUsesWith(Math);<br>
@@ -1182,13 +1191,8 @@ static bool
combineToUAddWithOverflow(Cm<br>
if (!match(Cmp, m_UAddWithOverflow(m_Value(A), m_Value(B),
m_BinOp(Add))))<br>
return false;<br>
<br>
- // Allow the transform as long as we have an integer type
that is not<br>
- // obviously illegal and unsupported.<br>
- Type *Ty = Add->getType();<br>
- if (!isa<IntegerType>(Ty))<br>
- return false;<br>
- EVT CodegenVT = TLI.getValueType(DL, Ty);<br>
- if (!CodegenVT.isSimple() &&
TLI.isOperationExpand(ISD::UADDO, CodegenVT))<br>
+ if (!TLI.shouldFormOverflowOp(ISD::UADDO,<br>
+ TLI.getValueType(DL,
Add->getType())))<br>
return false;<br>
<br>
// We don't want to move around uses of condition values
this late, so we<br>
@@ -1210,6 +1214,64 @@ static bool
combineToUAddWithOverflow(Cm<br>
return true;<br>
}<br>
<br>
+static bool combineToUSubWithOverflow(CmpInst *Cmp, const
TargetLowering &TLI,<br>
+ const DataLayout
&DL, bool &ModifiedDT) {<br>
+ // Convert (A u> B) to (A u< B) to simplify pattern
matching.<br>
+ Value *A = Cmp->getOperand(0), *B =
Cmp->getOperand(1);<br>
+ ICmpInst::Predicate Pred = Cmp->getPredicate();<br>
+ if (Pred == ICmpInst::ICMP_UGT) {<br>
+ std::swap(A, B);<br>
+ Pred = ICmpInst::ICMP_ULT;<br>
+ }<br>
+ // Convert special-case: (A == 0) is the same as (A u<
1).<br>
+ if (Pred == ICmpInst::ICMP_EQ && match(B,
m_ZeroInt())) {<br>
+ B = ConstantInt::get(B->getType(), 1);<br>
+ Pred = ICmpInst::ICMP_ULT;<br>
+ }<br>
+ if (Pred != ICmpInst::ICMP_ULT)<br>
+ return false;<br>
+<br>
+ // Walk the users of a variable operand of a compare
looking for a subtract or<br>
+ // add with that same operand. Also match the 2nd operand
of the compare to<br>
+ // the add/sub, but that may be a negated constant operand
of an add.<br>
+ Value *CmpVariableOperand = isa<Constant>(A) ? B : A;<br>
+ BinaryOperator *Sub = nullptr;<br>
+ for (User *U : CmpVariableOperand->users()) {<br>
+ // A - B, A u< B --> usubo(A, B)<br>
+ if (match(U, m_Sub(m_Specific(A), m_Specific(B)))) {<br>
+ Sub = cast<BinaryOperator>(U);<br>
+ break;<br>
+ }<br>
+<br>
+ // A + (-C), A u< C (canonicalized form of (sub A, C))<br>
+ const APInt *CmpC, *AddC;<br>
+ if (match(U, m_Add(m_Specific(A), m_APInt(AddC)))
&&<br>
+ match(B, m_APInt(CmpC)) && *AddC == -(*CmpC))
{<br>
+ Sub = cast<BinaryOperator>(U);<br>
+ break;<br>
+ }<br>
+ }<br>
+ if (!Sub)<br>
+ return false;<br>
+<br>
+ if (!TLI.shouldFormOverflowOp(ISD::USUBO,<br>
+ TLI.getValueType(DL,
Sub->getType())))<br>
+ return false;<br>
+<br>
+ // Pattern matched and profitability checked. Check
dominance to determine the<br>
+ // insertion point for an intrinsic that replaces the
subtract and compare.<br>
+ DominatorTree DT(*Sub->getFunction());<br>
+ bool SubDominates = DT.dominates(Sub, Cmp);<br>
+ if (!SubDominates && !DT.dominates(Cmp, Sub))<br>
+ return false;<br>
+ Instruction *InPt = SubDominates ?
cast<Instruction>(Sub)<br>
+ :
cast<Instruction>(Cmp);<br>
+ replaceMathCmpWithIntrinsic(Sub, Cmp, InPt,
Intrinsic::usub_with_overflow);<br>
+ // Reset callers - do not crash by iterating over a dead
instruction.<br>
+ ModifiedDT = true;<br>
+ return true;<br>
+}<br>
+<br>
/// Sink the given CmpInst into user blocks to reduce the
number of virtual<br>
/// registers that must be created and coalesced. This is a
clear win except on<br>
/// targets with multiple condition code registers (PowerPC),
where it might<br>
@@ -1276,14 +1338,17 @@ static bool sinkCmpExpression(CmpInst
*C<br>
return MadeChange;<br>
}<br>
<br>
-static bool optimizeCmpExpression(CmpInst *Cmp, const
TargetLowering &TLI,<br>
- const DataLayout &DL) {<br>
+static bool optimizeCmp(CmpInst *Cmp, const TargetLowering
&TLI,<br>
+ const DataLayout &DL, bool
&ModifiedDT) {<br>
if (sinkCmpExpression(Cmp, TLI))<br>
return true;<br>
<br>
if (combineToUAddWithOverflow(Cmp, TLI, DL))<br>
return true;<br>
<br>
+ if (combineToUSubWithOverflow(Cmp, TLI, DL, ModifiedDT))<br>
+ return true;<br>
+<br>
return false;<br>
}<br>
<br>
@@ -6770,8 +6835,8 @@ bool CodeGenPrepare::optimizeInst(Instru<br>
return false;<br>
}<br>
<br>
- if (CmpInst *CI = dyn_cast<CmpInst>(I))<br>
- if (TLI && optimizeCmpExpression(CI, *TLI, *DL))<br>
+ if (auto *Cmp = dyn_cast<CmpInst>(I))<br>
+ if (TLI && optimizeCmp(Cmp, *TLI, *DL,
ModifiedDT))<br>
return true;<br>
<br>
if (LoadInst *LI = dyn_cast<LoadInst>(I)) {<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Mon Feb 18
15:33:05 2019<br>
@@ -4934,6 +4934,13 @@ bool
X86TargetLowering::shouldScalarizeB<br>
return isOperationLegalOrCustomOrPromote(VecOp.getOpcode(),
ScalarVT);<br>
}<br>
<br>
+bool X86TargetLowering::shouldFormOverflowOp(unsigned Opcode,
EVT VT) const {<br>
+ // TODO: Allow vectors?<br>
+ if (VT.isVector())<br>
+ return false;<br>
+ return VT.isSimple() || !isOperationExpand(Opcode, VT);<br>
+}<br>
+<br>
bool X86TargetLowering::isCheapToSpeculateCttz() const {<br>
// Speculate cttz only if we can directly use TZCNT.<br>
return Subtarget.hasBMI();<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.h?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.h (original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.h Mon Feb 18
15:33:05 2019<br>
@@ -1071,6 +1071,11 @@ namespace llvm {<br>
/// supported.<br>
bool shouldScalarizeBinop(SDValue) const override;<br>
<br>
+ /// Overflow nodes should get combined/lowered to optimal
instructions<br>
+ /// (they should allow eliminating explicit compares by
getting flags from<br>
+ /// math ops).<br>
+ bool shouldFormOverflowOp(unsigned Opcode, EVT VT) const
override;<br>
+<br>
bool storeOfVectorConstantIsCheap(EVT MemVT, unsigned
NumElem,<br>
unsigned AddrSpace)
const override {<br>
// If we can replace more than 2 scalar stores, there
will be a reduction<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/cgp-usubo.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/cgp-usubo.ll?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/cgp-usubo.ll?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/cgp-usubo.ll (original)<br>
+++ llvm/trunk/test/CodeGen/X86/cgp-usubo.ll Mon Feb 18
15:33:05 2019<br>
@@ -7,8 +7,8 @@ define i1 @usubo_ult_i64(i64 %x, i64 %y,<br>
; CHECK-LABEL: usubo_ult_i64:<br>
; CHECK: # %bb.0:<br>
; CHECK-NEXT: subq %rsi, %rdi<br>
-; CHECK-NEXT: movq %rdi, (%rdx)<br>
; CHECK-NEXT: setb %al<br>
+; CHECK-NEXT: movq %rdi, (%rdx)<br>
; CHECK-NEXT: retq<br>
%s = sub i64 %x, %y<br>
store i64 %s, i64* %p<br>
@@ -21,9 +21,8 @@ define i1 @usubo_ult_i64(i64 %x, i64 %y,<br>
define i1 @usubo_ugt_i32(i32 %x, i32 %y, i32* %p) nounwind {<br>
; CHECK-LABEL: usubo_ugt_i32:<br>
; CHECK: # %bb.0:<br>
-; CHECK-NEXT: cmpl %edi, %esi<br>
-; CHECK-NEXT: seta %al<br>
; CHECK-NEXT: subl %esi, %edi<br>
+; CHECK-NEXT: setb %al<br>
; CHECK-NEXT: movl %edi, (%rdx)<br>
; CHECK-NEXT: retq<br>
%ov = icmp ugt i32 %y, %x<br>
@@ -39,8 +38,7 @@ define i1 @usubo_ugt_constant_op0_i8(i8<br>
; CHECK: # %bb.0:<br>
; CHECK-NEXT: movb $42, %cl<br>
; CHECK-NEXT: subb %dil, %cl<br>
-; CHECK-NEXT: cmpb $42, %dil<br>
-; CHECK-NEXT: seta %al<br>
+; CHECK-NEXT: setb %al<br>
; CHECK-NEXT: movb %cl, (%rsi)<br>
; CHECK-NEXT: retq<br>
%s = sub i8 42, %x<br>
@@ -54,10 +52,9 @@ define i1 @usubo_ugt_constant_op0_i8(i8<br>
define i1 @usubo_ult_constant_op0_i16(i16 %x, i16* %p)
nounwind {<br>
; CHECK-LABEL: usubo_ult_constant_op0_i16:<br>
; CHECK: # %bb.0:<br>
-; CHECK-NEXT: movl $43, %ecx<br>
-; CHECK-NEXT: subl %edi, %ecx<br>
-; CHECK-NEXT: cmpw $43, %di<br>
-; CHECK-NEXT: seta %al<br>
+; CHECK-NEXT: movw $43, %cx<br>
+; CHECK-NEXT: subw %di, %cx<br>
+; CHECK-NEXT: setb %al<br>
; CHECK-NEXT: movw %cx, (%rsi)<br>
; CHECK-NEXT: retq<br>
%s = sub i16 43, %x<br>
@@ -71,11 +68,9 @@ define i1 @usubo_ult_constant_op0_i16(i1<br>
define i1 @usubo_ult_constant_op1_i16(i16 %x, i16* %p)
nounwind {<br>
; CHECK-LABEL: usubo_ult_constant_op1_i16:<br>
; CHECK: # %bb.0:<br>
-; CHECK-NEXT: movl %edi, %ecx<br>
-; CHECK-NEXT: addl $-44, %ecx<br>
-; CHECK-NEXT: cmpw $44, %di<br>
+; CHECK-NEXT: subw $44, %di<br>
; CHECK-NEXT: setb %al<br>
-; CHECK-NEXT: movw %cx, (%rsi)<br>
+; CHECK-NEXT: movw %di, (%rsi)<br>
; CHECK-NEXT: retq<br>
%s = add i16 %x, -44<br>
%ov = icmp ult i16 %x, 44<br>
@@ -86,9 +81,8 @@ define i1 @usubo_ult_constant_op1_i16(i1<br>
define i1 @usubo_ugt_constant_op1_i8(i8 %x, i8* %p) nounwind
{<br>
; CHECK-LABEL: usubo_ugt_constant_op1_i8:<br>
; CHECK: # %bb.0:<br>
-; CHECK-NEXT: cmpb $45, %dil<br>
+; CHECK-NEXT: subb $45, %dil<br>
; CHECK-NEXT: setb %al<br>
-; CHECK-NEXT: addb $-45, %dil<br>
; CHECK-NEXT: movb %dil, (%rsi)<br>
; CHECK-NEXT: retq<br>
%ov = icmp ugt i8 45, %x<br>
@@ -102,11 +96,9 @@ define i1 @usubo_ugt_constant_op1_i8(i8<br>
define i1 @usubo_eq_constant1_op1_i32(i32 %x, i32* %p)
nounwind {<br>
; CHECK-LABEL: usubo_eq_constant1_op1_i32:<br>
; CHECK: # %bb.0:<br>
-; CHECK-NEXT: # kill: def $edi killed $edi def $rdi<br>
-; CHECK-NEXT: leal -1(%rdi), %ecx<br>
-; CHECK-NEXT: testl %edi, %edi<br>
-; CHECK-NEXT: sete %al<br>
-; CHECK-NEXT: movl %ecx, (%rsi)<br>
+; CHECK-NEXT: subl $1, %edi<br>
+; CHECK-NEXT: setb %al<br>
+; CHECK-NEXT: movl %edi, (%rsi)<br>
; CHECK-NEXT: retq<br>
%s = add i32 %x, -1<br>
%ov = icmp eq i32 %x, 0<br>
@@ -124,17 +116,14 @@ define i1 @usubo_ult_sub_dominates_i64(i<br>
; CHECK-NEXT: testb $1, %cl<br>
; CHECK-NEXT: je .LBB7_2<br>
; CHECK-NEXT: # %bb.1: # %t<br>
-; CHECK-NEXT: movq %rdi, %rax<br>
-; CHECK-NEXT: subq %rsi, %rax<br>
-; CHECK-NEXT: movq %rax, (%rdx)<br>
-; CHECK-NEXT: testb $1, %cl<br>
-; CHECK-NEXT: je .LBB7_2<br>
-; CHECK-NEXT: # %bb.3: # %end<br>
-; CHECK-NEXT: cmpq %rsi, %rdi<br>
+; CHECK-NEXT: subq %rsi, %rdi<br>
; CHECK-NEXT: setb %al<br>
-; CHECK-NEXT: retq<br>
+; CHECK-NEXT: movq %rdi, (%rdx)<br>
+; CHECK-NEXT: testb $1, %cl<br>
+; CHECK-NEXT: jne .LBB7_3<br>
; CHECK-NEXT: .LBB7_2: # %f<br>
; CHECK-NEXT: movl %ecx, %eax<br>
+; CHECK-NEXT: .LBB7_3: # %end<br>
; CHECK-NEXT: retq<br>
entry:<br>
br i1 %cond, label %t, label %f<br>
<br>
Modified: llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll
(original)<br>
+++ llvm/trunk/test/CodeGen/X86/lsr-loop-exit-cond.ll Mon Feb
18 15:33:05 2019<br>
@@ -16,11 +16,11 @@ define void @t(i8* nocapture %in, i8* no<br>
; GENERIC-NEXT: movl (%rdx), %eax<br>
; GENERIC-NEXT: movl 4(%rdx), %ebx<br>
; GENERIC-NEXT: decl %ecx<br>
-; GENERIC-NEXT: leaq 20(%rdx), %r14<br>
+; GENERIC-NEXT: leaq 20(%rdx), %r11<br>
; GENERIC-NEXT: movq _Te0@{{.*}}(%rip), %r9<br>
; GENERIC-NEXT: movq _Te1@{{.*}}(%rip), %r8<br>
; GENERIC-NEXT: movq _Te3@{{.*}}(%rip), %r10<br>
-; GENERIC-NEXT: movq %rcx, %r11<br>
+; GENERIC-NEXT: movq %rcx, %r14<br>
; GENERIC-NEXT: jmp LBB0_1<br>
; GENERIC-NEXT: .p2align 4, 0x90<br>
; GENERIC-NEXT: LBB0_2: ## %bb1<br>
@@ -29,14 +29,13 @@ define void @t(i8* nocapture %in, i8* no<br>
; GENERIC-NEXT: shrl $16, %ebx<br>
; GENERIC-NEXT: movzbl %bl, %ebx<br>
; GENERIC-NEXT: xorl (%r8,%rbx,4), %eax<br>
-; GENERIC-NEXT: xorl -4(%r14), %eax<br>
+; GENERIC-NEXT: xorl -4(%r11), %eax<br>
; GENERIC-NEXT: shrl $24, %edi<br>
; GENERIC-NEXT: movzbl %bpl, %ebx<br>
; GENERIC-NEXT: movl (%r10,%rbx,4), %ebx<br>
; GENERIC-NEXT: xorl (%r9,%rdi,4), %ebx<br>
-; GENERIC-NEXT: xorl (%r14), %ebx<br>
-; GENERIC-NEXT: decq %r11<br>
-; GENERIC-NEXT: addq $16, %r14<br>
+; GENERIC-NEXT: xorl (%r11), %ebx<br>
+; GENERIC-NEXT: addq $16, %r11<br>
; GENERIC-NEXT: LBB0_1: ## %bb<br>
; GENERIC-NEXT: ## =>This Inner Loop Header: Depth=1<br>
; GENERIC-NEXT: movzbl %al, %edi<br>
@@ -47,16 +46,16 @@ define void @t(i8* nocapture %in, i8* no<br>
; GENERIC-NEXT: movzbl %bpl, %ebp<br>
; GENERIC-NEXT: movl (%r8,%rbp,4), %ebp<br>
; GENERIC-NEXT: xorl (%r9,%rax,4), %ebp<br>
-; GENERIC-NEXT: xorl -12(%r14), %ebp<br>
+; GENERIC-NEXT: xorl -12(%r11), %ebp<br>
; GENERIC-NEXT: shrl $24, %ebx<br>
; GENERIC-NEXT: movl (%r10,%rdi,4), %edi<br>
; GENERIC-NEXT: xorl (%r9,%rbx,4), %edi<br>
-; GENERIC-NEXT: xorl -8(%r14), %edi<br>
+; GENERIC-NEXT: xorl -8(%r11), %edi<br>
; GENERIC-NEXT: movl %ebp, %eax<br>
; GENERIC-NEXT: shrl $24, %eax<br>
; GENERIC-NEXT: movl (%r9,%rax,4), %eax<br>
-; GENERIC-NEXT: testq %r11, %r11<br>
-; GENERIC-NEXT: jne LBB0_2<br>
+; GENERIC-NEXT: subq $1, %r14<br>
+; GENERIC-NEXT: jae LBB0_2<br>
; GENERIC-NEXT: ## %bb.3: ## %bb2<br>
; GENERIC-NEXT: shlq $4, %rcx<br>
; GENERIC-NEXT: andl $-16777216, %eax ## imm = 0xFF000000<br>
@@ -99,27 +98,26 @@ define void @t(i8* nocapture %in, i8* no<br>
; ATOM-NEXT: ## kill: def $ecx killed $ecx def $rcx<br>
; ATOM-NEXT: movl (%rdx), %r15d<br>
; ATOM-NEXT: movl 4(%rdx), %eax<br>
-; ATOM-NEXT: leaq 20(%rdx), %r14<br>
+; ATOM-NEXT: leaq 20(%rdx), %r11<br>
; ATOM-NEXT: movq _Te0@{{.*}}(%rip), %r9<br>
; ATOM-NEXT: movq _Te1@{{.*}}(%rip), %r8<br>
; ATOM-NEXT: movq _Te3@{{.*}}(%rip), %r10<br>
; ATOM-NEXT: decl %ecx<br>
-; ATOM-NEXT: movq %rcx, %r11<br>
+; ATOM-NEXT: movq %rcx, %r14<br>
; ATOM-NEXT: jmp LBB0_1<br>
; ATOM-NEXT: .p2align 4, 0x90<br>
; ATOM-NEXT: LBB0_2: ## %bb1<br>
; ATOM-NEXT: ## in Loop: Header=BB0_1 Depth=1<br>
; ATOM-NEXT: shrl $16, %eax<br>
; ATOM-NEXT: shrl $24, %edi<br>
-; ATOM-NEXT: decq %r11<br>
-; ATOM-NEXT: movzbl %al, %ebp<br>
+; ATOM-NEXT: movzbl %al, %eax<br>
+; ATOM-NEXT: xorl (%r8,%rax,4), %r15d<br>
; ATOM-NEXT: movzbl %bl, %eax<br>
; ATOM-NEXT: movl (%r10,%rax,4), %eax<br>
-; ATOM-NEXT: xorl (%r8,%rbp,4), %r15d<br>
+; ATOM-NEXT: xorl -4(%r11), %r15d<br>
; ATOM-NEXT: xorl (%r9,%rdi,4), %eax<br>
-; ATOM-NEXT: xorl -4(%r14), %r15d<br>
-; ATOM-NEXT: xorl (%r14), %eax<br>
-; ATOM-NEXT: addq $16, %r14<br>
+; ATOM-NEXT: xorl (%r11), %eax<br>
+; ATOM-NEXT: addq $16, %r11<br>
; ATOM-NEXT: LBB0_1: ## %bb<br>
; ATOM-NEXT: ## =>This Inner Loop Header: Depth=1<br>
; ATOM-NEXT: movl %eax, %edi<br>
@@ -132,15 +130,15 @@ define void @t(i8* nocapture %in, i8* no<br>
; ATOM-NEXT: movzbl %r15b, %edi<br>
; ATOM-NEXT: xorl (%r9,%rbp,4), %ebx<br>
; ATOM-NEXT: movl (%r10,%rdi,4), %edi<br>
-; ATOM-NEXT: xorl -12(%r14), %ebx<br>
+; ATOM-NEXT: xorl -12(%r11), %ebx<br>
; ATOM-NEXT: xorl (%r9,%rax,4), %edi<br>
; ATOM-NEXT: movl %ebx, %eax<br>
-; ATOM-NEXT: xorl -8(%r14), %edi<br>
+; ATOM-NEXT: xorl -8(%r11), %edi<br>
; ATOM-NEXT: shrl $24, %eax<br>
; ATOM-NEXT: movl (%r9,%rax,4), %r15d<br>
-; ATOM-NEXT: testq %r11, %r11<br>
+; ATOM-NEXT: subq $1, %r14<br>
; ATOM-NEXT: movl %edi, %eax<br>
-; ATOM-NEXT: jne LBB0_2<br>
+; ATOM-NEXT: jae LBB0_2<br>
; ATOM-NEXT: ## %bb.3: ## %bb2<br>
; ATOM-NEXT: shrl $16, %eax<br>
; ATOM-NEXT: shrl $8, %edi<br>
<br>
Modified:
llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll?rev=354298&r1=354297&r2=354298&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll?rev=354298&r1=354297&r2=354298&view=diff</a><br>
==============================================================================<br>
---
llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll
(original)<br>
+++
llvm/trunk/test/Transforms/CodeGenPrepare/X86/overflow-intrinsics.ll
Mon Feb 18 15:33:05 2019<br>
@@ -175,10 +175,11 @@ define i1 @uaddo_i42_increment_illegal_t<br>
<br>
define i1 @usubo_ult_i64(i64 %x, i64 %y, i64* %p) {<br>
; CHECK-LABEL: @usubo_ult_i64(<br>
-; CHECK-NEXT: [[S:%.*]] = sub i64 [[X:%.*]], [[Y:%.*]]<br>
-; CHECK-NEXT: store i64 [[S]], i64* [[P:%.*]]<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ult i64 [[X]], [[Y]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i64, i1 }
@llvm.usub.with.overflow.i64(i64 [[X:%.*]], i64 [[Y:%.*]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i64, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i64, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i64 [[MATH]], i64* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%s = sub i64 %x, %y<br>
store i64 %s, i64* %p<br>
@@ -190,10 +191,11 @@ define i1 @usubo_ult_i64(i64 %x, i64 %y,<br>
<br>
define i1 @usubo_ugt_i32(i32 %x, i32 %y, i32* %p) {<br>
; CHECK-LABEL: @usubo_ugt_i32(<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ugt i32 [[Y:%.*]],
[[X:%.*]]<br>
-; CHECK-NEXT: [[S:%.*]] = sub i32 [[X]], [[Y]]<br>
-; CHECK-NEXT: store i32 [[S]], i32* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 }
@llvm.usub.with.overflow.i32(i32 [[X:%.*]], i32 [[Y:%.*]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i32, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i32, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i32 [[MATH]], i32* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%ov = icmp ugt i32 %y, %x<br>
%s = sub i32 %x, %y<br>
@@ -205,10 +207,11 @@ define i1 @usubo_ugt_i32(i32 %x, i32 %y,<br>
<br>
define i1 @usubo_ugt_constant_op0_i8(i8 %x, i8* %p) {<br>
; CHECK-LABEL: @usubo_ugt_constant_op0_i8(<br>
-; CHECK-NEXT: [[S:%.*]] = sub i8 42, [[X:%.*]]<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ugt i8 [[X]], 42<br>
-; CHECK-NEXT: store i8 [[S]], i8* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 }
@llvm.usub.with.overflow.i8(i8 42, i8 [[X:%.*]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i8, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i8, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i8 [[MATH]], i8* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%s = sub i8 42, %x<br>
%ov = icmp ugt i8 %x, 42<br>
@@ -220,10 +223,11 @@ define i1 @usubo_ugt_constant_op0_i8(i8<br>
<br>
define i1 @usubo_ult_constant_op0_i16(i16 %x, i16* %p) {<br>
; CHECK-LABEL: @usubo_ult_constant_op0_i16(<br>
-; CHECK-NEXT: [[S:%.*]] = sub i16 43, [[X:%.*]]<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ult i16 43, [[X]]<br>
-; CHECK-NEXT: store i16 [[S]], i16* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i16, i1 }
@llvm.usub.with.overflow.i16(i16 43, i16 [[X:%.*]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i16, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i16, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i16 [[MATH]], i16* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%s = sub i16 43, %x<br>
%ov = icmp ult i16 43, %x<br>
@@ -235,10 +239,11 @@ define i1 @usubo_ult_constant_op0_i16(i1<br>
<br>
define i1 @usubo_ult_constant_op1_i16(i16 %x, i16* %p) {<br>
; CHECK-LABEL: @usubo_ult_constant_op1_i16(<br>
-; CHECK-NEXT: [[S:%.*]] = add i16 [[X:%.*]], -44<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ult i16 [[X]], 44<br>
-; CHECK-NEXT: store i16 [[S]], i16* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i16, i1 }
@llvm.usub.with.overflow.i16(i16 [[X:%.*]], i16 44)<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i16, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i16, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i16 [[MATH]], i16* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%s = add i16 %x, -44<br>
%ov = icmp ult i16 %x, 44<br>
@@ -248,10 +253,11 @@ define i1 @usubo_ult_constant_op1_i16(i1<br>
<br>
define i1 @usubo_ugt_constant_op1_i8(i8 %x, i8* %p) {<br>
; CHECK-LABEL: @usubo_ugt_constant_op1_i8(<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ugt i8 45, [[X:%.*]]<br>
-; CHECK-NEXT: [[S:%.*]] = add i8 [[X]], -45<br>
-; CHECK-NEXT: store i8 [[S]], i8* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 }
@llvm.usub.with.overflow.i8(i8 [[X:%.*]], i8 45)<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i8, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i8, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i8 [[MATH]], i8* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%ov = icmp ugt i8 45, %x<br>
%s = add i8 %x, -45<br>
@@ -263,10 +269,11 @@ define i1 @usubo_ugt_constant_op1_i8(i8<br>
<br>
define i1 @usubo_eq_constant1_op1_i32(i32 %x, i32* %p) {<br>
; CHECK-LABEL: @usubo_eq_constant1_op1_i32(<br>
-; CHECK-NEXT: [[S:%.*]] = add i32 [[X:%.*]], -1<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp eq i32 [[X]], 0<br>
-; CHECK-NEXT: store i32 [[S]], i32* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 }
@llvm.usub.with.overflow.i32(i32 [[X:%.*]], i32 1)<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i32, i1 }
[[TMP1]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i32, i1 }
[[TMP1]], 1<br>
+; CHECK-NEXT: store i32 [[MATH]], i32* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
%s = add i32 %x, -1<br>
%ov = icmp eq i32 %x, 0<br>
@@ -283,14 +290,15 @@ define i1 @usubo_ult_sub_dominates_i64(i<br>
; CHECK-NEXT: entry:<br>
; CHECK-NEXT: br i1 [[COND:%.*]], label [[T:%.*]], label
[[F:%.*]]<br>
; CHECK: t:<br>
-; CHECK-NEXT: [[S:%.*]] = sub i64 [[X:%.*]], [[Y:%.*]]<br>
-; CHECK-NEXT: store i64 [[S]], i64* [[P:%.*]]<br>
+; CHECK-NEXT: [[TMP0:%.*]] = call { i64, i1 }
@llvm.usub.with.overflow.i64(i64 [[X:%.*]], i64 [[Y:%.*]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i64, i1 }
[[TMP0]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i64, i1 }
[[TMP0]], 1<br>
+; CHECK-NEXT: store i64 [[MATH]], i64* [[P:%.*]]<br>
; CHECK-NEXT: br i1 [[COND]], label [[END:%.*]], label
[[F]]<br>
; CHECK: f:<br>
; CHECK-NEXT: ret i1 [[COND]]<br>
; CHECK: end:<br>
-; CHECK-NEXT: [[OV:%.*]] = icmp ult i64 [[X]], [[Y]]<br>
-; CHECK-NEXT: ret i1 [[OV]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
entry:<br>
br i1 %cond, label %t, label %f<br>
@@ -319,10 +327,11 @@ define i1 @usubo_ult_cmp_dominates_i64(i<br>
; CHECK: f:<br>
; CHECK-NEXT: ret i1 [[COND]]<br>
; CHECK: end:<br>
-; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i64 [[X]], [[Y]]<br>
-; CHECK-NEXT: [[S:%.*]] = sub i64 [[X]], [[Y]]<br>
-; CHECK-NEXT: store i64 [[S]], i64* [[P:%.*]]<br>
-; CHECK-NEXT: ret i1 [[TMP0]]<br>
+; CHECK-NEXT: [[TMP0:%.*]] = call { i64, i1 }
@llvm.usub.with.overflow.i64(i64 [[X]], i64 [[Y]])<br>
+; CHECK-NEXT: [[MATH:%.*]] = extractvalue { i64, i1 }
[[TMP0]], 0<br>
+; CHECK-NEXT: [[OV1:%.*]] = extractvalue { i64, i1 }
[[TMP0]], 1<br>
+; CHECK-NEXT: store i64 [[MATH]], i64* [[P:%.*]]<br>
+; CHECK-NEXT: ret i1 [[OV1]]<br>
;<br>
entry:<br>
br i1 %cond, label %t, label %f<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail-m_-8983031789758981807gmail-m_8370979458647442368gmail-m_-1143257621552115361gmail-m_-8399220206292982619gmail_signature">
<div dir="ltr">
<div><span style="font-family:Times;font-size:medium">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small">
<td style="border-top:2px solid rgb(213,15,37)" nowrap>Teresa Johnson |</td>
<td style="border-top:2px solid rgb(51,105,232)" nowrap> Software Engineer |</td>
<td style="border-top:2px solid rgb(0,153,57)" nowrap> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td>
<td style="border-top:2px solid rgb(238,178,17)" nowrap><br>
</td>
</tr>
</tbody>
</table>
</span></div>
</div>
</div>
<br>
</blockquote>
</div>
</blockquote></div>