[PATCH] PR19958 wrong code at -O1 and above on x86_64-linux-gnu (InstCombine)
suyog
suyog.sarda at samsung.com
Mon Jun 9 04:07:25 PDT 2014
Hi rafael, spatel, nlewycky,
This patch rectifies icmp instruction combine problem seen because of previous optimization patch for 'ashr/lshr exact' as mentioned in bug 19958.
I admit i wrote the previous patch keeping only equality in mind (not considering all cases is bad practice, will be careful in future).
This patch for now handles 'icmp eq/ne' only for 'ashr/lsher exact'. Added relevant test cases.
For other icmp instructions, some observations:
define i1 @f(i32 %x) {
%y = lshr exact i32 8, %x
%cmp = icmp ult i32 %y, 2
ret i1 %cmp
}
Here, in above example, as %x increases, %y decreases and hence comparison with same icmp instruction was giving wrong output.
while in ,
define i1 @f(i32 %x) {
%y = lshr exact i32 -8, %x
%cmp = icmp ult i32 %y, -2
ret i1 %cmp
}
as %x increases, %y also increases mathematically. Hence comparison with same icmp instruction is fine in this case.
This was not captured in original patch and hence wrong output. This can be handled in following two ways:
1. if const1,const2 are both positive, then icmp instruction should be swapped (something like getSwappedPredicate()).
if const1,const2 are both negative, then keep the icmp instruction as it is.
Note : getInversePredicate() won't work.
2. if const1/const2 are both positive, swap the operands, keeping the icmp instruction as it is.
No change if const1,const2 are both negative.
Note : Cases where const1 and const2 have opposite sign - one is positive other is negative are always true/false.
This is handled by simplifyicmp call and it never reaches our patch. I checked it with both ashr/lshr for this case.
I will verify if above two approaches for other icmp instructions are universally true for both ashr/lshr and come up with another patch.
Added a TODO for the same.
Please review this patch which handles equality for ashr/lshr exact.
Any comments/suggestions are most welcomed.
Thanks.
- Suyog
http://reviews.llvm.org/D4068
Files:
lib/Transforms/InstCombine/InstCombineCompares.cpp
test/Transforms/InstCombine/icmp.ll
Index: lib/Transforms/InstCombine/InstCombineCompares.cpp
===================================================================
--- lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -2318,6 +2318,30 @@
return GlobalSwapBenefits > 0;
}
+// Helper function to check whether Op represents a lshr/ashr exact
+// instruction. For example:
+// (icmp eq/ne (ashr exact const2, A), const1) -> icmp eq/ne A, Log2(const2/const1)
+// Here if Op represents -> (ashr exact const2, A), and CI represents
+// const1, we compute Quotient as const2/const1.
+static bool checkShrExact(Value *Op, APInt &Quotient, const ConstantInt *CI,
+ Value *&A) {
+ ConstantInt *CI2;
+ if (match(Op, m_AShr(m_ConstantInt(CI2), m_Value(A))) &&
+ (cast<BinaryOperator>(Op)->isExact())) {
+ Quotient = CI2->getValue().sdiv(CI->getValue());
+ return true;
+ }
+
+ // Handle the case for lhsr.
+ if (match(Op, m_LShr(m_ConstantInt(CI2), m_Value(A))) &&
+ (cast<BinaryOperator>(Op)->isExact())) {
+ Quotient = CI2->getValue().udiv(CI->getValue());
+ return true;
+ }
+
+ return false;
+}
+
Instruction *InstCombiner::visitICmpInst(ICmpInst &I) {
bool Changed = false;
Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
@@ -2439,6 +2463,21 @@
return new ICmpInst(I.getPredicate(), A, B);
}
+ // PR19753:
+ // (icmp eq/ne (ashr exact const2, A), const1) -> icmp eq/ne A, Log2(const2/const1)
+ // Cases where const1 doesn't divide const2 exactly or Quotient is not
+ // exact of log2 are handled by SimplifyICmpInst call above where we
+ // return false. Similar for lshr.
+ // TODO : Handle this for other icmp instructions.
+ {
+ APInt Quotient;
+ if (I.isEquality() && checkShrExact(Op0, Quotient, CI, A)) {
+ unsigned shift = Quotient.logBase2();
+ return new ICmpInst(I.getPredicate(), A,
+ ConstantInt::get(A->getType(), shift));
+ }
+ }
+
// If we have an icmp le or icmp ge instruction, turn it into the
// appropriate icmp lt or icmp gt instruction. This allows us to rely on
// them being folded in the code below. The SimplifyICmpInst code has
Index: test/Transforms/InstCombine/icmp.ll
===================================================================
--- test/Transforms/InstCombine/icmp.ll
+++ test/Transforms/InstCombine/icmp.ll
@@ -1382,3 +1382,19 @@
%2 = icmp slt i32 %1, -10
ret i1 %2
}
+
+; CHECK-LABEL: @exact_ashr_eq
+; CHECK-NEXT: icmp eq i32 %a, 1
+define i1 @exact_ashr_eq(i32 %a) {
+ %shr = ashr exact i32 -30, %a
+ %cmp = icmp eq i32 %shr, -15
+ ret i1 %cmp
+}
+
+; CHECK-LABEL: @exact_lhsr_eq
+; CHECK-NEXT: icmp eq i32 %a, 3
+define i1 @exact_lhsr_eq(i32 %a) {
+ %shr = lshr exact i32 80, %a
+ %cmp = icmp eq i32 %shr, 10
+ ret i1 %cmp
+}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D4068.10235.patch
Type: text/x-patch
Size: 2882 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140609/1c95558f/attachment.bin>
More information about the llvm-commits
mailing list