[PATCH] PR19958 wrong code at -O1 and above on x86_64-linux-gnu (InstCombine)

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Jul 11 16:06:51 PDT 2014


> On 2014-Jul-11, at 13:37, suyog sarda <sardask01 at gmail.com> wrote:
> 
> + Duncan
> 
> On Sat, Jul 12, 2014 at 1:54 AM, suyog <suyog.sarda at samsung.com> wrote:
> Hi Duncan,
> 
> Thanks for your review.
> Couldn't spot the miscompiles earlier because i didn't comment out
> the 'simplify' call. Now i modified the code and tested almost every
> combination.
> 
> I have modified the code as per your suggestions :
> - Made comments short and crisp as per your suggestion.
> - included lambda helper functions to return appropriate constant
>   and icmp instruction.
> - included logic for both constants -1 for ashr as suggested
>   (In my opinion if both constants are equal and not -1,
>    then final icmp would be icmp eq/ne A, 0. The Predicate won't
>    get inversed and will remain same. You had suggested using lambda
>    function here to get inversed predicate. Correct me if my analysis is wrong.)
> - used 'bool IsAShr' to avoid recalculation.
> - removed LShrOperator block which was causing miscompiles.
> - removed unnecessary shl/lshr distinction for exact and non exact
> - moved all test cases to new file icmp-ashr.ll
> - used msb_high/low instead of +ve/-ve and opposite msb
> - tried including all combinations of ne/eq + exact/non-exact + lshr/ashr
> 
> Please help in reviewing the patch.
> 
> Your comments/suggestions are valuable and most awaited. :)
> 
> Thanks,
> Suyog
> 
> http://reviews.llvm.org/D4068
> 
> Files:
>   lib/Transforms/InstCombine/InstCombineCompares.cpp
>   test/Transforms/InstCombine/icmp-shr.ll

This is getting close.  I have a bunch of style changes below.

In terms of overall code organization:

 1. You should drop this block below the `switch` that canonicalises
    LE/GE to GT/LT.  Eventually, you're planning to expand this logic to
    handle the rest of `icmp`, so this will be a better spot.

 2. Most of the logic should move to its own function.  All that should
    be in `InstCombiner::visitICmpInst()` is:

        if (I.isEquality()) {
          ConstantInt *CI2;
          if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(A))) ||
              match(Op0, m_LShr(m_ConstantInt(CI2), m_Value(A))))
            return foldICmpCstShrCst(I, CI, Op0, CI2, A);
        }

    In the new function `foldICmpCstShrCst()`, you have the opportunity
    to pick clearer variable names and un-indent the logic.  More
    importantly, `InstCombiner::visitICmpInst()` will be easier to read.

    `foldICmpCstShrCst()` should start with an assert:

        assert(I.isEquality() && "Cannot fold icmp gt/lt");

    When you come back to optimize GT and LT, you can remove both of the
    `isEquality()` checks at the same time.

> Index: lib/Transforms/InstCombine/InstCombineCompares.cpp
> ===================================================================
> --- lib/Transforms/InstCombine/InstCombineCompares.cpp
> +++ lib/Transforms/InstCombine/InstCombineCompares.cpp
> @@ -2459,6 +2459,99 @@
>        return new ICmpInst(I.getPredicate(), A, B);
>      }
>  
> +    // PR19753:

Leave this out.  You'll include this in the commit message, which is all
the mention the PR needs.

> +    // (icmp eq/ne (ashr/lshr const2, A), const1) ->
> +    // (icmp eq/ne A, Log2(const2/const1)) ->
> +    // (icmp eq/ne A, Log2(const2) - Log2(const1)).
> +    // TODO : Handle this for other icmp instructions.

Leave this TODO out also.  That can just be tracked by a PR.

> +    if (I.isEquality()) {
> +      ConstantInt *CI2;
> +      if (match(Op0, m_AShr(m_ConstantInt(CI2), m_Value(A))) ||
> +          match(Op0, m_LShr(m_ConstantInt(CI2), m_Value(A)))) {
> +        APInt AP1 = CI->getValue();
> +        APInt AP2 = CI2->getValue();
> +        int Shift;
> +        bool IsAShr = isa<AShrOperator>(Op0);

This variables should be declared and defined just before each of their
first uses.

> +
> +        // Helper lambda function to return appropriate constants.

I don't think this comment adds anything.  I think these lambdas are
short enough that they're self-explanatory.

> +        auto getConstant = [&I, this](bool IsTrue) {
> +          if (I.getPredicate() == I.ICMP_NE)
> +            IsTrue = !IsTrue;
> +          return ReplaceInstUsesWith(I, ConstantInt::get(I.getType(), IsTrue));
> +        };
> +
> +        // Helper lambda function to return appropriate icmp instruction.

Leave this out too.

> +        auto getICmp = [&I](CmpInst::Predicate Pred, Value *LHS, Value *RHS) {
> +          if (I.getPredicate() == I.ICMP_NE)
> +            Pred = CmpInst::getInversePredicate(Pred);
> +          return new ICmpInst(Pred, LHS, RHS);
> +        };
> +

Move AP1 and AP2 to here.

> +        if (!AP1) {
> +          if (!AP2) {

These braces aren't necessary.

> +            // Both Constant are 0.

s/Constant/constants/

> +            return getConstant(true);
> +          }
> +
> +          if (cast<BinaryOperator>(Op0)->isExact())
> +            return getConstant(false);
> +
> +          if (AP2.isNegative()) {

These braces aren't necessary.

> +            // MSB is set, so a lshr with a large enough 'A' would be undefined.
> +            return getConstant(false);
> +          }
> +
> +          // 'A' must be large enough to shift out the MSB.

s/MSB/highest set bit/

You got this comment from my last review, but it's wrong (sorry).  The
MSB is the sign bit (queried by `isNegative()`) -- that's a different
thing.  I think "highest set bit" is better.

> +          Shift = AP2.logBase2();

I don't think using `Shift` helps here.  Fold it into the statement that
follows.

> +          return getICmp(I.ICMP_UGT, A, ConstantInt::get(A->getType(), Shift));
> +        }
> +
> +        if (!AP2) {

These braces aren't necessary.

> +          // Shifting 0 by any value gives 0.
> +          return getConstant(false);
> +        }
> +

Define `IsAShr` here, just before its first use.

> +        // if const1 = const2 -> icmp eq/ne A, 0

I think you should leave this comment out.  The code that follows is
easy to read -- you don't need to say what it does.  (When I see a
comment, I stop reading the code and read the comment first.  If it's
just describing exactly what the code says, it's not useful.)

> +        if (AP1 == AP2) {
> +          if (AP1.isAllOnesValue() && IsAShr) {

These braces aren't necessary.

> +            // Arithmatic shift of -1 is -1

Add a '.' at the end of this sentence, and I'd say "is always" instead
of "is".

> +            return getConstant(true);
> +          }
> +          return new ICmpInst(I.getPredicate(), A,
> +                              ConstantInt::getNullValue(A->getType()));

For consistency, you should use `getICmp(ICMP_EQ)` here.

> +        }
> +
> +        if (IsAShr) {
> +          if (AP1.isNegative() != AP2.isNegative()) {

These braces aren't necessary.

> +            // Arithmetic shift will never change the sign.
> +            return getConstant(false);
> +          }
> +
> +          // Both the constants are negative, take their positive to calculate
> +          // log.
> +          if (AP1.isNegative()) {
> +            AP1 = -AP1;
> +            AP2 = -AP2;
> +          }
> +        }
> +
> +        if (AP1.ugt(AP2)) {

These braces aren't necessary.

> +          // Right-shifting will not increase the value

Add a '.' at the end of this sentence.

> +          return getConstant(false);
> +        }
> +
> +        // Get the distance between the highest bit that's set.
> +        Shift = AP2.logBase2() - AP1.logBase2();

Declare `Shift` right here where it's defined; i.e.:

    int Shift = AP2.logBase2() - AP1.logBase2();

> +
> +        // Use lshr here, since we've canonicalized to +ve numbers.
> +        if (AP1 == AP2.lshr(Shift))
> +          return new ICmpInst(I.getPredicate(), A,
> +                              ConstantInt::get(A->getType(), Shift));

For consistency, you should use `getICmp(ICMP_EQ)` here.

Also, I think a blank line before the succeeding comment will make this
easier to read.

> +        // Shifting const2 will never be equal to const1.
> +        return getConstant(false);
> +      }
> +    }
> +
>      // If we have an icmp le or icmp ge instruction, turn it into the
>      // appropriate icmp lt or icmp gt instruction.  This allows us to rely on
>      // them being folded in the code below.  The SimplifyICmpInst code has
> 




More information about the llvm-commits mailing list