[PATCH] D52001: [InstCombine] Inefficient pattern for high-bits checking 2 (PR38708)

Thu Sep 13 12:22:26 PDT 2018

spatel accepted this revision.
spatel added a comment.
This revision is now accepted and ready to land.

In https://reviews.llvm.org/D52001#1233786, @lebedev.ri wrote:

> In https://reviews.llvm.org/D52001#1233768, @spatel wrote:
>
> > I was wondering if it would be easier to read if we swapped everything as a first step, so something like this in the existing code:
> >
> >   // We want X to be the icmp's second operand, so swap if not.
> >   Value *Cmp0 = Cmp.getOperand(0), *X = Cmp.getOperand(1);
> >   ICmpInst::Predicate Pred = Cmp.getPredicate(), NewPred;
> >   if (match(X, m_OneUse(m_Shl(m_One(), m_Value())))) {
> >     std::swap(Cmp0, X);
> >     Pred = Cmp.getSwappedPredicate();
> >   }
> >   Value *Y;
> >   if (!match(Cmp0, m_OneUse(m_Shl(m_One(), m_Value(Y)))))
> >     return nullptr;
> >   if (Pred == ICmpInst::ICMP_ULE) NewPred = ICmpInst::ICMP_NE;
> >   else if (Pred == ICmpInst::ICMP_UGT) NewPred = ICmpInst::ICMP_EQ;
> >   else return nullptr;
> >   
> >   
>
>
> I'm not sure.
>  I think we really should have two separate matches, and two `switch()`es.
>  Else i think we may use the wrong predicate..
>  Also, i *think* i will add one/two patterns to this new matcher (less canonical variants with extra uses),
>  so specifying the pattern twice seems sub-par.

Ok - just wanted to throw it out as a possibility. I agree that the switch version is less likely to go buggy.

>> 
>> 
>>   define i1 @p0(i8 %shamt0, i8 %shamt1) {
>>     %t0 = shl i8 1, %shamt0
>>     %t1 = shl i8 1, %shamt1
>>     %r = icmp ugt i8 %t0, %t1
>>     ret i1 %r
>>   }
>>   
>> 
>> 
>> With the current code, it gets transformed to:
>> 
>>   define i1 @p0(i8 %shamt0, i8 %shamt1) {
>>     %t1 = shl i8 1, %shamt1
>>     %t1.highbits = lshr i8 %t1, %shamt0
>>     %r = icmp eq i8 %t1.highbits, 0
>>     ret i1 %r
>>   }
>>   
>> 
>> 
>> It should be reduced:
>> 
>>   %t0 = shl i8 1, %shamt0
>>   %t1 = shl i8 1, %shamt1
>>   %r = icmp ugt i8 %t0, %t1
>>   =>
>>   %r = icmp ugt i8 %shamt0, %shamt1
>>    
>> 
>> I don't think we need to hold up this patch for that, but maybe it changes the way we want to implement it?
> 
> I acknowledge that there is some problem when we have the same/similar pattern on the both sides,
>  i have thought about it a bit (https://reviews.llvm.org/rL342076), but i don't have anything concrete on that.

Sounds good. Please add tests where both sides match, so we have some evidence of the missing folds. Mostly, I'm paranoid that we'll open up some infinite looping scenario if we don't have tests for those unexpected patterns. As we're finding with min/max, it's hard to see those problems in advance.

LGTM

Repository:
  rL LLVM

https://reviews.llvm.org/D52001