[Patch]New InstCombine pattern for Icmp

Nuno Lopes nuno.lopes at ist.utl.pt
Thu Aug 14 02:43:41 PDT 2014


BTW, just one more comment regarding the following lines:
+        APInt RRangeLow = UMaxValue - RAddCst->getValue() + 1;
+        APInt LRangeLow = UMaxValue - LAddCst->getValue() + 1;

We have that UMaxValue + 1 == 0. Therefore those lines can be simplified to:
APInt RRangeLow = - RAddCst->getValue();
APInt LRangeLow = - LAddCst->getValue();


The full spec in Alive is:

Pre: C1 u> C3 && C2 u> C3 && isPowerOf2(C1 ^ C2) && isPowerOf2(-C1 ^ -C2) && 
(-C1 ^ -C2) == ((C3-C1) ^ (C3-C2)) && abs(C1-C2) u> C3
%x = add %A, C1
%i = icmp ult %x, C3
%y = add %A, C2
%j = icmp ult %y, C3
%r = or %i, %j
  =>
%and = and %A, ~(C1^C2)
%lhs = add %and, umax(C1,C2)
%r = icmp ult %lhs, C3


Nuno


On Wednesday 13 August 2014 12:59:30 Yi Jiang wrote:
> Ping ...
> 
> On Aug 11, 2014, at 10:19 AM, Yi Jiang <yjiang at apple.com> wrote:
> > Hi Nuno,
> > Yes. it does. We have already extracted the test case to reflect the
> > pattern we want to catch most and will go with the patch. We believe the
> > most useful case is to optimize: ((c) >= 'A' && (c) <= 'Z') || ((c) >=
> > 'a' && (c) <= 'z’).  Thank you.
> > 
> > -Yi
> > 
> > On Aug 11, 2014, at 9:17 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
> >> AFAICT, correctness wise the patch looks good. I've translated your
> >> optimization into Alive by hand and now it seems good. Now I wonder
> >> whether the optimization still kicks in on your benchmarks? (given that
> >> now the constraints are much thigher)
> >> 
> >> Nuno
> >> 
> >> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
> >> Sent: Thursday, August 07, 2014 8:37 AM
> >> Subject: Re: [Patch]New InstCombine pattern for Icmp
> >> 
> >>> With the help of Nuno and the Alive tool, I figured out the new
> >>> condition. The original condition actually refers to the following two
> >>> ranges: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) ==
> >>> [MAX_UINT-C1+1, MAX_UINT-C1+1+C3] and [MAX_UINT-C2+1, MAX_UINT-C2+1+C3]
> >>> We can fold these two range sif:
> >>> 1) C1 and C2 is unsigned greater than C3
> >>> 2) The two ranges are separated.  abs(LowRange1-LowRange2) > C3
> >>> 3) C1 ^ C2 is one-bit mask.
> >>> 4) LowRange1 ^ LowRange2 and HighRange1 ^ HighRange2 are one-bit mask.
> >>> 
> >>> Here is the new patch:
> >>> 
> >>> On Jul 30, 2014, at 2:24 PM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
> >>>> Hi Yi,
> >>>> 
> >>>> ALIVe still complains, I'm afraid:
> >>>> 
> >>>> Precondition: isPowerOf2(C1 ^ C2) && ugt(C1, C3) && ugt(C2, C3)
> >>>> %x = add %A, C1
> >>>> %i = icmp ult %x, C3
> >>>> %y = add %A, C2
> >>>> %j = icmp ult %y, C3
> >>>> %r = or %i, %j
> >>>> =>
> >>>> %and = and %A, ~(C1 ^ C2)
> >>>> %lhs = add %and, umax(C1, C2)
> >>>> %r = icmp ult %lhs, C3
> >>>> 
> >>>> Done: 1
> >>>> ERROR: Mismatch in values of i1 %r
> >>>> 
> >>>> Example:
> >>>> %A i2 = 2 (0x2)
> >>>> C1 i2 = 3 (0x3)
> >>>> %x i2 = 1 (0x1)
> >>>> C3 i2 = 1 (0x1)
> >>>> %i i1 = 0 (0x0)
> >>>> C2 i2 = 2 (0x2)
> >>>> %y i2 = 0 (0x0)
> >>>> %j i1 = 1 (0x1)
> >>>> %and i2 = 2 (0x2)
> >>>> %lhs i2 = 1 (0x1)
> >>>> Source value: 1 (0x1)
> >>>> Target value: 0 (0x0)
> >>>> 
> >>>> 
> >>>> Nuno
> >>>> 
> >>>> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
> >>>> To: "Nuno Lopes" <nuno.lopes at ist.utl.pt>
> >>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
> >>>> Sent: Wednesday, July 30, 2014 8:28 PM
> >>>> Subject: Re: [Patch]New InstCombine pattern for Icmp
> >>>> 
> >>>> 
> >>>> Hi Nuno,
> >>>> 
> >>>> Thank you for pointing out this! One condition is missing: C1 and C2
> >>>> should be unsigned greater than C3. Here is the new patch.
> >>>> 
> >>>> -Yi
> >>>> 
> >>>> On Jul 29, 2014, at 11:34 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
> >>>>> Hi,
> >>>>> 
> >>>>> I sent you an email the other day, but it was rejected by Apple's
> >>>>> email server.  Tyring now through another email account. Please take
> >>>>> a look here:
> >>>>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140721/
> >>>>> 227575.html
> >>>>> 
> >>>>> Nuno
> >>>>> 
> >>>>> ----- Original Message -----
> >>>>> 
> >>>>> Ping...
> >>>>> 
> >>>>> On Jul 25, 2014, at 4:11 PM, Yi Jiang <yjiang at apple.com> wrote:
> >>>>>> Hi Ben,
> >>>>>> 
> >>>>>> Thank you for your comments! Here is a new version.  In this version,
> >>>>>> I still kept some "if" thinking that some operand has been already
> >>>>>> extracted so I would like to use it directly.  Please let me know if
> >>>>>> any other comments. <combinecmpV2.patch>
> >>>>>> -Yi
> >>>>>> 
> >>>>>> On Jul 25, 2014, at 2:12 AM, Benjamin Kramer <benny.kra at gmail.com> 
wrote:
> >>>>>>> On Fri, Jul 25, 2014 at 1:24 AM, Yi Jiang <yjiang at apple.com> wrote:
> >>>>>>>> Hi,
> >>>>>>>> 
> >>>>>>>> his patch is trying to fold (icmp ult/ule (A + C1), C3) | (icmp
> >>>>>>>> ult/ule (A + C2), C3)  to (icmp ult/ule ((A & ~(C1 ^ C2)) +
> >>>>>>>> max(C1, C2)), C3)  . This transformation is legal if C1 ^ C2 is
> >>>>>>>> one-bit mask, in other word, C1 is only one bit different from C2.
> >>>>>>>> In this case, we can "mask" that bit and do just one comparison.
> >>>>>>>> A typical example is:
> >>>>>>>> isALPHA(c)    ((c) >= 'A' && (c) <= 'Z') || ((c) >= 'a' && (c) <=
> >>>>>>>> 'z')
> >>>>>>>> 
> >>>>>>>> Now llvm will optimize it to ((c + 191) <=25) || ((c + 159) <=25)
> >>>>>>>> 
> >>>>>>>> With this patch, we can optimize it further to:
> >>>>>>>> (c & 223) + 191 <= 25
> >>>>>>>> 
> >>>>>>>> The binary format of the constants are:
> >>>>>>>> 191   10111111
> >>>>>>>> 159   10011111
> >>>>>>>> 223   11011111
> >>>>>>>> 
> >>>>>>>> Here is some experiment result on arm64:
> >>>>>>>> The patch shows no regression and improve spec2000 perlbmk 3.8% in
> >>>>>>>> test-suite under -O3.
> >>>>>>>> We also test the spec2006 400.perlbench with ref size input, it
> >>>>>>>> will improve 1% under -O3 and 1.2% under -O3+lto+pgo.
> >>>>>>> 
> >>>>>>> Wow, very nice.
> >>>>>>> 
> >>>>>>>> Any comments are appreciated.
> >>>>>>> 
> >>>>>>> The patch could be significantly simplified with the PatternMatch
> >>>>>>> tool. Something like match(LHS, m_Add(m_OneUse(m_Value(A),
> >>>>>>> m_ConstantInt(LAddCst)... could replace your if chain.
> >>>>>>> 
> >>>>>>> - Ben




More information about the llvm-commits mailing list