[Patch]New InstCombine pattern for Icmp

Nuno Lopes nuno.lopes at ist.utl.pt
Wed Aug 20 07:04:22 PDT 2014


LGTM.
Nuno

-----Original Message----- 
From: Yi Jiang
Sent: Monday, August 18, 2014 8:01 PM
Subject: Re: [Patch]New InstCombine pattern for Icmp

Hi Gerolf, Chandler,David and Nuno,

Thank you for all your comments and sorry that I did not check the Ping 
policy before. Here is the new patch addressing the concerns:


-Yi
On Aug 14, 2014, at 2:43 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:

> BTW, just one more comment regarding the following lines:
> +        APInt RRangeLow = UMaxValue - RAddCst->getValue() + 1;
> +        APInt LRangeLow = UMaxValue - LAddCst->getValue() + 1;
>
> We have that UMaxValue + 1 == 0. Therefore those lines can be simplified 
> to:
> APInt RRangeLow = - RAddCst->getValue();
> APInt LRangeLow = - LAddCst->getValue();
>
>
> The full spec in Alive is:
>
> Pre: C1 u> C3 && C2 u> C3 && isPowerOf2(C1 ^ C2) && isPowerOf2(-C1 ^ -C2) 
> &&
> (-C1 ^ -C2) == ((C3-C1) ^ (C3-C2)) && abs(C1-C2) u> C3
> %x = add %A, C1
> %i = icmp ult %x, C3
> %y = add %A, C2
> %j = icmp ult %y, C3
> %r = or %i, %j
>  =>
> %and = and %A, ~(C1^C2)
> %lhs = add %and, umax(C1,C2)
> %r = icmp ult %lhs, C3
>
>
> Nuno
>
>
> On Wednesday 13 August 2014 12:59:30 Yi Jiang wrote:
>> Ping ...
>>
>> On Aug 11, 2014, at 10:19 AM, Yi Jiang <yjiang at apple.com> wrote:
>>> Hi Nuno,
>>> Yes. it does. We have already extracted the test case to reflect the
>>> pattern we want to catch most and will go with the patch. We believe the
>>> most useful case is to optimize: ((c) >= 'A' && (c) <= 'Z') || ((c) >=
>>> 'a' && (c) <= 'z’).  Thank you.
>>>
>>> -Yi
>>>
>>> On Aug 11, 2014, at 9:17 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
>>>> AFAICT, correctness wise the patch looks good. I've translated your
>>>> optimization into Alive by hand and now it seems good. Now I wonder
>>>> whether the optimization still kicks in on your benchmarks? (given that
>>>> now the constraints are much thigher)
>>>>
>>>> Nuno
>>>>
>>>> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
>>>> Sent: Thursday, August 07, 2014 8:37 AM
>>>> Subject: Re: [Patch]New InstCombine pattern for Icmp
>>>>
>>>>> With the help of Nuno and the Alive tool, I figured out the new
>>>>> condition. The original condition actually refers to the following two
>>>>> ranges: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) ==
>>>>> [MAX_UINT-C1+1, MAX_UINT-C1+1+C3] and [MAX_UINT-C2+1, 
>>>>> MAX_UINT-C2+1+C3]
>>>>> We can fold these two range sif:
>>>>> 1) C1 and C2 is unsigned greater than C3
>>>>> 2) The two ranges are separated.  abs(LowRange1-LowRange2) > C3
>>>>> 3) C1 ^ C2 is one-bit mask.
>>>>> 4) LowRange1 ^ LowRange2 and HighRange1 ^ HighRange2 are one-bit mask.
>>>>>
>>>>> Here is the new patch:
>>>>>
>>>>> On Jul 30, 2014, at 2:24 PM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
>>>>>> Hi Yi,
>>>>>>
>>>>>> ALIVe still complains, I'm afraid:
>>>>>>
>>>>>> Precondition: isPowerOf2(C1 ^ C2) && ugt(C1, C3) && ugt(C2, C3)
>>>>>> %x = add %A, C1
>>>>>> %i = icmp ult %x, C3
>>>>>> %y = add %A, C2
>>>>>> %j = icmp ult %y, C3
>>>>>> %r = or %i, %j
>>>>>> =>
>>>>>> %and = and %A, ~(C1 ^ C2)
>>>>>> %lhs = add %and, umax(C1, C2)
>>>>>> %r = icmp ult %lhs, C3
>>>>>>
>>>>>> Done: 1
>>>>>> ERROR: Mismatch in values of i1 %r
>>>>>>
>>>>>> Example:
>>>>>> %A i2 = 2 (0x2)
>>>>>> C1 i2 = 3 (0x3)
>>>>>> %x i2 = 1 (0x1)
>>>>>> C3 i2 = 1 (0x1)
>>>>>> %i i1 = 0 (0x0)
>>>>>> C2 i2 = 2 (0x2)
>>>>>> %y i2 = 0 (0x0)
>>>>>> %j i1 = 1 (0x1)
>>>>>> %and i2 = 2 (0x2)
>>>>>> %lhs i2 = 1 (0x1)
>>>>>> Source value: 1 (0x1)
>>>>>> Target value: 0 (0x0)
>>>>>>
>>>>>>
>>>>>> Nuno
>>>>>>
>>>>>> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
>>>>>> To: "Nuno Lopes" <nuno.lopes at ist.utl.pt>
>>>>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
>>>>>> Sent: Wednesday, July 30, 2014 8:28 PM
>>>>>> Subject: Re: [Patch]New InstCombine pattern for Icmp
>>>>>>
>>>>>>
>>>>>> Hi Nuno,
>>>>>>
>>>>>> Thank you for pointing out this! One condition is missing: C1 and C2
>>>>>> should be unsigned greater than C3. Here is the new patch.
>>>>>>
>>>>>> -Yi
>>>>>>
>>>>>> On Jul 29, 2014, at 11:34 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> 
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I sent you an email the other day, but it was rejected by Apple's
>>>>>>> email server.  Tyring now through another email account. Please take
>>>>>>> a look here:
>>>>>>> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140721/
>>>>>>> 227575.html
>>>>>>>
>>>>>>> Nuno
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>
>>>>>>> Ping...
>>>>>>>
>>>>>>> On Jul 25, 2014, at 4:11 PM, Yi Jiang <yjiang at apple.com> wrote:
>>>>>>>> Hi Ben,
>>>>>>>>
>>>>>>>> Thank you for your comments! Here is a new version.  In this 
>>>>>>>> version,
>>>>>>>> I still kept some "if" thinking that some operand has been already
>>>>>>>> extracted so I would like to use it directly.  Please let me know 
>>>>>>>> if
>>>>>>>> any other comments. <combinecmpV2.patch>
>>>>>>>> -Yi
>>>>>>>>
>>>>>>>> On Jul 25, 2014, at 2:12 AM, Benjamin Kramer <benny.kra at gmail.com>
> wrote:
>>>>>>>>> On Fri, Jul 25, 2014 at 1:24 AM, Yi Jiang <yjiang at apple.com> 
>>>>>>>>> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> his patch is trying to fold (icmp ult/ule (A + C1), C3) | (icmp
>>>>>>>>>> ult/ule (A + C2), C3)  to (icmp ult/ule ((A & ~(C1 ^ C2)) +
>>>>>>>>>> max(C1, C2)), C3)  . This transformation is legal if C1 ^ C2 is
>>>>>>>>>> one-bit mask, in other word, C1 is only one bit different from 
>>>>>>>>>> C2.
>>>>>>>>>> In this case, we can "mask" that bit and do just one comparison.
>>>>>>>>>> A typical example is:
>>>>>>>>>> isALPHA(c)    ((c) >= 'A' && (c) <= 'Z') || ((c) >= 'a' && (c) <=
>>>>>>>>>> 'z')
>>>>>>>>>>
>>>>>>>>>> Now llvm will optimize it to ((c + 191) <=25) || ((c + 159) <=25)
>>>>>>>>>>
>>>>>>>>>> With this patch, we can optimize it further to:
>>>>>>>>>> (c & 223) + 191 <= 25
>>>>>>>>>>
>>>>>>>>>> The binary format of the constants are:
>>>>>>>>>> 191   10111111
>>>>>>>>>> 159   10011111
>>>>>>>>>> 223   11011111
>>>>>>>>>>
>>>>>>>>>> Here is some experiment result on arm64:
>>>>>>>>>> The patch shows no regression and improve spec2000 perlbmk 3.8% 
>>>>>>>>>> in
>>>>>>>>>> test-suite under -O3.
>>>>>>>>>> We also test the spec2006 400.perlbench with ref size input, it
>>>>>>>>>> will improve 1% under -O3 and 1.2% under -O3+lto+pgo.
>>>>>>>>>
>>>>>>>>> Wow, very nice.
>>>>>>>>>
>>>>>>>>>> Any comments are appreciated.
>>>>>>>>>
>>>>>>>>> The patch could be significantly simplified with the PatternMatch
>>>>>>>>> tool. Something like match(LHS, m_Add(m_OneUse(m_Value(A),
>>>>>>>>> m_ConstantInt(LAddCst)... could replace your if chain.
>>>>>>>>>
>>>>>>>>> - Ben 




More information about the llvm-commits mailing list