[Patch]New InstCombine pattern for Icmp

David Majnemer david.majnemer at gmail.com
Wed Aug 13 16:44:51 PDT 2014


Hi Yi,

Both LAdd and RAdd come from a dyn_cast but are immediately sent to a call
to match.  Are you asserting that the dyn_cast could never fail? If so,
please replace the dyn_casts with cast.

The definition of MaxAddCst can be sinked inside of the 'if (LAddOpnd ==
RAddOpnd)' check.

DiffCst is required to be a power of 2 for this transform to work, I think
it would make sense to perform that check separately and before a lot of
the other APInt math; APInts may hold numbers larger than 2**64, it would
be nice if we could avoid the big-int slow path code where we can. In fact,
it would be nice to compute as little as possible as we go along making
checks.

The line computing RangeDiff is considerably more than 80 columns, can you
please run clang-format on that line?

Finally, please remove the trailing whitespace from your combine.

Please keep in mind our developer policy when pinging patches:
http://llvm.org/docs/DeveloperPolicy.html#code-reviews



On Wed, Aug 13, 2014 at 12:59 PM, Yi Jiang <yjiang at apple.com> wrote:

> Ping ...
> On Aug 11, 2014, at 10:19 AM, Yi Jiang <yjiang at apple.com> wrote:
>
> > Hi Nuno,
> > Yes. it does. We have already extracted the test case to reflect the
> pattern we want to catch most and will go with the patch. We believe the
> most useful case is to optimize:
> > ((c) >= 'A' && (c) <= 'Z') || ((c) >= 'a' && (c) <= 'z’).  Thank you.
> >
> > -Yi
> >
> > On Aug 11, 2014, at 9:17 AM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
> >
> >> AFAICT, correctness wise the patch looks good. I've translated your
> optimization into Alive by hand and now it seems good.
> >> Now I wonder whether the optimization still kicks in on your
> benchmarks? (given that now the constraints are much thigher)
> >>
> >> Nuno
> >>
> >> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
> >> Sent: Thursday, August 07, 2014 8:37 AM
> >> Subject: Re: [Patch]New InstCombine pattern for Icmp
> >>
> >>
> >>> With the help of Nuno and the Alive tool, I figured out the new
> condition. The original condition actually refers to the following two
> ranges:
> >>> (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) ==
> [MAX_UINT-C1+1, MAX_UINT-C1+1+C3] and [MAX_UINT-C2+1, MAX_UINT-C2+1+C3]
> >>> We can fold these two range sif:
> >>> 1) C1 and C2 is unsigned greater than C3
> >>> 2) The two ranges are separated.  abs(LowRange1-LowRange2) > C3
> >>> 3) C1 ^ C2 is one-bit mask.
> >>> 4) LowRange1 ^ LowRange2 and HighRange1 ^ HighRange2 are one-bit mask.
> >>>
> >>> Here is the new patch:
> >>>
> >>>
> >>> On Jul 30, 2014, at 2:24 PM, Nuno Lopes <nuno.lopes at ist.utl.pt> wrote:
> >>>
> >>>> Hi Yi,
> >>>>
> >>>> ALIVe still complains, I'm afraid:
> >>>>
> >>>> Precondition: isPowerOf2(C1 ^ C2) && ugt(C1, C3) && ugt(C2, C3)
> >>>> %x = add %A, C1
> >>>> %i = icmp ult %x, C3
> >>>> %y = add %A, C2
> >>>> %j = icmp ult %y, C3
> >>>> %r = or %i, %j
> >>>> =>
> >>>> %and = and %A, ~(C1 ^ C2)
> >>>> %lhs = add %and, umax(C1, C2)
> >>>> %r = icmp ult %lhs, C3
> >>>>
> >>>> Done: 1
> >>>> ERROR: Mismatch in values of i1 %r
> >>>>
> >>>> Example:
> >>>> %A i2 = 2 (0x2)
> >>>> C1 i2 = 3 (0x3)
> >>>> %x i2 = 1 (0x1)
> >>>> C3 i2 = 1 (0x1)
> >>>> %i i1 = 0 (0x0)
> >>>> C2 i2 = 2 (0x2)
> >>>> %y i2 = 0 (0x0)
> >>>> %j i1 = 1 (0x1)
> >>>> %and i2 = 2 (0x2)
> >>>> %lhs i2 = 1 (0x1)
> >>>> Source value: 1 (0x1)
> >>>> Target value: 0 (0x0)
> >>>>
> >>>>
> >>>> Nuno
> >>>>
> >>>> ----- Original Message ----- From: "Yi Jiang" <yjiang at apple.com>
> >>>> To: "Nuno Lopes" <nuno.lopes at ist.utl.pt>
> >>>> Cc: "LLVM Commits" <llvm-commits at cs.uiuc.edu>
> >>>> Sent: Wednesday, July 30, 2014 8:28 PM
> >>>> Subject: Re: [Patch]New InstCombine pattern for Icmp
> >>>>
> >>>>
> >>>> Hi Nuno,
> >>>>
> >>>> Thank you for pointing out this! One condition is missing: C1 and C2
> should be unsigned greater than C3.
> >>>> Here is the new patch.
> >>>>
> >>>> -Yi
> >>>>
> >>>> On Jul 29, 2014, at 11:34 AM, Nuno Lopes <nuno.lopes at ist.utl.pt>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I sent you an email the other day, but it was rejected by Apple's
> email server.  Tyring now through another email account.
> >>>>> Please take a look here:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140721/227575.html
> >>>>>
> >>>>> Nuno
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>
> >>>>> Ping...
> >>>>> On Jul 25, 2014, at 4:11 PM, Yi Jiang <yjiang at apple.com> wrote:
> >>>>>
> >>>>>> Hi Ben,
> >>>>>>
> >>>>>> Thank you for your comments! Here is a new version.  In this
> version, I still kept some "if" thinking that some operand has been already
> extracted so I would like to use it directly.  Please let me know if any
> other comments.
> >>>>>> <combinecmpV2.patch>
> >>>>>> -Yi
> >>>>>> On Jul 25, 2014, at 2:12 AM, Benjamin Kramer <benny.kra at gmail.com>
> wrote:
> >>>>>>
> >>>>>>> On Fri, Jul 25, 2014 at 1:24 AM, Yi Jiang <yjiang at apple.com>
> wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> his patch is trying to fold (icmp ult/ule (A + C1), C3) | (icmp
> ult/ule (A +
> >>>>>>>> C2), C3)  to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3)
>  .
> >>>>>>>> This transformation is legal if C1 ^ C2 is one-bit mask, in other
> word, C1
> >>>>>>>> is only one bit different from C2. In this case, we can "mask"
> that bit and
> >>>>>>>> do just one comparison.
> >>>>>>>> A typical example is:
> >>>>>>>> isALPHA(c)    ((c) >= 'A' && (c) <= 'Z') || ((c) >= 'a' && (c) <=
> 'z')
> >>>>>>>>
> >>>>>>>> Now llvm will optimize it to ((c + 191) <=25) || ((c + 159) <=25)
> >>>>>>>>
> >>>>>>>> With this patch, we can optimize it further to:
> >>>>>>>> (c & 223) + 191 <= 25
> >>>>>>>>
> >>>>>>>> The binary format of the constants are:
> >>>>>>>> 191   10111111
> >>>>>>>> 159   10011111
> >>>>>>>> 223   11011111
> >>>>>>>>
> >>>>>>>> Here is some experiment result on arm64:
> >>>>>>>> The patch shows no regression and improve spec2000 perlbmk 3.8% in
> >>>>>>>> test-suite under -O3.
> >>>>>>>> We also test the spec2006 400.perlbench with ref size input, it
> will improve
> >>>>>>>> 1% under -O3 and 1.2% under -O3+lto+pgo.
> >>>>>>>
> >>>>>>> Wow, very nice.
> >>>>>>>
> >>>>>>>> Any comments are appreciated.
> >>>>>>>
> >>>>>>> The patch could be significantly simplified with the PatternMatch
> >>>>>>> tool. Something like match(LHS, m_Add(m_OneUse(m_Value(A),
> >>>>>>> m_ConstantInt(LAddCst)... could replace your if chain.
> >>>>>>>
> >>>>>>> - Ben
> >>
> >
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140813/c1d8c57c/attachment.html>


More information about the llvm-commits mailing list