[llvm-dev] [IR canonicalization] 6 ways to choose {-1,0,1}

Tue Jul 11 16:44:14 PDT 2017

On Tue, Jul 11, 2017 at 2:08 PM, David Majnemer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> IMO, canonicalization is driven by a few priorities:
> - How easy is it for the rest of the optimizer to reason about the IR
> - How easy is it to lower the IR to "The Right Thing (TM)"
>
> I think it is hard to know which is best without resorting to a
> case-by-case analysis, mostly because of stuff like ComputeKnownBits.
>
> Outside a few special cases, ComputeKnownBits isn't very smart when it
> comes to select: it just takes the intersection of the two possibilities.
> This can lose quite a bit of information.
>

This is essentially unfixable without a real bit range analysis.

>
> I'd feel a lot better about always canonicalizing to select if the
> backends are strengthened to lower those selects to something good (without
> relying on CMOV, etc.) and if ComputeKnownBits dealt with select better.
>

The latter is going to take real design and work :)

>
> On Tue, Jul 11, 2017 at 2:00 PM, Nuno Lopes via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> FWIW: it would be nice to start canonicalizing more on select.  Some of
>> the inscombine rewrites we have at the moment canonicalize to arithmetic
>> instead, and it is very hard to justify the correctness of these (to say
>> the least).
>> So canonicalizing more stuff on select would give us critical mass to
>> remove all the remaining canonicalizations to arithmetic.
>>
>> Thanks,
>> Nuno
>>
>>
>> -----Original Message----- From: Sanjay Patel via llvm-dev
>> Sent: Saturday, July 1, 2017 7:45 PM
>> To: llvm-dev
>> Subject: [llvm-dev] [IR canonicalization] 6 ways to choose {-1,0,1}
>>
>>
>>
>> I'm looking at the output of memcmp() expansion (D34904), and I noticed
>> that there are many ways to produce the common positive/zero/negative
>> comparison result in IR.
>>
>> For the following 6 functionally equivalent C source functions, we
>> produce 6 different versions of IR which leads to 6 different asm outputs
>> for x86. Which of these should we choose as canonical IR form?
>>
>> 1. Two selects
>> int zero_negone_one(int x, int y) {
>>  if (x == y) return 0;
>>  if (x < y) return -1;
>>  return 1;
>> }
>>
>> define i32 @zero_negone_one(i32, i32) {
>>  %3 = icmp eq i32 %0, %1
>>  %4 = icmp slt i32 %0, %1
>>  %5 = select i1 %4, i32 -1, i32 1
>>  %6 = select i1 %3, i32 0, i32 %5
>>  ret i32 %6
>> }
>>
>>
>> 2. Two selects, but different
>> int zero_one_negone(int x, int y) {
>>  if (x == y) return 0;
>>  if (x > y) return 1;
>>  return -1;
>> }
>>
>> define i32 @zero_one_negone(i32, i32) {
>>  %3 = icmp eq i32 %0, %1
>>  %4 = icmp sgt i32 %0, %1
>>  %5 = select i1 %4, i32 1, i32 -1
>>  %6 = select i1 %3, i32 0, i32 %5
>>  ret i32 %6
>> }
>>
>>
>> 3. Select and zext
>> int negone_one_zero(int x, int y) {
>>  if (x < y) return -1;
>>  if (x > y) return 1;
>>  return 0;
>> }
>>
>> define i32 @negone_one_zero(i32, i32)  {
>>  %3 = icmp slt i32 %0, %1
>>  %4 = icmp sgt i32 %0, %1
>>  %5 = zext i1 %4 to i32
>>  %6 = select i1 %3, i32 -1, i32 %5
>>  ret i32 %6
>> }
>>
>>
>> 4. Select and sext
>> int negone_zero_one(int x, int y) {
>>  int sel = x < y ? -1 : 0;
>>  if (x > y) return 1;
>>  return sel;
>> }
>>
>> define i32 @negone_zero_one(i32, i32) {
>>  %3 = icmp sgt i32 %0, %1
>>  %4 = icmp slt i32 %0, %1
>>  %5 = sext i1 %4 to i32
>>  %6 = select i1 %3, i32 1, i32 %5
>>  ret i32 %6
>> }
>>
>>
>> 5. Subs and shifts
>> int neg101_sub_shifty(int x, int y) {
>>  int r = (x - y) >> 31;
>>  r += (unsigned)(y - x) >> 31;
>>  return r;
>> }
>>
>> define i32 @neg101_sub_shifty(i32, i32) {
>>  %3 = sub nsw i32 %0, %1
>>  %4 = ashr i32 %3, 31
>>  %5 = sub nsw i32 %1, %0
>>  %6 = lshr i32 %5, 31
>>  %7 = add nsw i32 %4, %6
>>  ret i32 %7
>> }
>>
>>
>> 6. Zexts and sub
>> int neg101_cmp_sub(int x, int y) {
>>  return (x>y) - (x<y);
>> }
>>
>> define i32 @neg101_cmp_sub(i32, i32) {
>>  %3 = icmp sgt i32 %0, %1
>>  %4 = zext i1 %3 to i32
>>  %5 = icmp slt i32 %0, %1
>>  %6 = zext i1 %5 to i32
>>  %7 = sub nsw i32 %4, %6
>>  ret i32 %7
>> }
>>
>>
>> https://godbolt.org/g/UnM9H7
>>
>> Show these are logically equivalent:
>> http://rise4fun.com/Alive/b4D
>>
>> Recent patch related to this pattern:
>> https://reviews.llvm.org/D34278
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170711/691144b3/attachment.html>