[llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

Maxim Kazantsev via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 5 21:33:41 PDT 2018


Hi Eli,

It seems this patch has introduced a number of problems for cases which were not in unit tests. I've reverted it as https://reviews.llvm.org/rL336410 till we investigate them. If you have tests that demonstrate problems with this patch, please commit these tests as NFC with checks on current behavior. We will need them for re-enabling.

Thanks,
Max

-----Original Message-----
From: Friedman, Eli [mailto:efriedma at codeaurora.org] 
Sent: Friday, July 6, 2018 9:45 AM
To: Maxim Kazantsev <max.kazantsev at azul.com>; llvm-commits at lists.llvm.org
Subject: Re: [llvm] r336172 - [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done

On 7/2/2018 11:23 PM, Max Kazantsev via llvm-commits wrote:
> Author: mkazantsev
> Date: Mon Jul  2 23:23:57 2018
> New Revision: 336172
>
> URL: http://llvm.org/viewvc/llvm-project?rev=336172&view=rev
> Log:
> [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are 
> done
>
> This patch changes order of transform in InstCombineCompares to avoid 
> performing transforms based on ranges which produce complex bit 
> arithmetics before more simple things (like folding with constants) 
> are done. See PR37636 for the motivating example.
>
> Differential Revision: https://reviews.llvm.org/D48584 Reviewed By: 
> spatel, lebedev.ri

This patch seems to be leading to worse generated code in certain cases because it prefers ugt over eq in cases where they're interchangeable. 
I'm seeing diffs like the following:

         %40 = lshr i8 %39, 1
     >   %cmp8.i = icmp ugt i8 %39, 1, !dbg !53520
     >   %narrow.i = select i1 %cmp8.i, i8 %40, i8 1, !dbg !53521
     <   %cmp8.i = icmp eq i8 %40, 0, !dbg !53520
     <   %narrow.i = select i1 %cmp8.i, i8 1, i8 %40, !dbg !53521

On Thumb1, leads to an extra "cmp", instead of using flags produced by lsrs.

     >   %bf.cast1182.i = and i88 %bf.load115.i, 71776119061217280, !dbg
!53502
     >   %cmp119.i = icmp ugt i88 %bf.cast1182.i, 281474976710656, !dbg
!53502
     <   %bf.cast1182.i = and i88 %bf.load115.i, 71494644084506624, !dbg
!53502
     <   %cmp119.i = icmp eq i88 %bf.cast1182.i, 0, !dbg !53502
         %39 = load i64, i64* %11, align 8, !dbg !53503
     >   br i1 %cmp119.i, label %if.then121.i, label %if.end132.i, !dbg
!53504
     <   br i1 %cmp119.i, label %if.end132.i, label %if.then121.i, !dbg
!53504

On Thumb1, leads to materializing multiple expensive immediates, instead of a couple simple shifts.

Alive proof the two forms are equivalent: https://rise4fun.com/Alive/cEq

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


More information about the llvm-commits mailing list