[PATCH] D74484: [AggressiveInstCombine] Add support for ICmp instr that feeds a select intsr's condition operand.

Sun Apr 19 01:34:53 PDT 2020

aymanmus added a comment.

In D74484#1980606 <https://reviews.llvm.org/D74484#1980606>, @spatel wrote:

> Sorry, I did not explain my previous comment as well as possible. Let's take this minimal example:
>
>   define i1 @f(i8 %a) {
>     %conv = sext i8 %a to i32
>     %cmp = icmp slt i32 %conv, 109
>     ret i1 %cmp
>   }
>   

Sorry, but I think the motivation of the patch is not yet clear.
The whole optimization (we're trying to improve) should be triggered by a truncate instruction that dominates a certain sequence.
This trunc can lead to shrinking a whole sequence of instructions to a new type and by that leave no need for the trunc instruction itself.
The example you put here has no trunc at all so it's not really relevant.

> And even with an extra use of the sext, we would get the narrow icmp/select:
> 
>   define i16 @cmp_select_sext_const(i8 %a) {
>     %conv = sext i8 %a to i32
>     call void @use(i32 %conv)
>     %1 = icmp sgt i8 %a, 109
>     %narrow = select i1 %1, i8 %a, i8 109
>     %conv4 = zext i8 %narrow to i16
>     ret i16 %conv4
>   }

Same here.

> In fact, even when the select also has an extra use (you can reproduce this on current LLVM: https://godbolt.org/z/P6Kws_ ):
> 
>   define i16 @cmp_select_sext_const(i8 %a) {
>     %conv = sext i8 %a to i32
>     call void @use(i32 %conv)
>     %cmp = icmp slt i8 %a, 109
>     %cond = select i1 %cmp, i32 109, i32 %conv
>     call void @use(i32 %cond)
>     %conv4 = trunc i32 %cond to i16
>     ret i16 %conv4
>   }
>    

This example indeed has a truncate instruction, but as we can see, the optimization applied to it did not originate from the truncate.
As a matter of fact, it applied despite of it. We can see that shrinking the select's type actually increased the instruction count.
So this is not a relevant example as well.
What we are trying to solve in this patch, are cases where a full sequence which is terminated with a truncate instruction can be shrunk to a new type, that will subsequently remove the need of a truncate instruction as a terminator.
A perfect example would be:

  define dso_local i16 @cmp_select_sext(i8 %a, i8 %b) {
  entry:
    %conv = sext i8 %a to i32
    %conv2 = sext i8 %b to i32
    %cmp = icmp slt i32 %conv, %conv2
    %cond = select i1 %cmp, i32 %conv2, i32 %conv
    %conv4 = trunc i32 %cond to i16
    ret i16 %conv4
  }

which will not be changed with inst-combine, but with the new optimization, the whole sequence can be shrunk to i16 type (according to the destination type of the truncate instruction), and the truncate can be thrown away.

> Also note that we have floated the idea again of adding min/max intrinsics to IR in D68408 <https://reviews.llvm.org/D68408>. Obviously, we need to post that as a proposal on llvm-dev for wider review...but if we had those intrinsics in IR, it probably affects at least some of the motivation for this patch, so I'd be interested in your thoughts on that.

Actually, this might help in some cases where the icmp+select implement a max/min semantics. In that case simply add the new nodes' opcodes (min/max) to the list of opcodes we deal with in the optimization.
But, we still need to handle the general case of icmp+select as well.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74484/new/

https://reviews.llvm.org/D74484