[PATCH] D107148: [InstCombine] Fold two-value clamp patterns

Mon Nov 1 10:28:48 PDT 2021

spatel added a comment.

In D107148#3099856 <https://reviews.llvm.org/D107148#3099856>, @qiucf wrote:

> In D107148#3081328 <https://reviews.llvm.org/D107148#3081328>, @spatel wrote:
>
>> Not with the direction - as noted earlier, we're already trying this with the intrinsics, so doing it with cmp+select just makes things consistent. There are a few implementation/test questions:
>>
>> 1. What logic diffs are there between this and 025bb5290379 <https://reviews.llvm.org/rG025bb52903792de3dd29667d42c97fdf13a00f2b> ?
>> 2. Add/adjust tests based on those diffs.
>> 3. Use m_APInt so we get splat vectors.
>>
>> @qiucf - will you continue this patch soon?
>
> I applied D98152 <https://reviews.llvm.org/D98152> and tried it. The two tests currently affected here (minmax-fold.ll, icmp-dom.ll) still pass with it, even without my patch. But for the vector case below:
>
>   define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
>   entry:
>     %cmp1 = icmp sgt <4 x i32> %num, <i32 13767, i32 13767, i32 13767, i32 13767>
>     %s1 = select <4 x i1> %cmp1, <4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>
>     %cmp2 = icmp slt <4 x i32> %s1, <i32 13768, i32 13768, i32 13768, i32 13768>
>     %r = select <4 x i1> %cmp2, <4 x i32> %s1, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
>     ret <4 x i32> %r
>   }
>   
>   ; Got
>   define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
>   entry:
>     %0 = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>)
>     %1 = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %0, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>)
>     ret <4 x i32> %1
>   }
>   
>   ; Not
>   define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
>   entry:
>     %0 = icmp slt <4 x i32> %num, <i32 13768, i32 13768, i32 13768, i32 13768>
>     %r = select <4 x i1> %0, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
>     ret <4 x i32> %r
>   }
>
> Seems not covered by that?

We may need to generalize the existing fold to handle min/max pairs with mismatched signedness (smax feeding umin in the output above), like this:
https://alive2.llvm.org/ce/z/NV_bKr
(That should translate to all 4 min/max combinations?)
Once that is in place, we will recognize the special case of a clamp between 2 values.

But that doesn't need to hold this patch up unless I'm misunderstanding the comment.
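
For reference, here is a small brute-force sketch of why the signedness mismatch is harmless in the quoted case. It is not the linked Alive2 proof and not InstCombine code; it is an independent check on 8-bit values, and all helper names (as_unsigned, two_value_select, matches) are made up for illustration. It assumes the clamp bounds are lo and lo + 1 with no signed overflow, mirroring 13767/13768 above.

```python
BITS = 8
SMIN = -(1 << (BITS - 1))        # -128
SMAX = (1 << (BITS - 1)) - 1     # 127


def as_unsigned(v):
    """Reinterpret a signed BITS-wide value as unsigned."""
    return v & ((1 << BITS) - 1)


def smax(a, b):
    return max(a, b)


def smin(a, b):
    return min(a, b)


def umin(a, b):
    # Compare the unsigned bit patterns, but return the original operand.
    return a if as_unsigned(a) <= as_unsigned(b) else b


def two_value_select(x, lo):
    # Models: %c = icmp slt x, lo+1 ; select %c, lo, lo+1
    return lo if x < lo + 1 else lo + 1


def matches(min_fn, lo):
    """True if min_fn(smax(x, lo), lo+1) equals the select form for every x."""
    assert SMIN <= lo < SMAX, "lo + 1 must not overflow"
    hi = lo + 1
    return all(
        min_fn(smax(x, lo), hi) == two_value_select(x, lo)
        for x in range(SMIN, SMAX + 1)
    )
```

With a non-negative lower bound, smax(x, lo) is itself non-negative, so the signed and unsigned comparisons agree and both matches(umin, lo) and matches(smin, lo) hold. With a negative lower bound such as -5, only the smin form holds, which is why the umin variant is expected to show up only when the value is known to be non-negative.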

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107148/new/

https://reviews.llvm.org/D107148