[PATCH] D107148: [InstCombine] Fold two-value clamp patterns
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 1 10:28:48 PDT 2021
spatel added a comment.
In D107148#3099856 <https://reviews.llvm.org/D107148#3099856>, @qiucf wrote:
> In D107148#3081328 <https://reviews.llvm.org/D107148#3081328>, @spatel wrote:
>
>> Not with the direction - as noted earlier, we're already trying this with the intrinsics, so doing it with cmp+select just makes things consistent. There are a few implementation/test questions:
>>
>> 1. What logic diffs are there between this and 025bb5290379 <https://reviews.llvm.org/rG025bb52903792de3dd29667d42c97fdf13a00f2b> ?
>> 2. Add/adjust tests based on those diffs.
>> 3. Use m_APInt so we get splat vectors.
>>
>> @qiucf - will you continue this patch soon?
>
> I applied D98152 <https://reviews.llvm.org/D98152> and tried it. The two currently affected tests (minmax-fold.ll, icmp-dom.ll) still pass with it applied and without my patch. But for the vector case below:
>
> define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
> entry:
>   %cmp1 = icmp sgt <4 x i32> %num, <i32 13767, i32 13767, i32 13767, i32 13767>
>   %s1 = select <4 x i1> %cmp1, <4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>
>   %cmp2 = icmp slt <4 x i32> %s1, <i32 13768, i32 13768, i32 13768, i32 13768>
>   %r = select <4 x i1> %cmp2, <4 x i32> %s1, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
>   ret <4 x i32> %r
> }
>
> ; Got
> define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
> entry:
>   %0 = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %num, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>)
>   %1 = call <4 x i32> @llvm.umin.v4i32(<4 x i32> %0, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>)
>   ret <4 x i32> %1
> }
>
> ; Not
> define <4 x i32> @twoway_clamp_gt(<4 x i32> %num) {
> entry:
>   %0 = icmp slt <4 x i32> %num, <i32 13768, i32 13768, i32 13768, i32 13768>
>   %r = select <4 x i1> %0, <4 x i32> <i32 13767, i32 13767, i32 13767, i32 13767>, <4 x i32> <i32 13768, i32 13768, i32 13768, i32 13768>
>   ret <4 x i32> %r
> }
>
> Seems not covered by that?
We may need some generalization for mismatched signedness of min/max like this:
https://alive2.llvm.org/ce/z/NV_bKr
(That should translate to all 4 combinations of min/max?)
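As a rough sanity check of that idea (not a substitute for the Alive2 proof), here is a small brute-force sketch in Python over 6-bit values. It assumes the generalization is roughly "when both operands are known non-negative as signed values, unsigned and signed min/max agree", which would let the umin in the output above be treated as smin. The helper names (to_signed, smin, umin, and so on) are made up for illustration and are not LLVM APIs.

```python
BITS = 6                      # small width so exhaustive checking is fast
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret a BITS-wide bit pattern as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def smax(a, b):
    return a if to_signed(a) > to_signed(b) else b


def smin(a, b):
    return a if to_signed(a) < to_signed(b) else b


def umax(a, b):
    return max(a & MASK, b & MASK)


def umin(a, b):
    return min(a & MASK, b & MASK)


def signedness_agrees():
    """For all pairs of non-negative (signed) patterns, u-ops equal s-ops."""
    nonneg = range(1 << (BITS - 1))
    return all(
        umin(a, b) == smin(a, b) and umax(a, b) == smax(a, b)
        for a in nonneg
        for b in nonneg
    )


def mismatched_clamp_agrees():
    """umin(smax(x, c1), c2) == smin(smax(x, c1), c2) for non-negative c1, c2."""
    nonneg = range(1 << (BITS - 1))
    for c1 in nonneg:
        for c2 in nonneg:
            for x in range(1 << BITS):      # every bit pattern, incl. negatives
                inner = smax(x, c1)         # known non-negative because c1 >= 0
                if umin(inner, c2) != smin(inner, c2):
                    return False
    return True
```

The non-negativity precondition matters: for the all-ones pattern (signed -1), umin and smin against 1 give different answers, so the rewrite is only safe when the sign bit is known clear.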
After that, we will recognize the special case for clamp of 2 values.
But that doesn't need to hold this patch up unless I'm misunderstanding the comment.
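For reference, the two-value clamp fold from the quoted test can be checked the same way. This is only a sketch under stated assumptions: it uses plain Python integers for a signed range of the given width, it requires that the upper bound lo + 1 does not overflow, and the names clamp_minmax, clamp_select, and fold_holds are hypothetical.

```python
def clamp_minmax(x, lo):
    """smin(smax(x, lo), lo + 1): clamp x to two adjacent values."""
    return min(max(x, lo), lo + 1)


def clamp_select(x, lo):
    """Folded form: one signed compare selecting between the two constants."""
    return lo if x < lo + 1 else lo + 1


def fold_holds(bits=8):
    """Exhaustively compare both forms over a signed range of the given width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    for lo in range(lowest, highest):       # exclude highest so lo + 1 fits
        for x in range(lowest, highest + 1):
            if clamp_minmax(x, lo) != clamp_select(x, lo):
                return False
    return True
```

With lo = 13767 this matches the "Not" output in the quoted example: any input below 13768 yields 13767, and everything else yields 13768.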
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107148/new/
https://reviews.llvm.org/D107148