[PATCH] D25987: [X86] New pattern to generate PSUBUS from SELECT

Fri May 12 01:26:04 PDT 2017

yulia_koval added a comment.

In https://reviews.llvm.org/D25987#743482, @spatel wrote:

> But do you agree that it is reasonable to write this function in C as:
>
>   void goo(unsigned short *p, int max, int n) {
>     int i;
>     unsigned m;
>     for (i = 0; i < n; i++) {
>       m = *--p;
>       unsigned umax = m > max ? m : max;
>       *p = (unsigned short)(umax - max);
>     }
>   }

This function's IR is

  %10 = zext <8 x i16> %reverse to <8 x i32>
  %11 = icmp ult <8 x i32> %broadcast.splat18, %10
  %12 = select <8 x i1> %11, <8 x i32> %10, <8 x i32> %broadcast.splat18
  %13 = sub <8 x i32> %12, %broadcast.splat18
  %14 = trunc <8 x i32> %13 to <8 x i16>
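
To make the connection to psubus concrete, here is a minimal C sketch of what one lane of this sequence computes, checked against a 16-bit unsigned saturating subtract (the per-lane operation of PSUBUSW). The names lane_ir, usubsat16 and count_mismatches are hypothetical helpers for illustration only, not anything in LLVM. It assumes max is treated as an unsigned 32-bit value, as the icmp ult in the IR suggests, so max has to be clamped to 16 bits before the two forms can be compared.

```c
#include <stdint.h>

/* Per-lane semantics of the IR above: zext, unsigned max, sub, trunc.
   lane_ir is a hypothetical name used only for this sketch. */
static uint16_t lane_ir(uint16_t x, uint32_t max)
{
    uint32_t wide = x;                        /* zext i16 -> i32   */
    uint32_t umax = max < wide ? wide : max;  /* icmp ult + select */
    return (uint16_t)(umax - max);            /* sub + trunc       */
}

/* 16-bit unsigned saturating subtract: a - b, clamped at 0. */
static uint16_t usubsat16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Counts the 16-bit inputs x for which the two forms disagree.
   Assumption: max is clamped to UINT16_MAX before the saturating subtract. */
static unsigned count_mismatches(uint32_t max)
{
    uint16_t clamped = max > UINT16_MAX ? UINT16_MAX : (uint16_t)max;
    unsigned bad = 0;
    for (uint32_t x = 0; x <= UINT16_MAX; x++)
        if (lane_ir((uint16_t)x, max) != usubsat16((uint16_t)x, clamped))
            bad++;
    return bad;
}
```

Calling count_mismatches with any 32-bit max should return 0 under these assumptions, which is why the trunc(sub(select())) form looks like a reasonable starting point for matching psubus.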

I agree that this IR looks better as the canonical form than my example, where the trunc sits in between the pattern's instructions. What do you think about defining trunc(sub(select())) as canonical for this pattern? Then we can canonicalize my initial pattern using the InstCombine patch I attached (https://reviews.llvm.org/D33118); it looks like the .ll code generated for both of them would then be the same. After this transformation, the initial pattern can be transformed into a 32-bit sub(max). I haven't yet investigated how to transform that into psubus, but at least max is more profitable than the instructions currently generated.

What do you think about this idea?

https://reviews.llvm.org/D25987