[PATCH] D67799: [InstCombine] Fold a shifty implementation of clamp0.

Thu Sep 19 23:34:19 PDT 2019

huihuiz added a comment.

For example, this fold enables "vmax" generation for the ARM target.

test.c

  static __inline int clamp0(int v) {
    return ((-(v) >> 31) & (v));
  }

  void foo(const unsigned char* src0,
           const unsigned char* src1,
           unsigned char* dst,
           int width) {
    int i;
    for (i = 0; i < width; ++i) {
      const int b = src0[0];
      const int b_sub = src1[0];
      dst[0] = clamp0(b - b_sub);
      src0 ++;
      src1 ++;
      dst ++;
    }
  }

Run:

  clang -cc1 -triple armv8.1a-linux-gnu -target-abi apcs-gnu -target-feature +neon -vectorize-loops -vectorize-slp -O2 -S test.c -o -

With this patch, the vectorized loop uses "vmax".

Without this patch, the loop is lowered to "vneg + vshr + vand" instead.
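
For reference, a small sketch of the equivalence the fold relies on: the shifty clamp0 from test.c returns the same value as a signed max(v, 0). The names clamp0_max and check_clamp0_range are hypothetical helpers added here for illustration and are not part of the patch. The sketch assumes a 32-bit int and an arithmetic right shift of negative values.

```c
#include <assert.h>

/* Shifty form from test.c. Assumes a 32-bit int and an arithmetic right
   shift of negative values (implementation-defined in C, but what clang
   and gcc do). v == INT_MIN is excluded because -v would overflow. */
static int clamp0(int v) {
  return ((-(v) >> 31) & (v));
}

/* Hypothetical reference form: the signed max(v, 0) that "vmax" computes. */
static int clamp0_max(int v) {
  return v > 0 ? v : 0;
}

/* Hypothetical helper: asserts that both forms agree for every v in
   [lo, hi] and returns the number of values checked. */
static int check_clamp0_range(int lo, int hi) {
  int n = 0;
  for (int v = lo; v <= hi; ++v) {
    assert(clamp0(v) == clamp0_max(v));
    ++n;
  }
  return n;
}
```

In foo, b - b_sub always lies in [-255, 255], so check_clamp0_range(-255, 255) covers every value the loop can pass to clamp0.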

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67799/new/

https://reviews.llvm.org/D67799