[PATCH] D67799: [InstCombine] Fold a shifty implementation of clamp0.
Huihui Zhang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 19 23:34:19 PDT 2019
huihuiz added a comment.
For example, this fold enables vmax generation for the ARM target.
test-clamp0.c:
static __inline int clamp0(int v) {
  return ((-(v) >> 31) & (v));
}

void foo(const unsigned char* src0,
         const unsigned char* src1,
         unsigned char* dst,
         int width) {
  int i;
  for (i = 0; i < width; ++i) {
    const int b = src0[0];
    const int b_sub = src1[0];
    dst[0] = clamp0(b - b_sub);
    src0++;
    src1++;
    dst++;
  }
}
Run: clang -cc1 -triple armv8.1a-linux-gnu -target-abi apcs-gnu -target-feature +neon -vectorize-loops -vectorize-slp -O2 -S -o - test-clamp0.c
With this patch, the vectorized loop is lowered to "vmax".
Before this patch, the compiler generated "vneg + vshr + vand" instead.
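The reason a single "vmax" is enough is that the shift-and-mask expression in clamp0 computes max(v, 0): for v > 0, -(v) >> 31 is all ones and the mask keeps v; for v <= 0 it is zero and the mask clears the result. Below is a minimal sketch that checks this equivalence. It assumes a 32-bit int and an arithmetic right shift on negative values (implementation-defined in C, but what clang does on this target), and it excludes INT_MIN, where -(v) overflows. The names clamp0_shift, clamp0_max and check_clamp0_range are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <limits.h>

/* Shift-and-mask form, as written in test-clamp0.c.
   Assumes 32-bit int and arithmetic right shift of negative values. */
static int clamp0_shift(int v) {
  return ((-(v) >> 31) & (v));
}

/* Reference form: plain max(v, 0). */
static int clamp0_max(int v) {
  return v > 0 ? v : 0;
}

/* Asserts that both forms agree for every v in [lo, hi].
   Callers must keep lo > INT_MIN, since -(INT_MIN) overflows. */
static void check_clamp0_range(int lo, int hi) {
  long long v; /* wider counter so the loop terminates when hi == INT_MAX */
  for (v = lo; v <= hi; ++v) {
    assert(clamp0_shift((int)v) == clamp0_max((int)v));
  }
}
```

In foo, the argument is the difference of two unsigned char values, so it always lies in [-255, 255]; calling check_clamp0_range(-255, 255) covers every input the loop can produce.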
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D67799/new/
https://reviews.llvm.org/D67799