[PATCH] D67800: [InstCombine] Fold a shifty implementation of clamp (e.g., clamp255).
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Sep 21 11:24:38 PDT 2019
spatel added a comment.
In D67800#1677674 <https://reviews.llvm.org/D67800#1677674>, @huihuiz wrote:
> clamp255: # @clamp255
>
> cmpl $256, %edi # imm = 0x100
> movl $255, %eax
> cmovll %edi, %eax
> movzbl %al, %eax
> retq
This is not a clear win over what we had before. Please check the theoretical performance with llvm-mca on at least one Intel CPU and one AMD CPU. Similarly for AArch64: do we expect 'csel' to have the same latency/throughput as simple ALU ops across a variety of micro-architectures?
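For readers without the patch open, here is a minimal C sketch of the two forms being compared. The function names are hypothetical. The shift-based expression is one common way to write clamp255 and is assumed here for illustration; it is not necessarily the exact pattern the patch matches. The sketch assumes a 32-bit int and an arithmetic right shift on negative values (implementation-defined in C, but what clang and gcc do). It excludes INT_MIN, where negation overflows.

```c
#include <assert.h>

/* Shift-based ("shifty") clamp to [0, 255].
   Assumption: int is 32 bits and >> on a negative int is arithmetic.
   Not valid for INT_MIN, because -x overflows there. */
static int clamp255_shifty(int x) {
    /* If no bits outside the low 8 are set, x is already in range.
       Otherwise (-x) >> 31 is 0 for negative x and -1 (all ones) for
       x > 255, so masking with 255 gives 0 or 255 respectively. */
    return (x & ~255) ? ((-x) >> 31) & 255 : x;
}

/* Compare/select form, the shape a backend can lower to cmp + cmov on x86
   or cmp + csel on AArch64. */
static int clamp255_select(int x) {
    if (x < 0)
        return 0;
    return x > 255 ? 255 : x;
}

/* Compare the two forms over [lo, hi]; aborts on the first mismatch. */
static void check_equivalent(int lo, int hi) {
    for (int x = lo; x <= hi; ++x)
        assert(clamp255_shifty(x) == clamp255_select(x));
}
```

Calling check_equivalent(-100000, 100000) exercises both out-of-range sides and the full in-range interval. The two forms agree on results, so the open question above is only about which instruction sequence is cheaper on a given micro-architecture.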
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D67800/new/
https://reviews.llvm.org/D67800