[PATCH] D67800: [InstCombine] Fold a shifty implementation of clamp (e.g., clamp255).

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Sep 21 11:24:38 PDT 2019


spatel added a comment.

In D67800#1677674 <https://reviews.llvm.org/D67800#1677674>, @huihuiz wrote:

> clamp255:                               # @clamp255
>
>   cmpl    $256, %edi              # imm = 0x100
>   movl    $255, %eax
>   cmovll  %edi, %eax
>   movzbl  %al, %eax
>   retq


This is not a clear win over what we had before. Please check the theoretical performance with llvm-mca on at least one Intel CPU and one AMD CPU. Similarly for AArch64: do we expect 'csel' to have the same latency/throughput as simple ALU ops across a variety of micro-architectures?
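
For readers without the earlier context, here is a minimal C sketch of the two source-level shapes being weighed against each other. The function names and the exact shift idiom are assumptions for illustration only; the source under review is not quoted in this message. The first form uses a mask test plus an arithmetic shift, the second is a plain compare/select form, which is the kind of code that typically lowers to cmov on x86 or csel on AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "shifty" clamp to [0, 255] (assumed form, not taken from the patch).
 * Relies on >> of a negative int32_t being an arithmetic shift, which is
 * implementation-defined in C but holds on mainstream compilers. */
static inline uint8_t clamp255_shifty(int32_t x) {
    if (x & ~0xFF)                      /* any bit outside the low 8 set: out of range */
        return (uint8_t)((~x) >> 31);   /* x < 0 gives 0; x > 255 gives -1, i.e. 255 */
    return (uint8_t)x;
}

/* Hypothetical compare/select form with the same result. */
static inline uint8_t clamp255_select(int32_t x) {
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return (uint8_t)x;
}

/* Check that both forms agree over a range covering all three cases. */
static void check_forms_agree(void) {
    for (int32_t x = -1000; x <= 1000; ++x)
        assert(clamp255_shifty(x) == clamp255_select(x));
}
```

The assembly generated for each form could then be passed to llvm-mca with different -mcpu values (for example haswell and znver1; these CPU names are only examples) to compare the reported latency and throughput.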


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67800/new/

https://reviews.llvm.org/D67800




