[LLVMdev] [RFC] Integer Saturation Intrinsics

Thu Jan 15 02:33:53 PST 2015

A couple of questions:

1) Should this really be an intrinsic and not a flag on add?  The add instruction already allows overflow to be either undefined or defined to wrap.  Making it defined to saturate seems a natural extension.

2) How do you imagine this being used and what are the guarantees for sequences of operations with respect to optimisation?  If I do a+b-c (or +c where c is negative), and a+b would saturate, but a+(b-c) would not, then is it allowed for an optimiser to generate the second rather than the first?  If it's an intrinsic that's opaque to optimisers, then that's not a problem for correctness, but then you'll miss some potentially beneficial optimisations.
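
To make the reassociation concern concrete, here is a rough Python sketch, not LLVM code. ssat follows the min/max formula from the quoted documentation below; sat_add, sat_sub and the 8-bit width are hypothetical names and choices made up purely for illustration, and the sketch assumes a saturating add is modelled as an exact add followed by ssat.

```python
def ssat(x, n):
    """Clamp x to the signed n-bit range: min(max(x, -2^(n-1)), 2^(n-1)-1)."""
    lo = -(1 << (n - 1))
    hi = (1 << (n - 1)) - 1
    return min(max(x, lo), hi)

# Hypothetical helpers: exact arithmetic on Python ints, then saturate.
def sat_add(x, y, bits=8):
    return ssat(x + y, bits)

def sat_sub(x, y, bits=8):
    return ssat(x - y, bits)

a, b, c = 100, 100, 100

# (a + b) - c: a + b = 200 saturates to 127, then 127 - 100 = 27.
left = sat_sub(sat_add(a, b), c)

# a + (b - c): b - c = 0, then 100 + 0 = 100, nothing saturates.
right = sat_add(a, sat_sub(b, c))

print(left, right)  # prints "27 100"
```

The two groupings give different results, so an optimiser that treats a saturating add like an ordinary add and reassociates would change the program's behaviour.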

David

> On 14 Jan 2015, at 22:08, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> 
> Hi all,
> 
> The patches linked below introduce a new family of intrinsics, for
> integer saturation: @llvm.usat, and @llvm.ssat (unsigned/signed).
> Quoting the added documentation:
> 
>      %r = call i32 @llvm.ssat.i32(i32 %x, i32 %n)
> 
> is equivalent to the expression min(max(x, -2^(n-1)), 2^(n-1)-1), itself
> implementable as the following IR:
> 
>      %min_sint_n = i32 ... ; the min. signed integer of bitwidth n, -2^(n-1)
>      %max_sint_n = i32 ... ; the max. signed integer of bitwidth n, 2^(n-1)-1
>      %0 = icmp slt i32 %x, %min_sint_n
>      %1 = select i1 %0, i32 %min_sint_n, i32 %x
>      %2 = icmp sgt i32 %1, %max_sint_n
>      %r = select i1 %2, i32 %max_sint_n, i32 %1
> 
> As a starting point, here are two patches:
> - http://reviews.llvm.org/D6976  Add Integer Saturation Intrinsics.
> - http://reviews.llvm.org/D6977  [CodeGen] Add legalization for
> Integer Saturation Intrinsics.
> 
> From there, we can generate several new instructions, more efficient
> than their expanded counterparts.  Locally, I have worked on:
> - ARM: the SSAT/USAT instructions (scalar)
> - AArch64: the SQ/UQ ADD/SUB instructions (vector/scalar
> saturating arithmetic)
> - X86: PACK SS/US (vector, saturate+truncate)
> - X86: PADD/SUB S/US (vector, saturating arithmetic)
> 
> Anyway, let's first agree on the intrinsics, so that further
> development is done on trunk.
> 
> Thanks!
> -Ahmed