[LLVMdev] [RFC] Integer Saturation Intrinsics
David Majnemer
david.majnemer at gmail.com
Thu Jan 15 02:51:55 PST 2015
On Thu, Jan 15, 2015 at 2:33 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> A couple of questions:
>
> 1) Should this really be an intrinsic and not a flag on add? The add
> instruction already allows overflow to be either undefined or defined to
> wrap. Making it defined to saturate seems a natural extension.
>
I don't think this should be a flag on add. Flags are designed so that the
middle-end can be ignorant of them without anything bad happening: it is
always safe to ignore or drop a flag when doing so is convenient (for a
concrete example, take a look at reassociate).
The saturating nature of this operation, by contrast, does not seem like
something that can be safely ignored.
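
To make the difference concrete, here is a rough Python model (not LLVM
code). It assumes an 8-bit signed type to keep the search space small, uses
None as a stand-in for the undefined result of an overflowing add that was
promised not to overflow, and all function names are made up for
illustration. It checks whether replacing a flagged add with a plain
wrapping add can ever change a defined result.

```python
BITS = 8                        # assumed width, chosen only to keep the search small
SMIN = -(1 << (BITS - 1))       # -128
SMAX = (1 << (BITS - 1)) - 1    # 127

def wrap_add(a, b):
    """Plain add: the result wraps modulo 2**BITS (two's complement)."""
    s = (a + b) & ((1 << BITS) - 1)
    return s - (1 << BITS) if s > SMAX else s

def nsw_add(a, b):
    """Add with a no-signed-overflow flag; None models the undefined case."""
    s = a + b
    return s if SMIN <= s <= SMAX else None

def sat_add(a, b):
    """Hypothetical 'saturating flag' add: clamp instead of wrapping."""
    return max(SMIN, min(SMAX, a + b))

def dropping_is_safe(flagged_add):
    """True if a plain wrapping add matches every defined result of flagged_add."""
    values = range(SMIN, SMAX + 1)
    for a in values:
        for b in values:
            expected = flagged_add(a, b)
            if expected is not None and expected != wrap_add(a, b):
                return False
    return True

print(dropping_is_safe(nsw_add))  # True: forgetting the flag never changes a defined result
print(dropping_is_safe(sat_add))  # False: e.g. 100 + 100 saturates to 127 but wraps to -56
```

In this model a pass that forgets the overflow flag still produces a valid
result, while a pass that forgets a saturation flag silently computes
something else.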
>
> 2) How do you imagine this being used and what are the guarantees for
> sequences of operations with respect to optimisation? If I do a+b-c (or +c
> where c is negative), and a+b would saturate, but a+(b-c) would not, then
> is it allowed for an optimiser to generate the second rather than the
> first? If it's an intrinsic that's opaque to optimisers, then that's not a
> problem for correctness, but then you'll miss some potentially beneficial
> optimisations.
>
> David
>
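
For reference, the a+b-c case from question 2 can be worked through with a
small Python model. This is only a sketch: it assumes 8-bit signed
saturating add and subtract, and the helper names are hypothetical rather
than anything in the proposed patches.

```python
def sat8(v):
    """Clamp an integer to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, v))

def sat_add(a, b):
    """Assumed 8-bit signed saturating add."""
    return sat8(a + b)

def sat_sub(a, b):
    """Assumed 8-bit signed saturating subtract."""
    return sat8(a - b)

a, b, c = 100, 100, 100

# (a + b) - c: the inner add saturates to 127 before the subtract.
left_to_right = sat_sub(sat_add(a, b), c)

# a + (b - c): no intermediate result leaves the 8-bit range.
reassociated = sat_add(a, sat_sub(b, c))

print(left_to_right)  # 27
print(reassociated)   # 100
```

The two orders give different answers, so under these assumptions
saturating arithmetic is not associative and the reassociated form is not a
drop-in replacement for the original.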
> > On 14 Jan 2015, at 22:08, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
> >
> > Hi all,
> >
> > The patches linked below introduce a new family of intrinsics, for
> > integer saturation: @llvm.usat, and @llvm.ssat (unsigned/signed).
> > Quoting the added documentation:
> >
> > %r = call i32 @llvm.ssat.i32(i32 %x, i32 %n)
> >
> > is equivalent to the expression min(max(x, -2^(n-1)), 2^(n-1)-1), itself
> > implementable as the following IR:
> >
> > %min_sint_n = i32 ... ; the min. signed integer of bitwidth n, -2^(n-1)
> > %max_sint_n = i32 ... ; the max. signed integer of bitwidth n, 2^(n-1)-1
> > %0 = icmp slt i32 %x, %min_sint_n
> > %1 = select i1 %0, i32 %min_sint_n, i32 %x
> > %2 = icmp sgt i32 %1, %max_sint_n
> > %r = select i1 %2, i32 %max_sint_n, i32 %1
> >
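
As a sanity check on the quoted definition, the min/max expression and the
icmp/select expansion can be modelled side by side in Python. This is a
sketch under assumptions: x is treated as a signed 32-bit value, n as a bit
width between 1 and 32, and the function names are invented for this
example.

```python
def ssat_minmax(x, n):
    """min(max(x, -2^(n-1)), 2^(n-1)-1), as in the quoted documentation."""
    lo = -(1 << (n - 1))
    hi = (1 << (n - 1)) - 1
    return min(max(x, lo), hi)

def ssat_select(x, n):
    """Mirror of the quoted icmp/select sequence, one step per IR instruction."""
    min_sint_n = -(1 << (n - 1))
    max_sint_n = (1 << (n - 1)) - 1
    c0 = x < min_sint_n                  # %0 = icmp slt i32 %x, %min_sint_n
    v1 = min_sint_n if c0 else x         # %1 = select i1 %0, i32 %min_sint_n, i32 %x
    c2 = v1 > max_sint_n                 # %2 = icmp sgt i32 %1, %max_sint_n
    return max_sint_n if c2 else v1      # %r = select i1 %2, i32 %max_sint_n, i32 %1

print(ssat_minmax(300, 8))    # 127
print(ssat_select(-300, 8))   # -128
print(ssat_select(5, 8))      # 5
```

Both functions clamp x to the range of an n-bit signed integer and agree
on the sample inputs above.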
> >
> > As a starting point, here are two patches:
> > - http://reviews.llvm.org/D6976 Add Integer Saturation Intrinsics.
> > - http://reviews.llvm.org/D6977 [CodeGen] Add legalization for
> > Integer Saturation Intrinsics.
> >
> > From there, we can generate several new instructions, more efficient
> > than their expanded counterparts. Locally, I have worked on:
> > - ARM: the SSAT/USAT instructions (scalar)
> > - AArch64: the SQ/UQ ADD/SUB AArch64 instructions (vector/scalar
> > saturating arithmetic)
> > - X86: PACK SS/US (vector, saturate+truncate)
> > - X86: PADD/SUB S/US (vector, saturating arithmetic)
> >
> > Anyway, let's first agree on the intrinsics, so that further
> > development is done on trunk.
> >
> > Thanks!
> > -Ahmed
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev