[LLVMdev] [RFC] Integer Saturation Intrinsics

Chris Lattner clattner at apple.com
Thu Jan 29 22:44:20 PST 2015


On Jan 15, 2015, at 10:26 AM, Philip Reames <listmail at philipreames.com> wrote:
> On 01/14/2015 04:16 PM, Ahmed Bougacha wrote:
>> On Thu, Jan 15, 2015 at 12:42 AM, Philip Reames
>> <listmail at philipreames.com> wrote:
>>> At a very high level, why do we need these intrinsics?
>> In short, to catch sequences you can't catch in the SelectionDAG.
>> 
>>> What is the use case?  What are typical values for N?
>> Typically, you get this from (a little overlapping) compression, DSP,
>> or pixel-handling code.
>> Off the top of my head, this occurs in paq8p in the test-suite, as
>> well as a few other tests.
>> 
>> You'd have something like:
>>     a = x + y;
>>     if (a < -128)
>>       a = -128;
>>     if (a > 127)
>>       a = 127;
>> 
>>> Have you looked at just generating the conditional IR and then pattern
>>> matching late?  What's the performance trade off involved?
>> That's a valid concern.  The original problem is, we can't catch this
>> kind of thing in the SelectionDAG, because we're limited by a single
>> basic block.  I guess we could (and I gather that's the alternative
>> you're presenting?) canonicalize the control flow to the 2icmp+2select
>> sequence, but I wasn't sure that was "workable".  Truth be told, I
>> didn't investigate this very thoroughly, as I didn't expect reluctance
>> on adding intrinsics!  I'll look into it some more: avoid adding the
>> intrinsic, keep the codegen additions as is, match the pattern in CGP
>> instead of InstCombines.
> Just to be clear, I'm not saying "don't add an intrinsic".  I am saying "make sure the costs of the intrinsic are worth it".  In particular, I think you're going to give up a lot of optimization benefit in practice by using intrinsics unless you put a *lot* of effort into making it work everywhere.

To resurrect an old thread, MHO is that adding intrinsics for these is the right way to go.  As Ahmed points out, pattern matching these can be complicated and is best done by the middle end.  There are clear optional expansion patterns for backends with various features (including the obvious generic expansion using cmove).  
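
As a rough illustration of the generic expansion, here is a minimal sketch in C of the two-compare, two-select form of the clamp from Ahmed's example, assuming the signed 8-bit range [-128, 127].  The helper name sat_add_i8 is hypothetical and is not the proposed intrinsic; a target with conditional moves could plausibly lower each ternary to a compare plus a cmov, but that depends on the backend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: add two signed 8-bit values
 * and clamp the result to [INT8_MIN, INT8_MAX]. */
static int8_t sat_add_i8(int8_t x, int8_t y) {
    /* Widen first so the addition itself cannot overflow. */
    int32_t a = (int32_t)x + (int32_t)y;

    /* Two compare+select steps, mirroring the 2icmp+2select sequence. */
    a = (a < INT8_MIN) ? INT8_MIN : a;
    a = (a > INT8_MAX) ? INT8_MAX : a;

    /* Postcondition: the value now fits in 8 bits. */
    assert(a >= INT8_MIN && a <= INT8_MAX);
    return (int8_t)a;
}
```

Written this way the whole clamp stays in a single basic block, which is the property the SelectionDAG discussion above is concerned with.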

These are the same reasons that we have an intrinsic for bswap, and it has worked out really well.  I agree that this shouldn’t be a flag on the add instruction.

-Chris