[LLVMdev] RFC: Integer saturation intrinsics

Sun Jun 19 11:34:20 PDT 2011

On 06/19/2011 11:20 AM, Eli Friedman wrote:
> On Sun, Jun 19, 2011 at 10:09 AM, Evan Cheng<evan.cheng at apple.com>  wrote:
>>
>> On Jun 17, 2011, at 5:49 PM, Eli Friedman wrote:
>>
>>> On Fri, Jun 17, 2011 at 4:50 PM, Eli Friedman<eli.friedman at gmail.com>  wrote:
>>>>>>> The plan is to form calls to these intrinsics in InstCombine. Legalizer can expand these intrinsics if they are not legal. The expansion should be fairly straight forward and produce code that is at least as good as what LLVM is currently generating for these code sequence.
>>>>>>>
>>>>>>> Comments?
>>>>>>
>>>>>> Is there some reason why pattern-matching this in an ARM-specific
>>>>>> DAGCombine doesn't work?
>>>>>
>>>>> It's not possible to look beyond a single BB at isel time.
>>>>
>>>> Anything that we can match to ssat should be of the form max(min(x,
>>>> SATMAX), SATMIN) (where max and min are icmp+select pairs).  If the
>>>> min and max aren't in the same block, and we don't have an IR
>>>> transformation to put them in the same block, we should fix that
>>>> rather than introducing an instrinsic for this special case, I
>>>> think...
>>>
>>> Okay, thinking about it a bit more, I don't think this is practical.
>>
>> I think this will solve this particular problem but I'm not certain it's the right thing to do in general. In general, llvm already has the tendency to form selects too aggressively. What's your concern?
>
> I wasn't really considering the issues with converting branches into
> selects, actually... I didn't realize that's what you were running
> into. :)  I was thinking more in terms of instruction sinking, which
> we really don't have an aggressive IR pass for at the moment.  But
> you're right, converting chains of branches like that into selects
> without considering what we'll end up with is a bit suspicious.

I ran into this from the other angle recently. The middle-end isn't 
being aggressive enough in folding branches into selects which is 
blocking desired optimization.

I was thinking we may want to add a backend pass that decides what to do 
about long chains of computations leading into select instructions. The 
backend would have a better idea of how expensive a branch is compared 
to the instructions, or whether the instructions could be predicated, etc.

Nick

>>> I'm still skeptical that adding platform-independent intrinsics for
>>> arbitrary ARM instructions is a good idea simply because we don't have
>>> the infrastructure to handle them otherwise.  It wouldn't be
>>> especially hard to allow target-specific transforms on IR...
>>
>> It's easy to expand these intrinsics in legalizer time. Also llvm is already generating inferior code for these patterns for ARM and x86.
>
> We also have to consider whether forming sset will require substantial
> changes to other passes; thinking about it, it won't be too bad here.
>
> I still don't see the point of a usat intrinsic; we ought to already
> handle constructs that should form into usat effectively.
>
> -Eli
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>