RFC: min/max/abs IR intrinsics

Philip Reames listmail at philipreames.com
Sun Apr 26 12:04:22 PDT 2015

On 04/23/2015 07:42 AM, James Molloy wrote:
> Hi all,
> I've just started again on trying to solve the problem of getting 
> decent code generation for min, max and abs idioms as used by the 
> programmer and as emitted by the loop vectorizer.
> I've been looking at doing this as a target DAGCombine, but actually I 
> think:
>   1. it's incredibly complex to do at that stage and it limits all the 
> work I do to just one target.
>   2. It's also much more difficult to test.
>   3. The loop and SLP vectorizers still don't have a cost model for 
> them - they're just seen as compare+selects.
I don't see the challenge here.  Matching a compare+select as a min/max 
for the purpose of the cost model under a target specific hook seems 
quite straightforward.  What am I missing?
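
As a rough illustration of the kind of matching meant here, the sketch below classifies a compare+select pair as a min/max idiom and lets a target flag decide its cost. Everything in it is hypothetical: `SelectOfCompare`, `classifyMinMax`, `selectCost` and the cost values 1 and 2 are stand-ins invented for this example, not LLVM types, APIs or real target costs. It assumes a C++17 compiler for `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for IR concepts; none of these are LLVM types.
enum class Pred { SLT, SGT, ULT, UGT };
enum class MinMaxKind { SMin, SMax, UMin, UMax };

// Models:  %c = icmp <pred> lhs, rhs
//          %r = select %c, onTrue, onFalse
// Operands are identified by small integer ids for the sake of the sketch.
struct SelectOfCompare {
  Pred pred;
  int lhs, rhs;
  int onTrue, onFalse;
};

// Returns the min/max flavour if the select chooses between the two compared
// operands, and std::nullopt for any other select.
std::optional<MinMaxKind> classifyMinMax(const SelectOfCompare &s) {
  if (s.lhs == s.rhs)
    return std::nullopt; // degenerate compare, nothing to classify
  const bool straight = s.onTrue == s.lhs && s.onFalse == s.rhs;
  const bool swapped = s.onTrue == s.rhs && s.onFalse == s.lhs;
  if (!straight && !swapped)
    return std::nullopt;
  switch (s.pred) {
  case Pred::SLT: return straight ? MinMaxKind::SMin : MinMaxKind::SMax;
  case Pred::SGT: return straight ? MinMaxKind::SMax : MinMaxKind::SMin;
  case Pred::ULT: return straight ? MinMaxKind::UMin : MinMaxKind::UMax;
  case Pred::UGT: return straight ? MinMaxKind::UMax : MinMaxKind::UMin;
  }
  return std::nullopt;
}

// Hypothetical target hook: a matched idiom costs one operation on a target
// with native min/max, otherwise the pair is costed as compare plus select.
int selectCost(const SelectOfCompare &s, bool targetHasMinMax) {
  if (targetHasMinMax && classifyMinMax(s).has_value())
    return 1; // assumed cost of a native min/max
  return 2;   // assumed cost of compare + select
}

// Usage: select (a <s b), a, b is a signed min.
inline void example() {
  SelectOfCompare smin{Pred::SLT, 0, 1, 0, 1};
  assert(classifyMinMax(smin) == MinMaxKind::SMin);
  assert(selectCost(smin, /*targetHasMinMax=*/true) == 1);
  assert(selectCost(smin, /*targetHasMinMax=*/false) == 2);
}
```

The point of the sketch is only that the IR stays in compare+select form; the idiom is recognized at the moment a cost is requested.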

> So my proposal is:
>   * To add new intrinsics for minimum, maximum and absolute value. 
> These would have signed int/unsigned int/float variants and be valid 
> across all numeric types.
>   * To add a pass fairly early in the pipeline to idiom recognize and 
> create intrinsics. This would be controllable per-backend - if a 
> backend doesn't have efficient lowering for these operations, perhaps 
> it's best not to do the idiom recognition.
I am strongly opposed to this part of the proposal.  I have no problem* 
adding such intrinsics and matching them late (i.e. CodeGenPrep), but I 
am deeply concerned about the negative impacts of matching early.  
Unless you are volunteering to add support for these intrinsics to 
*every* pass, I believe doing this is a non-starter.  As a good example, 
consider what happened recently with the x.with.overflow intrinsics 
where we were missing important simplifications on induction variable 
dependent checks due to early canonicalization to a form that the rest 
of the optimizer didn't understand.

More generally, I'm not even sure matching these early would be the 
right answer even if you were volunteering to update the entire 
optimizer.  Being able to fold the condition (CSE, etc.) independently 
of the select, and then being able to exploit a dominating branch, is 
extremely powerful at eliminating the min/max operation entirely.  I 
would be deeply concerned about giving up that power without an 
incredibly compelling reason.
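
A source-level sketch of the dominating-branch case may help. The function names (`selectMin`, `guardedMin`, `guardedMinFolded`) are made up for illustration, and the code only shows that the two forms are equivalent; it does not demonstrate what any particular compiler or pass actually does.

```cpp
#include <cassert>

// Min written in compare+select form.
constexpr int selectMin(int a, int b) { return a < b ? a : b; }

// The guard `a < b` dominates the use of selectMin(a, b). Inside the guarded
// region the same compare is known to be true, so the select can fold to `a`.
int guardedMin(int a, int b) {
  if (a < b)
    return selectMin(a, b) + 1; // equivalent to a + 1 here
  return 0;
}

// What is left once the dominated compare has been folded away: the min
// operation has disappeared entirely.
int guardedMinFolded(int a, int b) {
  if (a < b)
    return a + 1;
  return 0;
}

// Usage: both versions agree on every input in a small range.
inline void checkEquivalent() {
  for (int a = -3; a <= 3; ++a)
    for (int b = -3; b <= 3; ++b)
      assert(guardedMin(a, b) == guardedMinFolded(a, b));
}
```

With the compare visible as an ordinary instruction, existing condition-folding logic can reach it. An opaque min intrinsic would need the same knowledge taught to each pass separately.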

* By "no problem", I really mean that I have no opinion here.  I am 
neither endorsing nor opposing.
> The cost model would then fall out in the wash, because we already 
> have a cost model for intrinsics, it would be as simple as adding new 
> instructions.

> Because we idiom recognize at the IR stage instead of the SDAG stage, 
> we also wouldn't have to rely on the min/max idioms being in canonical 
> "select" form; we could match a branch sequence also.
Er, not sure I get your point here.  Not having to match two distinct 
families of representation is an advantage, not a disadvantage.  The 
branch form should be getting converted into the select form much 
earlier in the optimizer.  Which cases are you worried about here?
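
For reference, the two families being discussed look roughly like this at the source level. This is a sketch with hypothetical function names; how a given front end initially lowers either form is not something the example asserts. The expectation stated above is only that, after if-conversion, both should reach the same compare+select shape.

```cpp
#include <cassert>

// Branch form: the result is assigned on two control-flow paths.
int maxBranch(int a, int b) {
  int r;
  if (a > b)
    r = a;
  else
    r = b;
  return r;
}

// Select form: a single compare feeding a conditional expression.
int maxSelect(int a, int b) { return a > b ? a : b; }

// Usage: the two forms compute the same value for every input tried.
inline void checkSameResult() {
  for (int a = -2; a <= 2; ++a)
    for (int b = -2; b <= 2; ++b)
      assert(maxBranch(a, b) == maxSelect(a, b));
}
```
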
> What do you think? Is this an acceptable proposal?
> Cheers,
> James
