RFC: min/max/abs IR intrinsics
Philip Reames
listmail at philipreames.com
Sun Apr 26 12:04:22 PDT 2015
On 04/23/2015 07:42 AM, James Molloy wrote:
> Hi all,
>
> I've just started again on trying to solve the problem of getting
> decent code generation for min, max and abs idioms as used by the
> programmer and as emitted by the loop vectorizer.
>
> I've been looking at doing this as a target DAGCombine, but actually I
> think:
> 1. it's incredibly complex to do at that stage and it limits all the
> work I do to just one target.
> 2. It's also much more difficult to test.
> 3. The loop and SLP vectorizers still don't have a cost model for
> them - they're just seen as compare+selects.
I don't see the challenge here. Matching a compare+select as a min/max
for the purpose of the cost model under a target-specific hook seems
quite straightforward. What am I missing?
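As a rough illustration of the kind of match meant here, the sketch
below classifies a compare+select pair as a min or max. It uses a
hypothetical toy IR (the Cmp and Select structs, the enum names, and
classify_select are all invented for this example and are not LLVM's
data structures or API); operands are represented as plain value ids.

```c
#include <stdbool.h>

/* Hypothetical toy IR for illustration only; not LLVM's classes. */
typedef enum { PRED_SLT, PRED_SGT, PRED_ULT, PRED_UGT } Pred;
typedef enum { MM_NONE, MM_SMIN, MM_SMAX, MM_UMIN, MM_UMAX } MinMaxKind;

/* A compare of two values, identified by integer value ids. */
typedef struct { Pred pred; int lhs; int rhs; } Cmp;

/* select(cond, if_true, if_false) */
typedef struct { const Cmp *cond; int if_true; int if_false; } Select;

/* Returns the min/max flavour a select implements, or MM_NONE. */
MinMaxKind classify_select(const Select *s) {
    const Cmp *c = s->cond;
    /* The select must choose between exactly the compared operands. */
    bool same    = s->if_true == c->lhs && s->if_false == c->rhs;
    bool swapped = s->if_true == c->rhs && s->if_false == c->lhs;
    if (!same && !swapped)
        return MM_NONE;

    switch (c->pred) {
    case PRED_SLT: return same ? MM_SMIN : MM_SMAX; /* a <s b ? a : b */
    case PRED_SGT: return same ? MM_SMAX : MM_SMIN; /* a >s b ? a : b */
    case PRED_ULT: return same ? MM_UMIN : MM_UMAX; /* a <u b ? a : b */
    case PRED_UGT: return same ? MM_UMAX : MM_UMIN; /* a >u b ? a : b */
    }
    return MM_NONE;
}
```

A cost-model hook could call something of this shape on each select
and price a recognized min/max as one operation, without changing the
IR that the rest of the optimizer sees.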
> So my proposal is:
> * To add new intrinsics for minimum, maximum and absolute value.
> These would have signed int/unsigned int/float variants and be valid
> across all numeric types.
> * To add a pass fairly early in the pipeline to idiom recognize and
> create intrinsics. This would be controllable per-backend - if a
> backend doesn't have efficient lowering for these operations, perhaps
> it's best not to do the idiom recognition.
I am strongly opposed to this part of the proposal. I have no problem*
adding such intrinsics and matching them late (i.e. CodeGenPrep), but I
am deeply concerned about the negative impacts of matching early.
Unless you are volunteering to add support for these intrinsics to
*every* pass, I believe doing this is a non-starter. As a good example,
consider what happened recently with the x.with.overflow intrinsics
where we were missing important simplifications on
induction-variable-dependent checks due to early canonicalization to a
form that the rest
of the optimizer didn't understand.
More generally, I'm not even sure matching these early would be the
right answer even if you were volunteering to update the entire
optimizer. Being able to fold the condition (CSE, etc.) independently
of the select and then being able to exploit a dominating branch is
extremely powerful at eliminating the min/max operation entirely. I
would be deeply concerned about giving up that power without an
incredibly compelling reason.
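A small C sketch of the dominating-branch case described above. The
function names are made up for illustration. In guarded_before the
min is in compare+select form and sits under a branch on the same
condition, so the compare is known true and the select can fold away;
guarded_after shows the result one would hope the optimizer reaches.

```c
/* min written in compare+select form */
static int smin(int a, int b) { return a < b ? a : b; }

/* Before: the guard `a < b` dominates the min. Inside the branch the
 * inner compare `a < b` is already known to be true. */
int guarded_before(int a, int b) {
    if (a < b)
        return smin(a, b) + 1;
    return 0;
}

/* After: with the inner compare folded to true, the select picks `a`
 * and no min operation remains. */
int guarded_after(int a, int b) {
    if (a < b)
        return a + 1;
    return 0;
}
```

If the min were an opaque intrinsic call instead, the inner compare
would not be visible for this kind of folding unless the relevant
passes were taught about the intrinsic.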
* By "no problem", I really mean that I have no opinion here. I am
neither endorsing nor opposing.
>
> The cost model would then fall out in the wash, because we already
> have a cost model for intrinsics, it would be as simple as adding new
> instructions.
> Because we idiom recognize at the IR stage instead of the SDAG stage,
> we also wouldn't have to rely on the min/max idioms being in canonical
> "select" form; we could match a branch sequence also.
Er, not sure I get your point here. Not having to match two distinct
families of representation is an advantage, not a disadvantage. The
branch form should be getting converted into the select form much
earlier in the optimizer. Which cases are you worried about here?
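For concreteness, the two source-level shapes in question look
roughly like this in C (function names are illustrative only). The
expectation stated above is that if-conversion early in the optimizer
turns the first shape into the second, so a later matcher only needs
to recognize the select form.

```c
/* Branch form: control flow picks the result. */
int max_branch(int a, int b) {
    int r;
    if (a > b)
        r = a;
    else
        r = b;
    return r;
}

/* Select form: a single compare feeding a select. */
int max_select(int a, int b) {
    return a > b ? a : b;
}
```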
>
> What do you think? Is this an acceptable proposal?
>
> Cheers,
>
> James