[llvm-commits] [llvm] r155468 - in /llvm/trunk: lib/Transforms/InstCombine/InstCombineCalls.cpp test/Transforms/InstCombine/2012-04-23-Neon-Intrinsics.ll

Wed Apr 25 19:37:51 PDT 2012

On Wed, Apr 25, 2012 at 7:07 PM, Lang Hames <lhames at gmail.com> wrote:

> Hi Bob, Evan,
>
>> I vaguely remember this. Do you remember why the multiply isn't moved
>> together with the sext / zext?
>>
>>
>> The change was svn 128502.  The commit message doesn't have many details.
>>  It references rdar://8832507 and rdar://9203134.  I took a quick look
>> at 8832507, where I commented that the zext/sext was getting moved by LICM.
>>  Presumably only one of the operands was loop-invariant, so the multiply
>> would remain in the loop.
>>
>
> That makes sense. Thanks for the pointers to the commits/radars too.
>
> I talked this over with Dan Gohman this morning and we came to the
> conclusion that the best way of handling this is probably to add a
> specialized simplify for widening mul intrinsics. It should only simplify
> when both operands are constant (result is constant), or either operand is
> zero or one (result is a sext/zext). This was one of Chandler's suggested
> solutions too. I'll work up a patch soonish.
>
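
To make the simplification described above concrete, here is a minimal
sketch in plain C++. It is a toy model and not LLVM's API: the Operand and
Simplified types, the helper names, and the scalar i16 x i16 -> i32 signed
case are all assumptions made for illustration (the real intrinsics are
vector operations, and the unsigned case would use zext instead of sext).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of one operand of a signed widening multiply:
// either a known 16-bit constant or an opaque value named by an id.
struct Operand {
    std::optional<int16_t> constant;  // set when the operand is a constant
    int id;                           // illustrative id for opaque values
};

Operand constantOp(int16_t v) { return Operand{v, -1}; }
Operand opaqueOp(int id) { return Operand{std::nullopt, id}; }

enum class Kind { Constant, SignExtend, Unchanged };

struct Simplified {
    Kind kind;
    int32_t value;  // meaningful only when kind == Constant
    int id;         // meaningful only when kind == SignExtend
};

// Simplify only in the cases named in the thread; otherwise leave the
// intrinsic call alone so it can still be matched to one instruction.
Simplified simplifyWideningMul(const Operand& a, const Operand& b) {
    // Both operands constant: fold to a constant at the wide type.
    if (a.constant && b.constant) {
        int32_t wide = int32_t(*a.constant) * int32_t(*b.constant);
        return {Kind::Constant, wide, -1};
    }
    // Either operand zero: the product is zero.
    if ((a.constant && *a.constant == 0) || (b.constant && *b.constant == 0))
        return {Kind::Constant, 0, -1};
    // Either operand one: the product is the other operand, sign-extended.
    if (a.constant && *a.constant == 1)
        return {Kind::SignExtend, 0, b.id};
    if (b.constant && *b.constant == 1)
        return {Kind::SignExtend, 0, a.id};
    // Anything else is left untouched.
    return {Kind::Unchanged, 0, -1};
}
```

The point of the Unchanged case is that nothing is expanded into a generic
multiply of extended operands, so later passes cannot separate the pieces.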

While I like this solution, and for widening multiplies in particular I can
see reasons to prefer it, I'd like to point out one alternative approach to
the general problem this thread has touched on: an intrinsic that expands to
multiple IR constructs which we would like to match back to exactly one
instruction.

I understand the difficulty of matching across basic block edges and other
CFG elements, and of seeing through CSE and other folds that can perturb the
pattern. However, we have great tools to do all of these things, especially
in the IR. The whole point of lowering to generic IR constructs is to get
access to these tools.

My idea for handling these patterns is to have target-specific ISD nodes
represent their special semantics, and to use target-specific combines that
reach across CFG edges and other constructs to form these ISD nodes even
after optimization. At that layer we can also avoid forming the nodes when
the optimizations that perturbed the target-independent code have produced
something better than the instruction the intrinsic would naively have
lowered to.
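
As a rough illustration of the combine idea, here is a sketch over a toy
expression tree in plain C++. Nothing here is SelectionDAG's API: Node, Op,
the builder helpers, and combineWideningMul are hypothetical names, and the
sketch only matches within a single expression, not across CFG edges.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy operation kinds; WideningMul stands in for a target-specific node.
enum class Op { Leaf, SExt, Mul, WideningMul };

struct Node {
    Op op;
    std::string name;                 // used only by Leaf nodes
    std::shared_ptr<Node> lhs, rhs;   // rhs is null for SExt and Leaf
};
using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(std::string name) {
    return std::make_shared<Node>(Node{Op::Leaf, std::move(name), nullptr, nullptr});
}
NodePtr sext(NodePtr v) {
    return std::make_shared<Node>(Node{Op::SExt, "", std::move(v), nullptr});
}
NodePtr mul(NodePtr a, NodePtr b) {
    return std::make_shared<Node>(Node{Op::Mul, "", std::move(a), std::move(b)});
}

// If the generic form mul(sext(x), sext(y)) survived optimization, rebuild
// the single widening-multiply node; otherwise return the input unchanged.
NodePtr combineWideningMul(const NodePtr& n) {
    if (n && n->op == Op::Mul && n->lhs && n->rhs &&
        n->lhs->op == Op::SExt && n->rhs->op == Op::SExt) {
        return std::make_shared<Node>(
            Node{Op::WideningMul, "", n->lhs->lhs, n->rhs->lhs});
    }
    return n;
}
```

When the optimizers have already rewritten the multiply into something
else, the pattern simply fails to match and the generic code is kept.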

We already do exactly these types of transformations to represent things
like x86 addressing modes, and other complex operations where the original
code was written with very specific intent as to final assembly produced,
so I think that this is feasible to do (if not easy).

Certainly, there is a question of whether it is worthwhile in each case. I
don't see the specific intrinsics in question as necessarily important to
handle in this manner. I just think this pattern is an important one to
consider in the future, as it keeps the IR simple and canonical while
providing more power than simple pattern-matching logic to select the final
instruction.