[llvm-commits] [llvm] r146357 - in /llvm/trunk: include/llvm/Intrinsics.td lib/Analysis/ConstantFolding.cpp lib/Transforms/Scalar/SimplifyLibCalls.cpp lib/VMCore/AutoUpgrade.cpp

Chandler Carruth chandlerc at google.com
Mon Dec 12 16:03:35 PST 2011


An attempt at clarification (I still am not going to pick a side here, I'm
just interested in the results):

On Mon, Dec 12, 2011 at 3:47 PM, Chris Lattner <clattner at apple.com> wrote:

> The feeling was that it regresses on useful functionality: a frontend may
> want "defined at zero semantics" and codegen should be able to legalize in
> a select if needed.


I think Duncan agreed with this, but felt that codegen could match that
pattern of code into the instruction where necessary. This seems plausible
to me as it should merely be a comparison, the intrinsic, and a select. No
CFG involved.

> Having the undef bit also allows the optimizer to infer that the input
> can't be zero, allowing potentially cheaper instruction sequences to be
> synthesized by codegen etc.


Also, I think Duncan is proposing that the intrinsic itself be specified
so that its result is undef for a zero input. Code which needs a defined
result must use a comparison and a select to ignore the result of the
intrinsic.

I think these are functionally equivalent. Let's call A the proposal I
originally made (and which Duncan is arguing for), and B the current
solution.

With A we have a simpler spec for the IR and for the ISD nodes in the
codegen DAG. However, if a frontend wishes to provide a defined result for
zero input, it must produce more complex IR, and if the backend wishes to
produce efficient code for such constructs, it must use a more complex
pattern.
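
To make the "more complex IR" under A concrete, here is a rough C++ sketch
of the compare + intrinsic + select shape. It assumes the intrinsic under
discussion is a count-trailing-zeros style operation, and it uses the
GCC/Clang builtin __builtin_ctz (whose result is undefined for zero) as a
stand-in for that intrinsic. The names cttz_zero_undef and cttz_defined
are made up for illustration, as is the choice of 32 as the defined result.

```cpp
#include <cstdint>

// Stand-in for an intrinsic whose result is undefined for a zero input.
// __builtin_ctz is a GCC/Clang builtin with that contract (assumption:
// one of those compilers is in use).
inline unsigned cttz_zero_undef(std::uint32_t x) {
    return static_cast<unsigned>(__builtin_ctz(x));
}

// "Defined at zero" built on top: a comparison against zero picks either
// a fixed value (the bit width, 32, chosen here for illustration) or the
// intrinsic's result. The ternary plays the role of the select; the
// stand-in is never evaluated for a zero input.
inline unsigned cttz_defined(std::uint32_t x) {
    return x == 0 ? 32u : cttz_zero_undef(x);
}
```

A backend that wants the single defined-at-zero instruction would have to
recognize this whole shape rather than a lone intrinsic call.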

With B we have a more complex spec for the IR, but it is now trivial for
the frontend to select either behavior. The codegen DAG remains more
complex in specification because we don't have the facilities in the
codegen layer for manipulating immediates nearly as easily as we do in IR,
and therefore we decompose the flag into two ISD nodes. I actually tried
both implementations, and separate nodes were *significantly* simpler. Cases
such as vector type legalization make it very useful to have the ISD nodes
be unary. With B, the complexity in the backend is centered in the
target-independent layer rather than in each target's patterns.
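
As a rough illustration of the decomposition under B, the sketch below
models one IR-level operation carrying a boolean flag and turns it into
one of two unary nodes. This is only a toy model and not the actual LLVM
code: the Node enum, lower_cttz, and eval are hypothetical names, and it
again assumes a count-trailing-zeros style operation on 32-bit values.

```cpp
#include <cstdint>

// Hypothetical unary node kinds; neither takes the flag as an operand.
enum class Node { Cttz, CttzZeroUndef };

// The IR-level flag is consumed once, when the node kind is chosen.
constexpr Node lower_cttz(bool is_zero_undef) {
    return is_zero_undef ? Node::CttzZeroUndef : Node::Cttz;
}

// Reference evaluation of either node on a 32-bit value.
inline unsigned eval(Node n, std::uint32_t x) {
    if (x == 0) {
        // Cttz is defined at zero (bit width, an illustrative choice).
        // CttzZeroUndef may produce anything; this sketch returns 0.
        return n == Node::Cttz ? 32u : 0u;
    }
    unsigned count = 0;
    while ((x & 1u) == 0) {  // shift out trailing zero bits
        x >>= 1;
        ++count;
    }
    return count;
}
```

Because each node stays unary, code that splits or widens the operand
only has to handle one input and never has to carry an immediate along.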


Anyways, I'm carrying on with implementing plan B as that was the favored
one previously. If this discussion goes in a new direction, I'll happily
convert everything to that new direction, be it plan A or plan C.