[llvm-commits] [llvm] r150060 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86InstrSSE.td

Thu Feb 9 12:36:08 PST 2012

Nick Lewycky <nlewycky at google.com> writes:

>     That seems like a mistake and is not what I would expect to happen with
>     intrinsics.  I can understand why compiler developers might want to do
>     that but some users will be surprised.
>
> No, it's not the compiler developers. Our programmers expect the
> compiler to be capable of comprehending what the builtins mean and
> perform constant folding, licm, etc., and will file bugs when we don't
> emit optimal code.

Ok, so your customers want exactly the opposite of what ours do.  :)

>     > Suppose you emit code like this: create your own functions with
>     > definitions that use the IR implementations, and mark them
>     > noinline. At the very end, you inline them with a pass that calls
>     > InlineFunction() directly. Does this preserve the order, or do you
>     > still have trouble with the backend doing too much reordering?
>    
>     It's not just reordering.  Instcombine, dagcombine, etc. do a lot of
>     massaging.
>
> Sure, but with what I proposed instcombine can't touch them
> (technically it *can*, but it won't do anything since each little
> intrinsic definition is already locally optimal).

But there's no guarantee that codegen will take the generic IR and
always emit the same instruction either.  I can't predict how the
pattern matching will change in the coming years.

This also seems like a lot of work that obfuscates the IR and makes
debugging more cumbersome.

> However, DAGCombine might fold them. If this is a problem in practice,
> my next idea is to wrap the functions in compiler barriers. Would that
> be sufficient to prevent the problems you'd have?

It would hurt other optimizations.

>     Again, what is the trouble with keeping intrinsics?  Why rip them out if
>     people find them useful and necessary?  If the behavior of the optimizer
>     changes wrt the intrinsics, we can deal with that when it happens.
>
> Your argument that the intrinsics are useful is falling flat because
> the use-case you've given is not one that would be solved by the
> presence of intrinsics. 

I think "intrinsics" needs a better definition then.  When I think of
these kinds of intrinsics, I think, "do exactly this instruction."

Why not have a generic shufflevector intrinsic and have users use that
if they want them massaged?

> Just having intrinsics doesn't guarantee that they won't be "massaged"
> by the compiler. Also, waiting until you encounter a problem with the
> optimizer is only deferring the problem; when you bring up that issue
> the response will be "working as intended" and we'll be right back
> here.

But it will at least buy some time for us to figure out an alternative.
I'm very lucky that I caught that commit.  I'm sure I missed a bunch of
others that already ripped stuff out that we use.

> It's entirely possible that the solution to your problem will involve
> intrinsics, but we have to work out exactly what. We could try adding
> an i1 flag to the intrinsics that indicates whether it should be
> treated as volatile. We could add a volatile bit to the
> IntrinsicInst/CallInst, maybe. 

Actually, we probably already need a volatile bit on a call to prevent
code motion as I posted about a couple of weeks ago.  It's an
interesting idea for intrinsics and I'm definitely open to that.

> We could add a new bit to some immutable spot (similar to
> TargetLibraryInfo) to indicate whether intrinsics are sacred or
> not. We could turn it around and say that the compiler may not
> optimize intrinsics except in a single pass, much like how
> SimplifyLibCalls was the only place allowed to assume that C functions had
> the behaviour their names implied (before we had TargetLibraryInfo).
>
> I'm trying to start with approaches that serve your use-case with the
> minimal change to LLVM first.

Yep, I appreciate that.  I do like the idea of a bit that controls
whether intrinsics are massaged or not.  Whether it's on each call or a
global bit I don't think would matter to us.  Of course the former is
more flexible.

                             -Dave