[llvm-commits] [llvm] r150060 - in /llvm/trunk: include/llvm/IntrinsicsX86.td lib/Target/X86/X86ISelLowering.cpp lib/Target/X86/X86InstrSSE.td
nlewycky at google.com
Wed Feb 8 16:33:54 PST 2012
On 8 February 2012 15:54, David A. Greene <dag at cray.com> wrote:
> Nick Lewycky <nlewycky at google.com> writes:
> > > Whether generic IR is a "win" isn't the primary issue here. The
> > > is that the user wrote code using intrinsics and expects that exact
> > > to be in the asm. Whether or not that is the best performing code
> > > possible doesn't matter. It's what they want and we have to respect
> > > Our customers demand it.
> > Okay. This is a different issue then, and isn't even solved by using
> > intrinsics; the optimization passes are absolutely allowed to modify
> > uses of intrinsics. We do this in obvious cases for things like memset
> > or ctz, but there is no reason the optimization passes won't optimize
> > the llvm.x86 intrinsics too. It's rare, but you can see code in
> > ValueTracking.cpp that will analyze Intrinsic::x86_sse42_crc32_64_8
> > for one example, ConstantFolding.cpp will fold
> > through x86_sse_cvtss2si for another.
> That seems like a mistake and is not what I would expect to happen with
> intrinsics. I can understand why compiler developers might want to do
> that but some users will be surprised.
No, it's not the compiler developers. Our programmers expect the compiler
to be capable of comprehending what the builtins mean and perform constant
folding, licm, etc., and will file bugs when we don't emit optimal code.
We need some way to provide the kind of guarantee I'm talking about.
> > Suppose you emit code like this: create your own functions with
> > definitions that use the IR implementations, and mark them
> > noinline. At the very end, you inline them with a pass that calls
> > InlineFunction() directly. Does this preserve the order, or do you
> > still have trouble with the backend doing too much reordering?
> It's not just reordering. Instcombine, dagcombine, etc. do a lot of
Sure, but with what I proposed instcombine can't touch them (technically it
*can*, but it won't do anything since each little intrinsic definition is
already locally optimal). If you really have problems with the IR-level
optimizers messing with them, you can make the functions declarations until
you reach your IR-level pass, filling in their function bodies just before
you run InlineFunction.
However, DAGCombine might fold them. If this is a problem in practice, my
next idea is to wrap the functions in compiler barriers. Would that be
sufficient to prevent the problems you'd have?
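
A rough source-level sketch of the wrapper-plus-barrier idea, for
illustration only. It assumes GCC/Clang extensions (the noinline attribute,
empty inline asm, __builtin_ctz); the name wrapped_ctz is hypothetical, and
ctz merely stands in for whichever intrinsic the user wants preserved. The
actual proposal operates at the IR level with a late InlineFunction() pass,
which this does not show.

```cpp
#include <cassert>

// Hypothetical wrapper around one builtin. noinline keeps the call opaque
// to callers, so optimizations in the caller cannot see through it.
__attribute__((noinline)) unsigned wrapped_ctz(unsigned x) {
    assert(x != 0 && "ctz of zero is undefined");
    // Empty asm that claims to read and modify x: the optimizer can no
    // longer assume it knows the operand's value at this point.
    asm volatile("" : "+r"(x) : : "memory");
    unsigned r = static_cast<unsigned>(__builtin_ctz(x));
    // Same trick on the result, so later code cannot fold it either.
    asm volatile("" : "+r"(r) : : "memory");
    return r;
}
```

Calling wrapped_ctz(8u) returns 3. The barriers only stop the compiler from
reasoning about the operand and the result; they say nothing about which
instruction the backend selects for the operation itself.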
> Again, what is the trouble with keeping intrinsics? Why rip them out if
> people find them useful and necessary? If the behavior of the optimizer
> changes wrt the intrinsics, we can deal with that when it happens.
Your argument that the intrinsics are useful is falling flat because the
use-case you've given is not one that would be solved by the presence of
intrinsics. Just having intrinsics doesn't guarantee that they won't be
"massaged" by the compiler. Also, waiting until you encounter a problem
with the optimizer is only deferring the problem; when you bring up that
issue the response will be "working as intended" and we'll be right back
It's entirely possible that the solution to your problem will involve
intrinsics, but we have to work out exactly what. We could try adding an i1
flag to the intrinsics that indicates whether they should be treated as
volatile. We could add a volatile bit to the IntrinsicInst/CallInst, maybe.
We could add a new bit to some immutable spot (similar to
TargetLibraryInfo) to indicate whether intrinsics are sacred or not. We
could turn it around and say that the compiler may not optimize intrinsics
except in a single pass, much like how SimplifyLibCalls was the only place
allowed to assume that C functions had the behaviour their names implied
(before we had TargetLibraryInfo).
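
To make the first option concrete, here is a toy model of a per-call flag
that a folding pass would have to check. Nothing in it is LLVM API:
ToyIntrinsicCall, isVolatile and tryConstantFold are hypothetical names made
up for this sketch, and ctz again stands in for a target intrinsic.

```cpp
#include <string>

// Toy stand-in for an intrinsic call with one constant operand.
// This is NOT LLVM's IntrinsicInst or CallInst.
struct ToyIntrinsicCall {
    std::string name;   // which intrinsic, e.g. "ctz"
    unsigned arg;       // constant operand
    bool isVolatile;    // hypothetical "do not touch" flag
};

// Returns true and writes the folded value to `out` when folding is
// allowed and understood; returns false when the call must be kept.
bool tryConstantFold(const ToyIntrinsicCall &call, unsigned &out) {
    if (call.isVolatile)
        return false;   // flag set: leave the call exactly as written
    if (call.name == "ctz" && call.arg != 0) {
        out = static_cast<unsigned>(__builtin_ctz(call.arg));
        return true;
    }
    return false;       // unknown intrinsic: nothing to fold
}
```

With the flag clear, a ctz call on the constant 8 folds to 3; with the flag
set, the same call is left in place. The cost of this design is that every
pass that looks at intrinsics would need the same check.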
I'm trying to start with approaches that serve your use-case with the
minimal change to LLVM first.