[LLVMdev] optimization assumes malloc return is non-null
Jonathan S. Shapiro
shap at eros-os.com
Wed Apr 30 18:35:44 PDT 2008
On Wed, 2008-04-30 at 18:26 -0400, David Vandevoorde wrote:
> > Daveed:
> > Good to know that I was looking at the correct section. I do not agree
> > that your interpretation follows the as-if rule, because I do not
> > agree
> > with your interpretation of the C library specification of malloc().
> Before I go on, let me state that this is not a contentious issue
> among WG14: There is no doubt that the intent of the standard is that
> this be a valid optimization.
And of course, ANSI C working groups are infallible. Consider
But I think I am starting to understand your argument, and if so I
concur that the optimization is correct in this case. Let me see ask
some further questions just to confirm.
Let me attempt to replay what I believe is happening here:
1. There is an observably linear chain of calls to malloc() in the
example, and because of the structure of the example, the compiler can
see that there are no *unobserved* calls to malloc() in the program.
2. After all inlining of malloc() has proceeded, the compiler is free to
observe that all calls succeed by partial evaluation (thus the non-zero
return value) and that after this is done there is no witness to the
internal heap limit variable.
3. Progressive application of dead variable elimination will then delete
the entire chain of updates to the heap limit variable, eliminating the
side effects that would normally be induced by malloc(). They will also
delete the uncaptured pointer result from malloc().
4. Since the result pointer is not captured, and the heap limit variable
is not consulted, the entire result of the call to malloc() is the
return of a non-NULL pointer.
5. Constant folding then reduces the IF condition (malloc() == NULL) to
6. Dead code elimination then reduces the if/then to simply the then
7. Optimizer now reduces the whole thing to "return 1".
Is this more or less the chain of events here (or if not this, then
something of a similar nature)?
If so, I understand and concur that this optimization is valid in this
example, primarily due to perverse and non-obvious consequences of the
particular bit of sample code.
Just to cement my understanding, I suspect that *either* of the
following two changes to the sample code would make this optimization
1. Rename main() to f()
2. Introduce a call to some external procedure anywhere in main().
In either case, the compiler can no longer witness all reachable calls
to malloc(), and must assume conservatively that the internally produced
side effect results of malloc() may be sourced by other units of
compilation, and therefore cannot be elided.
Is this correct?
Hmm. No. I can weasel out of this in case (2), because the standard does
not require that calls to malloc() be processed in order, so in some
cases calls to malloc() appearing in main() can still be eliminated by
logically rearranging them to occur before any of the external calls to
malloc(). I bet LLVM doesn't know about this. :-)
> (I should have been more careful in my earlier response and clarify
> that modifying a nonvolatile object is a side-effect that can be
> optimized away in many cases.)
If I understand the standard correctly, this can be done exactly if
there is no witness to the side effect. Furthermore, chains of
intermediate unwitnessed side effects can be collapsed if the optimizer
can prove that the intermediate results are unwitnessed. Yes?
> ...However, the optimization is allowed even without that: Just by noting
> that the return value of malloc isn't used (nor leaked); I suspect
> that's what the compiler does in this case (though I don't know).
I think that is not quite strong enough. If there is any possibility of
preceding or succeeding calls to malloc() in compilation units that are
not visible at compile time, then the compiler cannot determine that the
current call to malloc() must succeed or that it's side effects are
I suppose that the compiler can do special case things in main on the
theory that these calls to malloc() may be executed out of order by a
conforming implementation, but I suspect that goes well beyond the
cleverness of the current LLVM implementation. It's the sort of thing
that the VAX/VMS optimizer or the IBM PL.8 research optimizer might
actually have done. And certainly the Bulldog optimizer.
> > Setting the matter of the standard entirely aside, the currently
> > implemented behavior deviates so blatantly from common sense and
> > universal convention that it really must be viewed as a bug.
Just to be explicit: I now see that my statement above is wrong in the
Daveed: Thanks for being patient on this. I appreciate your willingness
to help me see this one.
More information about the llvm-dev