[LLVMdev] optimization assumes malloc return is non-null

Török Edwin edwintorok at gmail.com
Wed Apr 30 14:58:10 PDT 2008


Jonathan S. Shapiro wrote:
> On Wed, 2008-04-30 at 15:25 -0400, David Vandevoorde wrote:
>   
>> On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote:
>>     
>>> Daveed:
>>>
>>> Perhaps I am looking at the wrong version of the specification. Section
>>> 5.1.2.3 appears to refer to objects having volatile-qualified type. The
>>> type of malloc() is not volatile qualified in the standard library
>>> definition.
>>>       
>> ...malloc() is not specified to access a volatile
>> object, modify an object, or modify a file (directly or
>> indirectly); i.e., it has no side effect from the language point of
>> view.
>>     
>
> Daveed:
>
> Good to know that I was looking at the correct section. I do not agree
> that your interpretation follows the as-if rule, because I do not agree
> with your interpretation of the C library specification of malloc().
>
> The standard library specification of malloc() clearly requires that it
> allocates storage, and that such allocation is contingent on storage
> availability. Storage availability is, in turn, a function (in part) of
> previous calls to malloc() and free(). Even if free() is not called, the
> possibility of realloc() implies a need to retain per-malloc() state. In
> either case, it follows immediately that malloc() is stateful, and
> therefore that any conforming implementation of malloc() must modify at
> least one object in the sense of the standard.
>
> If I understand your position correctly, your justification for the
> optimization is that the C library standard does not say in so many
> words that malloc() modifies an object. I do not believe that any such
> overt statement is required in order for it to be clear that malloc() is
> stateful. The functional description of malloc() and free() clearly
> cannot be satisfied under the C abstract machine without mutation of at
> least one object.
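
The statefulness argument in the quoted paragraphs can be illustrated with a toy sketch. Everything below is hypothetical (toy_malloc, arena, arena_used and ARENA_SIZE are names invented for illustration, and no real C library allocates this way); it only shows that "allocation is contingent on storage availability" forces each call to modify at least one object.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 64u

static unsigned char arena[ARENA_SIZE];
static size_t arena_used;   /* the object that every successful call modifies */

/*
 * Hypothetical bump allocator, for illustration only.
 * It hands out bytes from a fixed arena and returns NULL once the arena
 * cannot satisfy the request. Alignment is ignored, so the result is only
 * suitable for byte data.
 */
void *toy_malloc(size_t size)
{
    assert(arena_used <= ARENA_SIZE);   /* invariant of the toy heap */

    if (size == 0 || size > ARENA_SIZE - arena_used)
        return NULL;                    /* not enough storage left */

    void *p = &arena[arena_used];
    arena_used += size;                 /* the side effect under discussion */
    return p;
}
```

Whether a second toy_malloc(48) succeeds depends on whether a first one already ran, which is the sense in which the result is a function of previous calls.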
>
> Also, I do not read 5.1.2.3 in the way that you do. Paragraph 2 defines
> "side effect", but it does not imply any requirement that side effects
> be explicitly annotated. What Paragraph 3 gives you is leeway to
> optimize standard functions when you proactively know their behavior. A
> standard library procedure is not side-effect free for optimization
> purposes by virtue of the absence of annotation. It can only be treated
> as side-effect free by virtue of proactive knowledge of the
> implementation of the procedure. In this case, we clearly have knowledge
> of the implementation of malloc, and that knowledge clearly precludes
> any possibility that malloc is simultaneously side-effect free and
> conforming.
>
> So it seems clear that this optimization is wrong. By my reading, not
> only does the standard fail to justify it under 5.1.2.3 paragraph 3, it
> *prohibits* this optimization under 5.1.2.3 paragraph 1, because
> there is no conforming implementation that is side-effect free.
>
> Exception: there are rare cases where, under whole-program optimization,
> it is possible to observe that free() is not called, that there is an
> upper bound on the number of possible calls to malloc() and also an
> upper bound on the total amount of storage allocated. In this very
> unusual case, the compiler can perform a hypothetical inlining of the
> known implementation of malloc and then do partial evaluation to
> determine that no heap size tracking is required. If so, it can then
> legally perform the optimization that is currently being done.
>
> But I don't think that the current compiler is actually doing that
> analysis in this case...
>
>   
>>> In general, calls to procedures that are outside the current unit of
>>> compilation are presumed to involve side effects performed in the body
>>> of the external procedure (at least in the absence of annotation).
>>>       
>> That may often be done in practice, but it's not a language  
>> requirement.  In particular, for standard library functions (like  
>> malloc) an optimizer can exploit the known behavior of the function.
>>     
>
> I disagree. In the malloc() case, the known behavior is side effecting.
> In the general case, the compiler cannot assume side-effect freedom
> unless it can prove it, and in the absence of implementation knowledge
> the standard requires conservatism.
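
To make the disputed transformation concrete, here is a rough sketch of the kind of code this thread is about. The function name can_allocate is hypothetical and is not taken from any message in the thread. The allocated block is never read or written; it is only compared against NULL and then freed.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical probe: returns 1 if a request for `size` bytes can be
 * satisfied right now, 0 otherwise.
 */
int can_allocate(size_t size)
{
    /* malloc(0) may return NULL without any failure, so exclude it. */
    assert(size > 0);

    void *p = malloc(size);
    if (p == NULL)
        return 0;   /* the branch the optimization assumes is dead */

    free(p);
    return 1;
}
```

An optimizer that treats malloc() as having no side effects may delete the malloc()/free() pair and reduce the function to "return 1" for every size. Under the reading argued in the quoted text, the call would have to stay, so that a request the heap cannot satisfy still yields 0.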

Although the ISO standard doesn't say anything about malloc setting
errno, POSIX does: "Otherwise, it shall return a null pointer and set
/errno/ to indicate the error".
IMHO setting errno can be considered a side effect.
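
A minimal sketch of how that side effect could be observed, assuming a POSIX system. The helper name malloc_failure_sets_errno is made up for illustration. The volatile qualifiers are an assumption about what is needed to keep an optimizing compiler from discarding the call or reusing the stored 0; they are not something POSIX asks for.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: request `size` bytes and report what happened.
 *   -1  the allocation succeeded
 *    0  malloc() returned NULL and left errno at 0
 *    1  malloc() returned NULL and set errno (the POSIX behaviour)
 */
int malloc_failure_sets_errno(size_t size)
{
    /* malloc(0) may return NULL without any error, so exclude it. */
    assert(size > 0);

    /* Volatile accesses: errno is really written before the call and
       really read back after it. */
    volatile int *err = &errno;
    *err = 0;

    /* Storing the result in a volatile object keeps the call alive. */
    void *volatile p = malloc(size);
    if (p == NULL)
        return *err != 0;

    free(p);
    return -1;
}
```

Calling it with a request that cannot be satisfied, such as (size_t)-1, should return 1 on a POSIX-conforming library; an implementation that follows only the ISO text could return 0.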

Best regards,
--Edwin


