[LLVMdev] optimization assumes malloc return is non-null
Mike Stump
mrs at apple.com
Sun May 4 19:29:21 PDT 2008
On May 1, 2008, at 3:39 PM, David Vandevoorde wrote:
>
> Not quite. Although there is a requirement there (and more precise
> ones in Clause 3), there is no prohibition against doing additional,
> observable stuff (e.g., log the calls) and hence allocations cannot
> be elided.
That's correct, there is no prohibition, but, if one does, there are
no requirements placed upon the semantics of the program, none:
  --If a program contains a violation of a rule for which no diagnostic
    is required, this International Standard places no requirement on
    implementations with respect to that program.
  1.4.12 undefined behavior [defns.undefined]
    behavior, such as might arise upon use of an erroneous program
    construct or of erroneous data, for which the Standard imposes no
    requirements.
You may also want to read these which cover the basic idea:
  17.1.6 default behavior [defns.default.behavior]
    a description of replacement function and handler function
    semantics. Any specific behavior provided by the implementation,
    within the scope of the required behavior.
  17.1.14 replacement function [defns.replacement]
    a reserved function whose definition is provided by a C++ program.
    Only one definition for such a function is in effect for the
    duration of the program's execution, as the result of creating the
    program (_lex.phases_) and resolving the definitions of all
    translation units (_basic.link_).
  17.1.15 required behavior [defns.required.behavior]
    a description of replacement function and handler function
    semantics, applicable to both the behavior provided by the
    implementation and the behavior that shall be provided by any
    function definition in the program. If a function defined in a C++
    program fails to meet the required behavior when it executes, the
    behavior is undefined.
  4 For non-reserved replacement and handler functions, Clause
    _lib.language.support_ specifies two behaviors for the functions in
    question: their required and default behavior. The default behavior
    describes a function definition provided by the implementation. The
    required behavior describes the semantics of a function definition
    provided by either the implementation or a C++ program. Where no
    distinction is explicitly made in the description, the behavior
    described is the required behavior.
  1 Clauses _lib.language.support_ through _lib.input.output_ describe
    the behavior of numerous functions defined by the C++ Standard
    Library. Under some circumstances, however, certain of these
    function descriptions also apply to replacement functions defined
    in the program (_lib.definitions_).
  17.4.3.6 Other functions [lib.res.on.functions]
  1 In certain cases (replacement functions, handler functions,
    operations on types used to instantiate standard library template
    components), the C++ Standard Library depends on components
    supplied by a C++ program. If these components do not meet their
    requirements, the Standard places no requirements on the
    implementation.
  2 In particular, the effects are undefined in the following cases:
    --for replacement functions (_lib.new.delete_), if the installed
      replacement function does not implement the semantics of the
      applicable Required behavior paragraph.
    --for handler functions (_lib.new.handler_,
      _lib.terminate.handler_, _lib.unexpected.handler_), if the
      installed handler function does not implement the semantics of
      the applicable Required behavior paragraph.
These are meant to constrain the replacements as I've described.
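To make the "Required behavior" constraint a little more concrete, here is a minimal sketch of a replacement operator new that tries to stay within it: it either returns a non-null pointer or throws std::bad_alloc, and does nothing else observable. The malloc-backed body, the g_sink variable and the request_throws helper are my assumptions for illustration only; nothing in the standard or in LLVM prescribes them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Replacement for the global allocation function.
// Assumption: malloc/free as the backing store.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;  // a zero-size request still needs a usable pointer
    for (;;) {
        if (void* p = std::malloc(size)) return p;          // success: non-null
        std::new_handler handler = std::get_new_handler();  // C++11
        if (handler == nullptr) throw std::bad_alloc();     // failure: throw, never return null
        handler();  // give the installed handler a chance to free memory, then retry
    }
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical sink: publishing the pointer makes it less likely that an
// optimizer removes the allocation/deallocation pair outright.
static void* volatile g_sink = nullptr;

// Hypothetical helper: reports whether a request of n bytes ends in bad_alloc.
bool request_throws(std::size_t n) {
    try {
        void* p = ::operator new(n);
        assert(p != nullptr);  // a normal return is never null
        g_sink = p;
        ::operator delete(p);
        return false;
    } catch (const std::bad_alloc&) {
        return true;
    }
}
```

A caller of this replacement can rely on the result being non-null whenever the call returns normally, which is the property the subject line of this thread is about.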
> No, because both "delete" and "new" may have side effects.
No, please read the above paragraphs again.
> The problem is that with a user-written operator new, there may be
> other valid ways (besides operator new) to get at the storage
> returned. The object lifetime rules don't always give such programs
> undefined behavior.
The reason why all that wording was included was to ensure the
semantics of the implementation when the standard library uses the
user's new/delete operator. If the user does something funny and
causes the library to not work, the standard places the responsibility
squarely in the user's lap. We do that through the wording above. If
the user meets the requirements, then:
a = new stack<int>;
produces no observable semantics. If there were no such constraint,
then that same line could produce:
new allocated 24 bytes at 0xfe4238
on the standard output device (cout/stdout). The only thing that
ensures that line doesn't produce that output is the _hard_
requirement on the user's program. Now, we knew that some users would
want to put a print in new/delete implementations and do exactly as
above. We didn't have a good way to describe what we wanted to happen
and we didn't spend the effort to describe it well, so we chose the
cheap way out and described exactly what they had to do, no more, no
less, for the standard to place any requirement on the semantics.
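For reference, the kind of replacement being argued about looks roughly like the sketch below: an operator new that prints a line for every allocation. The g_new_calls counter and the calls_for_one_request helper are hypothetical, added only so the effect can be checked without parsing stdout. Whether such a replacement still satisfies the Required behavior paragraph is exactly the point in dispute; the sketch only shows what it looks like.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical counter of how often the replacement ran.
static std::size_t g_new_calls = 0;

// Replacement operator new that also logs, i.e. does something observable
// beyond handing out storage. Assumption: malloc/free as the backing store.
void* operator new(std::size_t size) {
    if (size == 0) size = 1;
    void* p = std::malloc(size);
    if (p == nullptr) throw std::bad_alloc();
    ++g_new_calls;
    std::printf("new allocated %zu bytes at %p\n", size, p);
    return p;
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper: number of times the replacement ran for one
// direct request of n bytes.
std::size_t calls_for_one_request(std::size_t n) {
    const std::size_t before = g_new_calls;
    void* p = ::operator new(n);
    assert(p != nullptr);
    const std::size_t after = g_new_calls;
    ::operator delete(p);
    return after - before;
}
```

With something like this linked in, an expression such as a = new stack<int>; would print one or more lines of that form; how many, and with what sizes, depends on the library implementation.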
There would be little point in ensuring the semantics of programs when
replacements are used if the replacements could foster the reuse of
in-use memory. At best, I think you either run into unspecified behavior
(for reads) or undefined behavior (for writes) for allocations by the
system routines. Though, I do agree, the standard could be made more
clear exactly what we meant.
> (I personally consider this a defect in the standard, but I don't
> think that's unarguable.)
> Which part of the standard would I be missing that implies so?
  A C++ implementation provides access to, and management of, dynamic
  storage via the global allocation functions operator new and operator
  new[] and the global deallocation functions operator delete and
  operator delete[].
The standard is not meant to be read outside the scope of English nor
the usual and customary terms used by computer science. If that were
meant to be the case, we'd have a formal definition for C++. The
standard isn't a formal definition.
That says that new is an allocation function and that delete is a
deallocation function. These are the usual and customary CS terms.
> it may be the object lifetime rules. I'm pretty sure it would work
> if instead of
> "double", a type with a destructor was used. Worth investigating?
Yes, there are various bits in the lifetime rules that also constrain:
  5 Before the lifetime of an object has started but after the storage
    which the object will occupy has been allocated [footnote 34] or,
    after the lifetime of an object has ended and before the storage
    which the object occupied is reused or released, any pointer that
    refers to the storage location where the object will be or was
    located may be used but only in limited ways. Such a pointer refers
    to allocated storage (_basic.stc.dynamic.deallocation_), and using
    the pointer as if the pointer were of type void*, is well-defined.
    Such a pointer may be dereferenced but the resulting lvalue may
    only be used in limited ways, as described below.
[ see the rest of the standard for all the gory details. ]
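As a rough illustration of the window that paragraph talks about, the sketch below separates the four steps: allocate storage, start an object's lifetime in it, end that lifetime, release the storage. The Tracked type and lifetime_demo function are hypothetical names I made up for the example; it uses a type with a destructor, as suggested in the quoted question.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with a non-trivial destructor that counts live objects.
struct Tracked {
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

int lifetime_demo() {
    // Storage is allocated, but no Tracked object exists yet.
    void* raw = ::operator new(sizeof(Tracked));
    void* same = raw;         // using the pointer as a void* is well-defined
    assert(same == raw);
    assert(Tracked::live == 0);

    Tracked* t = new (raw) Tracked(42);  // lifetime starts here
    assert(Tracked::live == 1);
    const int seen = t->value;

    t->~Tracked();            // lifetime ends; the storage is still allocated
    assert(Tracked::live == 0);

    ::operator delete(raw);   // storage released
    return seen;
}
```

Between the allocation and the placement new, and again between the destructor call and the deallocation, raw may only be used in the limited ways the paragraph describes.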
>>> A lot of nice guarantees that we have with malloc/free aren't
>>> available with new/delete. Also, since new/delete can be overridden
>>> at any time (as late as runtime with LD_PRELOAD and friends),
>>
>> 3 The program's definitions are used instead of the default versions
>> supplied by the implementation (_dcl.fct.def_). Such replacement
>> occurs prior to program startup (_basic.def.odr_, _basic.start_).
>>
>> So, the replacement is done before start, if later, there are no
>> requirements. And, the replacement has known semantics.
>
>
> But isn't that still too late?
Too late? What could that possibly mean? There is by definition
nothing before the start. Talking about something that happens first
as being too late is nonsensical (by this, we merely mean outside the
bounds of the standard and by that, we mean, the standard places no
requirements on such a program).
> The optimizer often must make decisions way before that.
There is no optimizer in the standard, so there is no way to even talk
about it. All there is, are the abstract semantics when the program
is run and the constraint on the actual implementation semantics that
derive from the abstract semantics.
> (Plus, even though the standard currently fails to address dynamic
> libraries, in practice implementations must keep things working
> right.)
The standard places no requirements in the face of shared libraries. I
know, it kinda sucks, but there it is. The implementation is free to
map onto the standard as it sees fit, however, and the user can judge
the quality of the result.