[llvm-dev] Potential issue with noalias @malloc and @realloc

Richard Smith via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 12 12:06:16 PDT 2017


On 11 April 2017 at 18:16, Daniel Berlin via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> The corresponding line does not exist, and it is in fact, wrong :)
>
> C11 says nothing like that.
> C++14 says:
> "The lifetime of an object of type T begins when:
> — storage with the proper alignment and size for type T is obtained, and
> — if the object has non-trivial initialization, its initialization is
> complete.
> The lifetime of an object of type T ends when:
> — if T is a class type with a non-trivial destructor (12.4), the
> destructor call starts, or
> — the storage which the object occupies is reused or released."
>
> it also says:
> "If, after the lifetime of an object has ended and before the storage
> which the object occupied is reused or released, a new object is created at
> the storage location which the original object occupied, a pointer that
> pointed to the original object, a reference that referred to the original
> object, or the name of the original object will automatically refer to the
> new object and, once the lifetime of the new object has started, can be
> used to manipulate the new object, if:
>

This wording describes what happens if you, for instance, use placement new
to create an object over an existing one. It is not intended to cover the
case where you happen to reallocate storage you previously freed; see below
for the rule on that case.
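
A minimal sketch of the placement-new case that wording covers. The type
Counter and the function names are made up for illustration; the example
assumes the listed conditions hold (same type, not const, complete object).

```cpp
#include <cassert>
#include <new>

// Hypothetical type used only for this illustration.
struct Counter {
    int value;
};

// Creates a Counter, then creates a second Counter in the same storage with
// placement new. Returns the value read back through the ORIGINAL pointer.
int reuse_storage_in_place() {
    Counter *p = new Counter{1};

    // The storage of *p is reused, which ends the old object's lifetime.
    // The new object has the same type and exactly overlays the old one,
    // so p automatically refers to the new object.
    new (p) Counter{2};

    int observed = p->value;  // reads the new object
    delete p;
    return observed;
}

// Small self-check; call it from main() to exercise the sketch.
void check_reuse_storage_in_place() {
    assert(reuse_storage_in_place() == 2);
}
```

Here the storage is never released between the two objects, which is what
separates this case from free() followed by a fresh malloc().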


> ...
> a bunch of conditions".
>
> Which makes it even worse because they become aliasing again.
>

I believe this is the portion of the C++ standard you're looking for:
[basic.stc]/4:

"When the end of the duration of a region of storage is reached, the values
of all pointers representing the address of any part of that region of
storage become invalid pointer values (6.9.2). Indirection through an
invalid pointer value and passing an invalid pointer value to a
deallocation function have undefined behavior. Any other use of an invalid
pointer value has implementation-defined behavior."

(In C++14 and before, this was [basic.stc.dynamic.deallocation]p4 and only
applied to the effects of invoking 'operator delete', but we added the
above wording as a defect resolution so it's intended to apply
retroactively.)

So the comparison in the source program has UB, and the loop unswitching
transformation is therefore invalid as an operation on C++ programs. (I
don't think this observation helps, though, since we -- presumably -- want
the transformation to be valid as an operation on LLVM IR...)
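
For contrast, a sketch of a source-level pattern that never uses the invalid
pointer value at all: the address is captured as an integer while p0 is
still valid, and every later access goes through p1. The function name and
the out-parameter are hypothetical, and this is not the test case discussed
in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Returns true if the second allocation happened to land at the address of
// the first. Stores the value read through p1 in *out_value.
bool second_allocation_reused_address(int *out_value) {
    int *p0 = static_cast<int *>(std::malloc(sizeof(int)));
    if (!p0) return false;

    // Capture the address as an integer while p0 is still a valid pointer.
    std::uintptr_t a0 = reinterpret_cast<std::uintptr_t>(p0);
    std::free(p0);  // from here on, p0 is not used again

    int *p1 = static_cast<int *>(std::malloc(sizeof(int)));
    if (!p1) return false;
    *p1 = 42;

    // Integer comparison only; no invalid pointer value is involved.
    bool same_address = (a0 == reinterpret_cast<std::uintptr_t>(p1));

    *out_value = *p1;  // access through p1, never through p0
    std::free(p1);
    return same_address;
}
```

Whether the two addresses compare equal depends on the allocator, so the
return value is not predictable; the value read through p1 is.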

It seems to me that there are two ways of thinking about this. Either the
value of a pointer in IR is richer than its bit sequence, in which case
replacing p1 with p0 in a block predicated by p0 == p1 is an incorrect
transformation unless you can prove that one pointer was based on the other.
Or the value of a pointer in IR is exactly its bit sequence, in which case
the code performing the transformation incorrectly updated the IR, and a
correct transformation would need to somehow remove the noalias from the
malloc calls. The C++ object model formally takes the former standpoint:
its pointers notionally point to objects, which are abstract entities
occupying storage, rather than to the storage itself.
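
The two standpoints can be sketched with a toy model. Everything here
(ModelPtr, ToyHeap, the single fixed slot address) is hypothetical and is
not how LLVM represents pointers; it only shows that p0 and p1 can have
equal bits while differing in which allocation they were derived from.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model, not LLVM's actual semantics.
constexpr std::uintptr_t kSlot = 0x1000;  // the only address the heap hands out

struct ModelPtr {
    std::uintptr_t bits;  // the address
    int alloc_id;         // which allocation the pointer was derived from
};

struct ToyHeap {
    int next_id = 1;
    int live_id = 0;  // id of the allocation occupying the slot, 0 if none

    ModelPtr allocate() {
        live_id = next_id++;
        return {kSlot, live_id};
    }
    void release(ModelPtr p) {
        if (p.alloc_id == live_id) live_id = 0;
    }

    // "Pointer is exactly its bits": a load is fine if the address is live.
    bool load_ok_bits_only(ModelPtr p) const {
        return live_id != 0 && p.bits == kSlot;
    }
    // "Pointer is richer than its bits": it must come from the live allocation.
    bool load_ok_with_provenance(ModelPtr p) const {
        return live_id != 0 && p.alloc_id == live_id;
    }
};

struct ModelResult {
    bool bits_equal;
    bool p0_ok_bits_only;
    bool p0_ok_provenance;
    bool p1_ok_provenance;
};

// Mirrors the malloc / free / malloc sequence from the thread.
ModelResult run_model() {
    ToyHeap heap;
    ModelPtr p0 = heap.allocate();
    heap.release(p0);
    ModelPtr p1 = heap.allocate();  // reuses the same address
    return {p0.bits == p1.bits,
            heap.load_ok_bits_only(p0),
            heap.load_ok_with_provenance(p0),
            heap.load_ok_with_provenance(p1)};
}
```

In this model p0 == p1 holds on the bits, yet substituting p0 for p1 turns
a valid load into an invalid one under the provenance view, while under the
bits-only view the two pointers are interchangeable.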

> On Tue, Apr 11, 2017 at 5:09 PM, Flamedoge via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I don't know when this was added on cppreference but
>>
>> > The behavior is undefined if after free() returns, an access is made
>> > through the pointer ptr (unless another allocation function happened to
>> > result in a pointer value equal to ptr)
>>
>> This seems to suggest that there is no UB... However, I couldn't find the
>> corresponding line or relevant part on latest C std,
>> http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1570.pdf
>>
>> Regards,
>> Kevin
>>
>> On Tue, Apr 11, 2017 at 4:27 PM, Sanjoy Das <
>> sanjoy at playingwithpointers.com> wrote:
>>
>>> Hi Kevin,
>>>
>>> On April 11, 2017 at 4:14:14 PM, Flamedoge (code.kchoi at gmail.com) wrote:
>>> > So only "non-freed" malloc pointers are No-Alias which makes it
>>> > flow-sensitive. There is no reason why malloc couldn't return
>>> > previously freed location.
>>>
>>> Yes.
>>>
>>> Talking to Nick Lewycky on IRC, I figured out a shorter way of saying
>>> what I wanted to say.  We know that programs like this are UB in C:
>>>
>>> p0 = malloc();
>>> free(p0);
>>> p1 = malloc();
>>> if (p0 == p1) {
>>>   int v = *p0; // Semantically free'ed but bitwise equal to an allocated value
>>> }
>>>
>>> and we relied on them having UB when marking malloc's return value as
>>> noalias.
>>>
>>> However, we can end up in cases like the above by applying
>>> loop-unswitch + GVN to well defined C programs.
>>>
>>> -- Sanjoy
>>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>