[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Wed Jul 11 17:20:52 PDT 2018

Thanks, Richard.

Based on the feedback from this thread, I'll move forward with the
patches for nofree, nosync, adding a new corresponding dereferenceable
attribute (my suggestion is to name this dereferenceable_on_entry;
suggestions welcome), and updating Clang is emit this new attribute
instead of the current one.

 -Hal

On 07/11/2018 06:43 PM, Richard Smith wrote:
> On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
>     [+Richard]
>
>
>     On 07/11/2018 08:29 AM, Sanjoy Das wrote:
>     > I'm not sure if nosynch is sufficient.  What if we had:
>     >
>     > void f(int& x) {
>     >   if (false) {
>     >     int r0 = x;
>     >   }
>     > }
>     >
>     > // other thread
>     > free(<pointer to x>);
>     >
>     > The source program is race free, but LLVM may speculate the read
>     from
>     > x (seeing that it is dereferenceable) creating a race.
>
>     Interestingly, I'm not sure. I trust that Richard can answer this
>     question. :-)
>
>     So, if we had:
>
>     int y = ...;
>     ...
>     f(y);
>
>     then I think that Clang's use of dereferenceable is almost certainly
>     okay (because the standard explicitly says, 9.2.3.2p5, "A reference
>     shall be initialized to refer to a valid object or
>     function."). Because the reference must have been valid when f(y)
>     began
>     executing, unless it synchronizes somehow with the other thread, any
>     asynchronous deletion of y must be a race.
>
>     On the other hand, if we have:
>
>     int &y = ...;
>     ...
>     f(y);
>
>     do we know that, when f(y) begins executing, the reference points to a
>     valid object? My reading of 9.3.3p2, which says, "Argument passing
>     (7.6.1.2) and
>     function value return (8.6.3) are initializations.", combined with the
>     statement above, implies that, perhaps surprisingly, the same holds
>     here. When the argument to f is initialized, it must refer to a valid
>     object (even if the initializer is another reference).
>
>     Richard, what do you think?
>
>
> First, see also core issue 453 <http://wg21.link/cwg453>, under the
> guise of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from
>
>   "A reference shall be initialized to refer to a valid object or
> function."
>
> to something like
>
>   "If an lvalue to which a reference is directly bound designates
> neither an existing object or function of an appropriate type (11.6.3
> [dcl.init.ref]), nor a region of storage of suitable size and
> alignment to contain an object of the reference's type (4.5
> [intro.object], 6.8 [basic.life], 6.9 [basic.types]), the behavior is
> undefined."
>
> My take is that, if the end of the duration of the region of storage
> is unsequenced with respect to the binding of the reference, then
> behavior is undefined. Generally when we refer to a thing happening
> while some condition is true, we mean that the execution point when
> the condition became true is sequenced before the thing happening, and
> the execution point where it becomes not true again is sequenced after.
>
> So the behavior of that program is undefined regardless of whether 'f'
> actually loads through 'x'.
>  
>
>     Thanks again,
>     Hal
>
>     P.S. If I'm right, then I might be happy, but it's also somewhat scary
>     (although we've been doing this optimization for multiple releases
>     and I
>     don't think we have a bug along these lines), and I'd at least
>     smell the
>     need for a sanitizer.
>
>     >
>     > -- Sanjoy
>     > On Tue, Jul 10, 2018 at 7:01 PM Hal Finkel via llvm-dev
>     > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>     >> Hi, everyone,
>     >>
>     >> I'd like to propose adding a nofree function attribute to
>     indicate that
>     >> a function does not, directly or indirectly, call a
>     memory-deallocation
>     >> function (e.g., free, C++'s operator delete). Clang/LLVM can
>     currently
>     >> misoptimize functions that:
>     >>
>     >>  1. Have a reference argument.
>     >>
>     >>  2. Free the memory backing the object to which the reference
>     is bound
>     >> during the function's execution.
>     >>
>     >> Because we tag, in Clang, all reference arguments using the
>     >> dereferenceable attribute, LLVM assumes that the pointer is
>     >> unconditionally dereferenceable throughout the course of the entire
>     >> function. This isn't true, however, if the memory is freed
>     during the
>     >> execution of the function. For more information, please see the
>     >> discussion in https://reviews.llvm.org/D48239.
>     >>
>     >> To solve this problem, we need to give LLVM more information in
>     order to
>     >> help it determine when a pointer, which is dereferenceable when the
>     >> functions begins to execute, will still be dereferenceable
>     later on in
>     >> the function's execution. This nofree attribute can be part of that
>     >> solution. If we know that free (and friends) are not called by the
>     >> function (nor by any function called by the function, and so
>     on), then
>     >> we know that pointers that started out dereferenceable will
>     stay that
>     >> way (except as explained below).
>     >>
>     >> I'm initially proposing this to be only a function attribute,
>     although
>     >> one could easily imagine a parameter attribute as well (that
>     indicates
>     >> that a particular pointer argument is not freed by the
>     function). This
>     >> might be useful, but for the use case of helping
>     dereferenceable, it
>     >> would be subtle to use, unless the parameter was also marked as
>     noalias,
>     >> because you'd need to know that the parameter was not also
>     aliased with
>     >> another argument (or had not been captured). Another analysis
>     would need
>     >> to provide this kind of information.
>     >>
>     >> Also, just because a function does not, directly or indirectly,
>     call
>     >> free does not mean that it cannot cause memory to be
>     deallocated. The
>     >> function might communicate (synchronize) with another thread
>     causing
>     >> that other thread to delete the memory. For this reason, to use
>     >> dereferenceable as we currently do, we also need to know that the
>     >> function does not synchronize with any other threads. To solve this
>     >> problem, like nofree, I propose to add a nosynch attribute (to
>     indicate
>     >> that a function does not use (non-relaxed) atomics or otherwise
>     >> synchronize with any other threads (e.g., perform I/O or, as a
>     practical
>     >> matter, use volatile accesses).
>     >>
>     >> I've posted a patch for the nofree attribute
>     >> (https://reviews.llvm.org/D49165). nosynch's implementation
>     would be
>     >> very similar (except instead of looking for calls to free, it
>     would look
>     >> for uses of non-relaxed atomics, volatile ops, and known
>     functions that
>     >> are not I/O functions).
>     >>
>     >> With both of these attributes (nofree and nosynch), a function
>     argument
>     >> with the dereferenceable attribute will be known to be
>     dereferenceable
>     >> throughout the execution of the attributed function. We can update
>     >> isDereferenceableAndAlignedPointer to include these additional
>     checks on
>     >> the current function.
>     >>
>     >> One more choice we have: We can, as I proposed above,
>     essentially weaken
>     >> the current semantics of dereferenceable to not exclude
>     >> mid-function-execution deallocation. We can also add a second
>     attribute
>     >> with the current, stronger, semantics. We can keep the current
>     attribute
>     >> as-is, and add a second attribute with the weaker semantics
>     (and switch
>     >> Clang to use that).
>     >>
>     >> Please let me know what you think.
>     >>
>     >> Thanks again,
>     >>
>     >> Hal
>     >>
>     >> --
>     >> Hal Finkel
>     >> Lead, Compiler Technology and Programming Languages
>     >> Leadership Computing Facility
>     >> Argonne National Laboratory
>     >>
>     >> _______________________________________________
>     >> LLVM Developers mailing list
>     >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>     -- 
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180711/b4ac5cc2/attachment.html>