[cfe-dev] Can indirect class parameters be noalias?

Richard Smith via cfe-dev cfe-dev at lists.llvm.org
Wed Jul 29 14:56:32 PDT 2020


On Wed, 29 Jul 2020 at 14:42, Richard Smith <richard at metafoo.co.uk> wrote:

> On Wed, 29 Jul 2020 at 12:52, John McCall <rjmccall at apple.com> wrote:
>
>> Clang IRGen currently doesn’t mark indirect parameters as noalias.
>> Considerations:
>>
>>  - A lot of targets don’t pass struct arguments indirectly outside of
>>    C++, but some do, notably AArch64.
>>  - In a pure C world, we would always be able to mark such parameters
>>    noalias, because arguments are r-values and there’s no way to have a
>>    pointer to an r-value.
>>  - ObjC __weak references can have pointers to them from the ObjC
>>    runtime. You can’t pass a weak reference immediately as an argument because
>>    __weak is a qualifier and qualifiers are ignored in calls, but you
>>    can put one in a struct and pass that, and that struct has to be passed
>>    indirectly. Arguably such a parameter cannot be noalias because of
>>    the pointer from the runtime, but then again, ObjC code isn’t allowed to
>>    directly access the weak reference (it has to call the runtime), which
>>    means that no accesses that LLVM can actually see violate the noalias
>>    restriction.
>>  - C++ parameters of non-trivially-copyable class type cannot be marked
>>    noalias: it is absolutely permitted to escape a pointer to this
>>    within a constructor and to replace that pointer whenever the object is
>>    moved. This is both well-defined and sometimes useful.
>>  - It’s actually possible to escape a pointer to *any* C++ object within
>>    its constructor, and that pointer remains valid for the duration of the
>>    object’s lifetime. And you can do this with NRVO, too, so you don’t even
>>    need to have a type with non-trivial constructors, as long as the object
>>    isn’t copied. Note that this even messes up the C case, which is really
>>    unfortunate: arguably we need to pessimize C code because of the
>>    possibility it might interoperate with C++.
>>  - But I think there’s an escape hatch here. C++ has a rule which is
>>    intended to give implementations extra leeway with passing and returning
>>    trivial types, e.g. to pass them in registers. This rule is C++
>>    [class.temporary]p3, which says that implementations can create an extra
>>    temporary object to pass an object of type X as long as “each copy
>>    constructor, move constructor, and destructor of X is either trivial or
>>    deleted, and X has at least one non-deleted copy or move constructor”. This
>>    object is created by (trivially) copy/move-initializing from the
>>    argument/return object. Arguably we can consider any type that satisfies
>>    this condition to be *formally* copied into a new object as part of
>>    passing or returning it. We don’t need to *actually* do the copy, I
>>    think, we just need to consider a copy to have been done in order to
>>    formally disrupt any existing pointers to the object. (Although arguably
>>    you aren’t allowed to copy an object into a new object at the original
>>    object’s current address; it would be an unfortunate consequence of this
>>    wording if we had to either forgo optimization or do an unnecessary copy
>>    here.)
>>
>> Thoughts?
>>
> From a high level: I think the C++ language semantics *should* permit us
> to assume that objects passed by value to functions, and objects returned
> by value from functions (in which category I include *this in a
> constructor), are noalias.
>

... specifically in the case where they're trivially copyable and the
implementation was permitted to make a copy. In the case of non-trivial
copy operations, I think we probably should be forced to assume that the
address of the object may have escaped.
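
For illustration only, here is a rough sketch of the [class.temporary]p3
condition quoted above, written as a compile-time check. The name
may_pass_via_temporary_v and the sample types are made up for this example,
and the standard type traits only approximate the rule: they look at the
constructor that overload resolution picks, not at every copy and move
constructor the class declares. This is not how Clang itself makes the
decision.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (not a Clang or standard-library facility).
// Approximates: "each copy constructor, move constructor, and destructor of X
// is either trivial or deleted, and X has at least one non-deleted copy or
// move constructor".
template <class T>
inline constexpr bool may_pass_via_temporary_v =
    std::is_trivially_destructible_v<T> &&
    (std::is_trivially_copy_constructible_v<T> ||
     !std::is_copy_constructible_v<T>) &&
    (std::is_trivially_move_constructible_v<T> ||
     !std::is_move_constructible_v<T>) &&
    (std::is_copy_constructible_v<T> || std::is_move_constructible_v<T>);

// Plain aggregate: everything is trivial.
struct Pod { int x; char buf[16]; };

// Non-trivial *other* constructor that escapes `this`, but the implicit
// copy/move constructors and the destructor are still trivial.
struct EscapesThis {
    explicit EscapesThis(EscapesThis **where) { *where = this; }
    char data[64];
};

// User-provided copy constructor: not trivial, so no extra temporary allowed.
struct NonTrivialCopy {
    NonTrivialCopy() = default;
    NonTrivialCopy(const NonTrivialCopy &) {}
};

// Copy is deleted, move is trivial: still satisfies the condition.
struct MoveOnly {
    MoveOnly() = default;
    MoveOnly(const MoveOnly &) = delete;
    MoveOnly(MoveOnly &&) = default;
    int x;
};

static_assert(may_pass_via_temporary_v<Pod>);
static_assert(may_pass_via_temporary_v<EscapesThis>);
static_assert(!may_pass_via_temporary_v<NonTrivialCopy>);
static_assert(may_pass_via_temporary_v<MoveOnly>);
static_assert(!may_pass_via_temporary_v<std::string>);
```

Note that EscapesThis passes the check even though its constructor leaks a
pointer to the object, which is exactly the kind of type the example further
down is about.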


> I think concretely, the escape hatch doesn't stop things from going wrong,
> because -- as you note -- even though we *could* have made a copy, it's
> observable whether or not we *did* make a copy. For example:
>
> #include <stdio.h>
>
> struct A {
>     A(A **where) : data{"hello world"} { *where = this; }
>     char data[65536];
> };
> A *p;
>
> [[gnu::noinline]]
> void f(A a) {
>     for (int i = 0; i != sizeof(A::data) - 2; ++i)
>         p->data[i+1] = a.data[i];
>     puts(a.data);
> }
>
> // elsewhere, perhaps compiled by a smarter compiler that doesn't make a
> // copy here
> int main() { f({&p}); }
>
> I think it's valid for this program to print "hello world" or for it to
> print "hhhhhhhhhhhhh...", but it's not valid to (e.g.) turn the copy loop
> into a memcpy with undefined behavior.
>
> As it happens, we do actually make a redundant copy here when performing
> the call to `f`, which seems wasteful. And so do GCC and ICC, which means
> the 'noalias' would actually be correct here considering only the behavior
> of those compilers. So in principle we could address this in the ABI by
> saying that the copy is mandatory. But I don't think we should -- I think
> the above code should have undefined behavior because it accesses a
> function parameter through an access path not derived from the name of the
> function parameter.
>
> We do have some wording in the standard that tries to give aliasing
> guarantees in some of these cases, but does so in a way that's not really
> useful. Specifically, [class.cdtor]p2: "During the construction of an
> object, if the value of the object or any of its subobjects is accessed
> through a glvalue that is not obtained, directly or indirectly, from the
> constructor’s this pointer, the value of the object or subobject thus
> obtained is unspecified." (I mean, thanks for trying, but that's not all
> the cases, and "the value is unspecified" is not enough permission.)
>
> Maybe we could mark such cases as 'noalias', behind a known-non-conforming
> flag. The question would then be whether we enable it by default or not.
>
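
A smaller way to see that the copy is observable, without touching the
parameter through the escaped pointer at all, is to compare addresses. This is
only a sketch: Tracker, escaped, Observation, inspect and run_probe are names
invented for this example, and whether same_object comes out true or false
depends on the ABI and the compiler, so no particular result should be
expected.

```cpp
// Trivially copyable, but its constructor escapes a pointer to the object.
struct Tracker {
    explicit Tracker(Tracker **where) { *where = this; }
    int value = 42;
};

// Receives the address of whichever object the constructor actually ran on.
Tracker *escaped = nullptr;

struct Observation {
    bool same_object;  // did the callee get the originally constructed object?
    int value;         // read only through the parameter's own name
};

Observation inspect(Tracker t) {
    // Only addresses are compared here; nothing is read or written through
    // `escaped`, so this avoids the aliasing question discussed above.
    return Observation{&t == escaped, t.value};
}

Observation run_probe() {
    // The argument is a prvalue, so the implementation may construct it
    // directly in the parameter or, per [class.temporary]p3, in a separate
    // temporary that is then trivially copied.
    return inspect(Tracker{&escaped});
}
```

Calling run_probe() gives value == 42 either way; same_object reports whether
the implementation constructed the argument in place or passed a copy.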