[cfe-dev] Can indirect class parameters be noalias?

Mon Aug 3 18:45:13 PDT 2020

On 31 Jul 2020, at 19:50, Hal Finkel wrote:
> On 7/31/20 5:59 PM, James Y Knight wrote:
>> This discussion reminds me of an example I ran into a couple weeks
>> ago, where the execution of the program is dependent precisely upon
>> whether the ABI calls for the object to be passed indirectly, or in a
>> register.
>>
>> In the case where NRVO is triggered, the class member foo_ is
>> fully-constructed on the first line of CreateFoo (despite appearing
>> as if that's only constructing a local variable). In the case where
>> the struct is small enough to fit in a register, NRVO does not apply,
>> and in that case, foo_ isn't constructed until after CreateFoo
>> returns.
>>
>> Therefore, I believe it's implementation-defined whether the 
>> following program has undefined behavior.
>>
>> https://godbolt.org/z/YT9zsz
>>
>> #include <assert.h>
>>
>> struct Foo {
>>     int x;
>>     // assert fails if you comment out these unused fields!
>>     int dummy[4];
>> };
>>
>> struct Bar {
>>     Bar() : foo_(CreateFoo()) {}
>>
>>     Foo CreateFoo() {
>>         Foo f;
>>         f.x = 55;
>>         assert(foo_.x == 55);
>>         return f;
>>     }
>>     Foo foo_;
>> };
>>
>> int main() {
>>     Bar b;
>> }
>
>
> Looks that way to me too. The example in 11.10.5p2 sort of makes this 
> point as well (by pointing out that you can directly initialize a 
> global this way).

It does seem hard to argue that this is invalid under the specification.
To me it seems like it clearly *ought* to be invalid, though.  Note
that Clang has always emitted return address arguments as `noalias`, so 
this has immediate significance.
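
To make the two cases concrete, here is a rough hand-lowered model of
James's example. It is a sketch, not what Clang actually emits.
`CreateFooLowered`, `ObservedWithSharedSlot`, and `ObservedWithTemporary`
are names invented for illustration, and the explicit `ret` pointer
stands in for the return address argument. `Bar` is zero-initialized
here only so that the read is well-defined in the model.

```cpp
#include <cassert>

struct Foo {
    int x;
    int dummy[4];
};

struct Bar {
    Foo foo_;
};

// Hypothetical hand-lowered form of Bar::CreateFoo(): the result is written
// through an explicit return-slot pointer, and the function reports what it
// saw in self->foo_.x after storing to that slot.
int CreateFooLowered(Bar *self, Foo *ret) {
    ret->x = 55;
    return self->foo_.x;
}

// Models the indirect-return case: the return slot is the member itself,
// so the store through ret is visible through self->foo_.
int ObservedWithSharedSlot() {
    Bar b{};  // zero-initialized (assumption made for this model only)
    return CreateFooLowered(&b, &b.foo_);
}

// Models the in-register case: the result lands in a temporary and is
// copied into the member only after the callee has returned.
int ObservedWithTemporary() {
    Bar b{};
    Foo tmp{};
    int seen = CreateFooLowered(&b, &tmp);
    b.foo_ = tmp;
    return seen;
}
```

In this model `ObservedWithSharedSlot()` sees 55 and
`ObservedWithTemporary()` sees 0. If `ret` were treated as `noalias` in
the first case, the optimizer would be entitled to assume that the store
through `ret` cannot change `self->foo_.x`, which is exactly the
situation the original program sets up.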

If I were writing the specification, I would rewrite the restriction in 
`[class.cdtor]p2` to say that pointers derived by naming a 
returned/constructed object do not formally point to the object until 
the function actually returns, even if the copy is elided.  That would 
make James’s example undefined behavior.
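
For contrast, here is a sketch of a variant that I would expect to stay
well-defined under either the current wording or a rule like the one
above, because it only names `foo_` after `CreateFoo` has returned. The
`observed_` member and the `static` qualifier are additions made for
this illustration and are not part of James's program.

```cpp
#include <cassert>

struct Foo {
    int x;
    int dummy[4];
};

struct Bar {
    // foo_ is read only in the constructor body, after its initializer
    // (the call to CreateFoo) has completed.
    Bar() : foo_(CreateFoo()) { observed_ = foo_.x; }

    // static, so the function has no way to name foo_ while the result
    // object is still being built.
    static Foo CreateFoo() {
        Foo f{};
        f.x = 55;
        return f;
    }

    Foo foo_;
    int observed_;  // hypothetical member recording what the ctor saw
};
```

Here the value read does not depend on whether the ABI returns `Foo`
indirectly or in registers, so `observed_` is 55 either way.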

John.

>
>  -Hal
>
>
>>
>> On Fri, Jul 31, 2020 at 2:27 PM Hal Finkel via cfe-dev 
>> <cfe-dev at lists.llvm.org> wrote:
>>
>>
>>     On 7/31/20 1:24 PM, Hal Finkel wrote:
>>>     On 7/31/20 12:43 PM, John McCall wrote:
>>>>
>>>>     On 31 Jul 2020, at 7:35, Hal Finkel wrote:
>>>>
>>>>         On 7/29/20 9:00 PM, John McCall via cfe-dev wrote:
>>>>
>>>>             On 29 Jul 2020, at 17:42, Richard Smith wrote:
>>>>
>>>>             On Wed, 29 Jul 2020 at 12:52, John McCall
>>>>             <rjmccall at apple.com> wrote:
>>>>
>>>>             ...
>>>>
>>>>             I think concretely, the escape hatch doesn't stop things
>>>>             from going wrong, because -- as you note -- even though
>>>>             we *could* have made a copy, it's observable whether or
>>>>             not we *did* make a copy. For example:
>>>>
>>>>             I would say that it’s observable whether the parameter
>>>>             variable has the same address as the argument. That
>>>>             doesn’t /have/ to be the same question as whether a copy
>>>>             was performed: we could consider there to be a formal
>>>>             copy (or series of copies) that ultimately creates /an/
>>>>             object at the same address, but it’s not the /same/
>>>>             object and so pointers to the old object no longer
>>>>             validly point to it. But I guess that would probably
>>>>             violate the lifetime rules, because it would make
>>>>             accesses through old pointers UB when in fact they
>>>>             should at worst access a valid object that’s just
>>>>             unrelated to the parameter object.
>>>>
>>>>         I think that it would be great to be able to do this, but
>>>>         unfortunately, I think that the point that you raise here is
>>>>         a key issue. Whether or not the copy is performed is visible
>>>>         in the model, and so we can't simply act as though there was
>>>>         a copy when optimizing. Someone could easily have code that
>>>>         looks like:
>>>>
>>>>         Foo DefaultX;
>>>>
>>>>         ...
>>>>
>>>>         void something(Foo &A, Foo &B) {
>>>>
>>>>           if (&A == &B) { ... }
>>>>
>>>>         }
>>>>
>>>>         void bar(Foo X) { something(X, DefaultX); }
>>>>
>>>>     This example isn’t really on point; a call like |bar(DefaultX)|
>>>>     obviously cannot just pass the address of |DefaultX| as a
>>>>     by-value argument without first proving a lot of stuff about how
>>>>     |bar| uses both its parameter and |DefaultX|. I think |noalias|
>>>>     is actually a subset of what would have to be proven there.
>>>>
>>>
>>>     Yes, I apologize. You're right: my pseudo-code missed the point.
>>>     So the record is clear, let me rephrase:
>>>
>>>     Foo *DefaultX = nullptr;
>>>     ...
>>>     Foo::Foo() { if (!DefaultX) DefaultX = this; }
>>>     ...
>>>     void bar(Foo X) { something(X, *DefaultX); }
>>>     ...
>>>     bar(Foo{});
>>>
>>>     I think that's closer to what we're talking about.
>>>
>>>
>>>>     In general, the standard is clear that you cannot rely on
>>>>     escaping a pointer to/into a trivially-copyable pr-value
>>>>     argument prior to the call and then rely on that pointer
>>>>     pointing into the corresponding parameter object.
>>>>     Implementations are /allowed/ to introduce copies. But it does
>>>>     seem like the current wording would allow you to rely on that
>>>>     pointer pointing into /some/ valid object, at least until the
>>>>     end of the caller’s full-expression. That means that, if we
>>>>     don’t guarantee to do an actual copy of the argument, we cannot
>>>>     make it UB to access the parameter variable through pointers to
>>>>     the argument temporary, which is what marking the parameter as
>>>>     |noalias| would do.
>>>>
>>>>     So I guess the remaining questions are:
>>>>
>>>>       * Is this something we can reasonably change in the standard?
>>>>
>>>
>>>     This is the part that I'm unclear about. What change would we
>>>     make?
>>>
>>>
>>
>>     Also, maybe some extended use of the no_unique_address attribute
>>     would help?
>>
>>      -Hal
>>
>>
>>>
>>>>       * Are we comfortable setting |noalias| in C if the only place
>>>>         that would break is with a C++ caller?
>>>>
>>>
>>>     Out of curiosity, if you take C in combination with our
>>>     statement-expression extension implementation
>>>     (https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html), and
>>>     notwithstanding the statement in the GCC manual about returns by
>>>     value (i.e., the part just before where it says, "Therefore the
>>>     this pointer observed by Foo is not the address of a."), is there
>>>     any relationship to this topic?
>>>
>>>     Thanks again,
>>>
>>>     Hal
>>>
>>>
>>>>     John.
>>>>
>>>>         As Richard's example shows, the code doesn't need to
>>>>         explicitly compare the addresses to detect the copy either.
>>>>         Any code that reads/writes to the objects can do it. A
>>>>         perhaps-more-realistic example might be:
>>>>
>>>>           int Cnt = A.RefCnt;
>>>>           ++A.RefCnt;
>>>>           ++B.RefCnt;
>>>>           if (Cnt + 1 != A.RefCnt) { /* same object case */ }
>>>>
>>>>         The best suggestion that I have so far is that we could add
>>>>         an attribute like 'can_copy' indicating that the optimizer
>>>>         can make a formal copy of the argument in the callee and use
>>>>         that instead of the original pointer if that seems useful. I
>>>>         can certainly imagine a transformation such as LICM making
>>>>         use of such a thing (although the cost modeling would
>>>>         probably need to be fairly conservative).
>>>>
>>>>          -Hal
>>>>
>>>>             ...
>>>>
>>>>             John.
>>>>
>>>>
>>>>             _______________________________________________
>>>>             cfe-dev mailing list
>>>>             cfe-dev at lists.llvm.org
>>>>             https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>
>>>>         -- 
>>>>         Hal Finkel
>>>>         Lead, Compiler Technology and Programming Languages
>>>>         Leadership Computing Facility
>>>>         Argonne National Laboratory
>>>>
