[cfe-dev] Can indirect class parameters be noalias?

Mon Aug 3 20:28:38 PDT 2020

On 8/3/20 8:45 PM, John McCall wrote:
>
> On 31 Jul 2020, at 19:50, Hal Finkel wrote:
>
>     On 7/31/20 5:59 PM, James Y Knight wrote:
>
>         This discussion reminds me of an example I ran into a couple
>         weeks ago, where the execution of the program is dependent
>         precisely upon whether the ABI calls for the object to be
>         passed indirectly, or in a register.
>
>         In the case where NRVO is triggered, the class member foo_ is
>         fully constructed on the first line of CreateFoo (despite
>         appearing as if that's only constructing a local variable). In
>         the case where the struct is small enough to fit in a
>         register, NRVO does not apply, and in that case, foo_ isn't
>         constructed until after CreateFoo returns.
>
>         Therefore, I believe it's implementation-defined whether the
>         following program has undefined behavior.
>
>         https://godbolt.org/z/YT9zsz
>
>         #include <assert.h>
>
>         struct Foo {
>             int x;
>             // assert fails if you comment out these unused fields!
>             int dummy[4];
>         };
>
>         struct Bar {
>             Bar() : foo_(CreateFoo()) {}
>
>             Foo CreateFoo() {
>                 Foo f;
>                 f.x = 55;
>                 assert(foo_.x == 55);
>                 return f;
>             }
>             Foo foo_;
>         };
>
>         int main() {
>             Bar b;
>         }
>
>     Looks that way to me too. The example in 11.10.5p2 sort of makes
>     this point as well (by pointing out that you can directly
>     initialize a global this way).
>
> It does seem hard to argue that this is invalid under the 
> specification. To me it seems like it clearly /ought/ to be invalid, 
> though. Note that Clang has always emitted return address arguments as 
> |noalias|, so this has immediate significance.
>
> If I were writing the specification, I would rewrite the restriction 
> in |[class.cdtor]p2| to say that pointers derived by naming a 
> returned/constructed object do not formally point to the object until 
> the function actually returns, even if the copy is elided. That would 
> make James’s example undefined behavior.
>
> John.
>

I agree. It seems like we should be able to make a sanitizer detect this 
kind of mistake as well (although the general case will require some 
msan-like propagation scheme).
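
To make the aliasing itself visible without reading foo_ early, something
like the following could serve as a starting point. It is only a sketch
based on James's example quoted below: the "aliased" member is hypothetical
(it is not in the original program), and whether it ends up true depends on
the ABI and on whether NRVO is actually applied.

```cpp
// Variant of the quoted example that records an address comparison
// instead of reading foo_ before its lifetime has begun.
struct Foo {
    int x;
    int dummy[4];  // padding that makes an indirect return likely on common ABIs
};

struct Bar {
    // Hypothetical flag, declared before foo_ so that it is initialized
    // first and can safely be written from inside CreateFoo.
    bool aliased = false;
    Foo foo_;

    Bar() : foo_(CreateFoo()) {}

    Foo CreateFoo() {
        Foo f{};
        f.x = 55;
        // Only the addresses are compared; foo_ itself is never read here.
        aliased = (&f == &foo_);
        return f;
    }
};
```

Constructing a Bar and then reading "aliased" reports which case the
implementation picked; foo_.x is 55 in either case.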

  -Hal
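
P.S. The RefCnt fragment quoted further down can be written out as a small
self-contained function. This is a sketch only: the RefCounted type and the
SameObject name are made up for illustration, and the point is just that
plain reads and writes are enough to tell one object from two.

```cpp
// Hypothetical type standing in for a Foo that carries a RefCnt member.
struct RefCounted {
    int RefCnt = 0;
};

// Returns true when A and B refer to the same object, using only reads
// and writes of RefCnt (no explicit address comparison).
bool SameObject(RefCounted &A, RefCounted &B) {
    int Cnt = A.RefCnt;
    ++A.RefCnt;
    ++B.RefCnt;
    // If B aliases A, A.RefCnt has been incremented twice by now.
    return Cnt + 1 != A.RefCnt;
}
```

Passing the same object for both arguments returns true; passing two
distinct objects returns false.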

>      -Hal
>
>         On Fri, Jul 31, 2020 at 2:27 PM Hal Finkel via cfe-dev
>         <cfe-dev at lists.llvm.org> wrote:
>
>
>         On 7/31/20 1:24 PM, Hal Finkel wrote:
>
>             On 7/31/20 12:43 PM, John McCall wrote:
>
>                 On 31 Jul 2020, at 7:35, Hal Finkel wrote:
>
>                 On 7/29/20 9:00 PM, John McCall via cfe-dev wrote:
>
>                 On 29 Jul 2020, at 17:42, Richard Smith wrote:
>
>                 On Wed, 29 Jul 2020 at 12:52, John McCall
>                 <rjmccall at apple.com> wrote:
>
>                 ...
>
>                 I think concretely, the escape hatch doesn't stop things
>                 from going wrong, because -- as you note -- even though
>                 we *could* have made a copy, it's observable whether or
>                 not we *did* make a copy. For example:
>
>                 I would say that it’s observable whether the parameter
>                 variable has the same address as the argument. That
>                 doesn’t /have/ to be the same question as whether a copy
>                 was performed: we could consider there to be a formal
>                 copy (or series of copies) that ultimately creates /an/
>                 object at the same address, but it’s not the /same/
>                 object and so pointers to the old object no longer
>                 validly point to it. But I guess that would probably
>                 violate the lifetime rules, because it would make
>                 accesses through old pointers UB when in fact they
>                 should at worst access a valid object that’s just
>                 unrelated to the parameter object.
>
>                 I think that it would be great to be able to do this, but
>                 unfortunately, I think that the point that you raise here
>                 is a key issue. Whether or not the copy is performed is
>                 visible in the model, and so we can't simply act as
>                 though there was a copy when optimizing. Someone could
>                 easily have code that looks like:
>
>                 Foo DefaultX;
>
>                 ...
>
>                 void something(Foo &A, Foo &B) {
>
>                   if (&A == &B) { ... }
>
>                 }
>
>                 void bar(Foo X) { something(X, DefaultX); }
>
>                 This example isn’t really on point; a call like
>                 |bar(DefaultX)| obviously cannot just pass the address
>                 of |DefaultX| as a by-value argument without first
>                 proving a lot of stuff about how |bar| uses both its
>                 parameter and |DefaultX|. I think |noalias| is actually
>                 a subset of what would have to be proven there.
>
>             Yes, I apologize. You're right: my pseudo-code missed the
>             point. So the record is clear, let me rephrase:
>
>             Foo *DefaultX = nullptr;
>             ...
>             Foo::Foo() { if (!DefaultX) DefaultX = this; }
>             ...
>             void bar(Foo X) { something(X, *DefaultX); }
>             ...
>             bar(Foo{});
>
>             I think that's closer to what we're talking about.
>
>                 In general, the standard is clear that you cannot rely on
>                 escaping a pointer to/into a trivially-copyable prvalue
>                 argument prior to the call and then rely on that pointer
>                 pointing into the corresponding parameter object.
>                 Implementations are /allowed/ to introduce copies. But it
>                 does seem like the current wording would allow you to
>                 rely on that pointer pointing into /some/ valid object,
>                 at least until the end of the caller’s full-expression.
>                 That means that, if we don’t guarantee to do an actual
>                 copy of the argument, we cannot make it UB to access the
>                 parameter variable through pointers to the argument
>                 temporary, which is what marking the parameter as
>                 |noalias| would do.
>
>                 So I guess the remaining questions are:
>
>                 * Is this something we can reasonably change in the
>                 standard?
>
>             This is the part that I'm unclear about. What change would
>             we make?
>
>         Also, maybe some extended use of the no_unique_address attribute
>         would help?
>
>          -Hal
>
>                 * Are we comfortable setting |noalias| in C if the only
>                   place that would break is with a C++ caller?
>
>             Out of curiosity, if you take C in combination with our
>             statement-expression extension implementation
>             (https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html),
>             and notwithstanding the statement in the GCC manual about
>             returns by value (i.e., the part just before where it says,
>             "Therefore the this pointer observed by Foo is not the
>             address of a."), is there any relationship to this topic?
>
>             Thanks again,
>
>             Hal
>
>                 John.
>
>                 As Richard's example shows, the code doesn't need to
>                 explicitly compare the addresses to detect the copy
>                 either. Any code that reads/writes to the objects can do
>                 it. A perhaps-more-realistic example might be:
>
>                   int Cnt = A.RefCnt;
>                   ++A.RefCnt;
>                   ++B.RefCnt;
>                   if (Cnt + 1 != A.RefCnt) { /* same object case */ }
>
>                 The best suggestion that I have so far is that we could
>                 add an attribute like 'can_copy' indicating that the
>                 optimizer can make a formal copy of the argument in the
>                 callee and use that instead of the original pointer if
>                 that seems useful. I can certainly imagine a
>                 transformation such as LICM making use of such a thing
>                 (although the cost modeling would probably need to be
>                 fairly conservative).
>
>                  -Hal
>
>                 ...
>
>                 John.
>
>
>                 _______________________________________________
>                 cfe-dev mailing list
>                 cfe-dev at lists.llvm.org
>                 https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>                 -- 
>                 Hal Finkel
>                 Lead, Compiler Technology and Programming Languages
>                 Leadership Computing Facility
>                 Argonne National Laboratory
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
