[PATCH] D73380: [clang] Annotating C++'s `operator new` with more attributes

Sun Jan 26 13:25:33 PST 2020

lebedev.ri marked 3 inline comments as done.
lebedev.ri added inline comments.

================
Comment at: clang/lib/Sema/SemaDecl.cpp:14361
+/// attributes are applied to declarations.
+void Sema::AddKnownFunctionAttributesForReplaceableGlobalAllocationFunction(
+    FunctionDecl *FD) {
----------------
rsmith wrote:
> lebedev.ri wrote:
> > rsmith wrote:
> > > This should all be done by CodeGen, not by injecting source-level attributes.
> > I don't agree, can you explain why this should be done in codegen?
> Generally, it's a design goal for the Clang AST to represent the original program source (http://clang.llvm.org/docs/InternalsManual.html#faithfulness). But I suppose these attributes are all marked "implicit", and it's a good idea to express these hints uniformly to codegen and the static analyzer and so on... OK, I can make peace with doing this here.
> Generally, it's a design goal for the Clang AST to represent the original program source (http://clang.llvm.org/docs/InternalsManual.html#faithfulness).

I fully agree with that goal!

> But I suppose these attributes are all marked "implicit",

Yeah, this was a predefined goal here - they are added implicitly,
therefore in AST we should not pretend that they originate from source code..

> and it's a good idea to express these hints uniformly to codegen and the static analyzer and so on... 

Precisely.

> OK, I can make peace with doing this here.

Okay.

================
Comment at: clang/lib/Sema/SemaDecl.cpp:14414-14432
+  // C++2a [basic.stc.dynamic.allocation]p3:
+  //   For an allocation function [...], the pointer returned on a successful
+  //   call shall represent the address of storage that is aligned as follows:
+  //   (3.2),(3.3) Otherwise, [...] the storage is aligned for any object
+  //               that does not have new-extended alignment [...].
+  //
+  // NOTE: we intentionally always manifest this basic alignment, because it is
----------------
rsmith wrote:
> lebedev.ri wrote:
> > rsmith wrote:
> > > This is incorrect. The pointer returned by `operator new` is only suitably aligned for any object that does not have new-extended alignment **and is of the requested size**. And the pointer returned by `operator new[]` is suitably aligned for any object **that is no larger than the requested size**. (These are both different from the rule for `malloc`, which does behave as you're suggesting here.) For example:
> > > 
> > > Suppose the default new alignment and the largest fundamental alignment are both 16, and we try to allocate 12 bytes. Then:
> > > 
> > >  * `operator new` need only return storage that is 4-byte aligned (because that is the largest alignment that can be required by a type `T` with `sizeof(T) == 12`)
> > >  * `operator new` need only return storage that is 8-byte aligned (because that is the largest alignment that can be required by a type `T` with `sizeof(T) <= 12`)
> > >  * `malloc` must return storage that is 16-byte aligned (because that is the largest fundamental alignment)
> > So essentially, if we can't evaluate the requested byte count as a constant/constant range,
> > we must simply give up here, on the most interesting case of variable allocation size?
> > That is surprisingly extremely pessimizing from C++ :)
> > 
> > 
> The benefit from not requiring all allocations to be padded to a multiple of the largest fundamental alignment is probably worth a lot more than allowing the optimizer to assume a higher alignment for pointers from direct calls to `operator new` with a non-constant argument.
> 
> We can at least look for a constant argument in CodeGen and emit the alignment hint there. (Though it's probably better to do that in the middle-end, since the argument might be reduced to a constant via inlining, especially for uses of `std::allocator` and similar.)
Please correct me if i'm not grasping the pattern correctly:

> Suppose the default new alignment and the largest fundamental alignment are both 16, and we try to allocate 12 bytes. Then:
> * `operator new[]` need only return storage that is 8-byte aligned (because that is the largest alignment that can be required by a type `T` with `sizeof(T) <= 12)`

Because into 12 bytes, the largest fundamental type that fits is `long long int` or `double`,
both with size/alignment of 8; and we don't care about the remaining padding, if any?

> * `operator new` need only return storage that is 4-byte aligned (because that is the largest alignment that can be required by a type `T` with `sizeof(T) == 12)`

"We have exactly 12 bytes. If we treat this as an array, what is the larges type that could be stored into this array, **leaving no padding**?"

> The benefit from not requiring all allocations to be padded to a multiple of the largest fundamental alignment is probably worth a lot more than allowing the optimizer to assume a higher alignment for pointers from direct calls to operator new with a non-constant argument.

Hmm, true.

> We can at least look for a constant argument in CodeGen and emit the alignment hint there.

Yes, that will be better than nothing.

> (Though it's probably better to do that in the middle-end, since the argument might be reduced to a constant via inlining, especially for uses of std::allocator and similar.)

I would //like// to avoid doing that in middle-end, but if all else fails that will have to do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73380/new/

https://reviews.llvm.org/D73380