[PATCH] D135472: [ODRHash] Hash attributes on declarations.

Tue Oct 11 19:47:34 PDT 2022

vsapsai added inline comments.

================
Comment at: clang/include/clang/Basic/DiagnosticASTKinds.td:852
+def err_module_odr_violation_attribute : Error<
+  "%q0 has different definitions in different modules; first difference is "
+  "%select{definition in module '%2'|defined here}1 found"
----------------
erichkeane wrote:
> Wowzer these are tough to read... can you provide a magic-decoder ring for me of some sort?
Guess I went too far with avoiding repetition (was feeling soo smart doing it). I'll go back to the messages with a little bit more repetition but which should be easier to read and understand.

================
Comment at: clang/lib/AST/ODRDiagsEmitter.cpp:159

 bool ODRDiagsEmitter::diagnoseSubMismatchTypedef(
     const NamedDecl *FirstRecord, StringRef FirstModule, StringRef SecondModule,
----------------
aaron.ballman wrote:
> Typedefs can have attributes, so should this be updated as well? (alignment, ext vector types, mode attributes, etc all come to mind)
It should be tested for sure, thanks for pointing it out.

================
Comment at: clang/lib/AST/ODRDiagsEmitter.cpp:302
+    llvm::raw_string_ostream OutputStream(FullText);
+    A->printPretty(OutputStream, Ctx.getPrintingPolicy());
+    return OutputStream.str();
----------------
aaron.ballman wrote:
> Do we want to have more control over the printing policy here? e.g., do we really want to claim an ODR difference if one module's printing policy specifies indentation of 8 and another specifies indentation of 4? Or if one module prints `restrict` while another prints `__restrict`, etc?
`FullAttributeText` is used for diagnostics and not for the mismatch detection, so we shouldn't complain about `restrict/__ restrict` mismatches (`DifferentAlignmentKeywords` tests that `__attribute__((aligned(8)))` and `alignas(8)` are not a mismatch).

We use the same `ASTContext` to print both attributes, so that shouldn't be confusing. Diagnostic can be unstable across clang versions and probably across language modes. But I think that can happen to other diagnostic messages too, so think that's acceptable.

================
Comment at: clang/lib/AST/ODRDiagsEmitter.cpp:311
+           << FirstModule.empty() << FirstModule << (FirstContainer != nullptr)
+           << FirstDecl->getIdentifier() << DiffType;
+  };
----------------
aaron.ballman wrote:
> You can probably drop the `getIdentifier()` here as the diagnostics engine knows how to print named declarations properly already.
Thanks for the hint, will do.

================
Comment at: clang/lib/AST/ODRHash.cpp:479-480
+
+  llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs),
+                [](const Attr *A) { return !A->isImplicit(); });
+}
----------------
erichkeane wrote:
> aaron.ballman wrote:
> > ChuanqiXu wrote:
> > > vsapsai wrote:
> > > > I'm not sure `isImplicit` is the best indicator of attributes to check, so suggestions in this area are welcome. I think we can start strict and relax some of the checks if needed.
> > > > 
> > > > If people have strong opinions that some attributes shouldn't be ignored, we can add them to the tests to avoid regressions. Personally, I believe that alignment and packed attributes should never be silently ignored.
> > > Agreed. I feel `isImplicit` is enough for now.
> > The tricky part is -- sometimes certain attributes add additional implicit attributes and those implicit attributes matter (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380). And some attributes seem to just do the wrong thing entirely: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344
> > 
> > So I think `isImplicit()` is a good approximation, but I'm more wondering what the principle is for whether an attribute should or should not be considered part of the ODR hash. Type attributes, attributes that impact object layout, etc all seem like they should almost certainly be part of ODR hashing. Others are a bit more questionable though.
> > 
> > I think this is something that may need a per-attribute flag in Attr.td so attributes can opt in or out of it because I'm not certain what ODR issues could stem from `[[maybe_unused]]` or `[[deprecated]]` disagreements across module boundaries.
> I don't think 'isImplicit' is particularly good.  I think the idea of auto-adding 'type' attributes and having the 'rest' be analyzed to figure out which are important.
> 
> Alternatively, I wonder if we're better off just adding ALL attributes and seeing what the fallout is. We can later decide when we don't want them to be ODR significant (which, might be OTHERWISE meaningful later!).
One criteria to decide which attributes should be hashed is if they affect IRGen. But that's not exhaustive and I'm not sure how practical it is.

The rule I'm trying to follow right now is if declarations with different attributes can be substituted instead of each other. For example, structs with different memory layout cannot be interchanged, so it is reasonable to reject them.

But maybe we should review attributes on case-by-case basis. For example, for `[[deprecated]]` I think the best for developers is not to complain about it but merge the attributes and then have regular diagnostic about using a deprecated entity.

================
Comment at: clang/lib/AST/ODRHash.cpp:479-480
+
+  llvm::copy_if(D->attrs(), std::back_inserter(HashableAttrs),
+                [](const Attr *A) { return !A->isImplicit(); });
+}
----------------
vsapsai wrote:
> erichkeane wrote:
> > aaron.ballman wrote:
> > > ChuanqiXu wrote:
> > > > vsapsai wrote:
> > > > > I'm not sure `isImplicit` is the best indicator of attributes to check, so suggestions in this area are welcome. I think we can start strict and relax some of the checks if needed.
> > > > > 
> > > > > If people have strong opinions that some attributes shouldn't be ignored, we can add them to the tests to avoid regressions. Personally, I believe that alignment and packed attributes should never be silently ignored.
> > > > Agreed. I feel `isImplicit` is enough for now.
> > > The tricky part is -- sometimes certain attributes add additional implicit attributes and those implicit attributes matter (https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L9380). And some attributes seem to just do the wrong thing entirely: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDeclAttr.cpp#L7344
> > > 
> > > So I think `isImplicit()` is a good approximation, but I'm more wondering what the principle is for whether an attribute should or should not be considered part of the ODR hash. Type attributes, attributes that impact object layout, etc all seem like they should almost certainly be part of ODR hashing. Others are a bit more questionable though.
> > > 
> > > I think this is something that may need a per-attribute flag in Attr.td so attributes can opt in or out of it because I'm not certain what ODR issues could stem from `[[maybe_unused]]` or `[[deprecated]]` disagreements across module boundaries.
> > I don't think 'isImplicit' is particularly good.  I think the idea of auto-adding 'type' attributes and having the 'rest' be analyzed to figure out which are important.
> > 
> > Alternatively, I wonder if we're better off just adding ALL attributes and seeing what the fallout is. We can later decide when we don't want them to be ODR significant (which, might be OTHERWISE meaningful later!).
> One criteria to decide which attributes should be hashed is if they affect IRGen. But that's not exhaustive and I'm not sure how practical it is.
> 
> The rule I'm trying to follow right now is if declarations with different attributes can be substituted instead of each other. For example, structs with different memory layout cannot be interchanged, so it is reasonable to reject them.
> 
> But maybe we should review attributes on case-by-case basis. For example, for `[[deprecated]]` I think the best for developers is not to complain about it but merge the attributes and then have regular diagnostic about using a deprecated entity.
One option was to hash `isa<InheritedAttr>` attributes as they are "sticky" and should be more significant, so shouldn't be ignored. But I don't think this argument is particularly convincing.

At this stage I think we can still afford to add all attributes and exclude some as-needed because modules adoption is still limited. Though I don't have strong feelings about this approach, just getting more restrictive later can be hard.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135472/new/

https://reviews.llvm.org/D135472