[PATCH] D106030: [Clang] add support for error+warning fn attrs

Wed Aug 4 05:45:59 PDT 2021

aaron.ballman added inline comments.

================
Comment at: clang/include/clang/Basic/Attr.td:3819
+  let Args = [StringArgument<"UserDiagnostic">];
+  let Subjects = SubjectList<[Function], ErrorDiag>;
+  let Documentation = [ErrorAttrDocs];
----------------
nickdesaulniers wrote:
> aaron.ballman wrote:
> > nickdesaulniers wrote:
> > > aaron.ballman wrote:
> > > > ObjC methods as well?
> > > I guess I can add them, but I'm unfamiliar with the language. If there's no GNU implementation of an ObjC compiler, do we need to worry about GNU C extensions in ObjC?
> > There's nothing platform-specific about the semantics of these attributes, so GNU or not doesn't really matter. I'm fine holding off on ObjC methods until a user requests support for it, but given how these are generally useful attributes, I think it would make sense to support ObjC up front unless there's a burden from supporting it.
> Looking at the generated IR for ObjC method calls, I don't think we can support them (easily).
> 
> It looks like method calls in ObjC are lowered to direct calls to `objc_msg_lookup`+indirect CALLs, IIUC.  This BackendDiagnostic relies on observing direct calls during instruction selection. (I suspect ObjC supports runtime modification of classes?) So during instruction selection, we can't know that we're making a method call to a particular method (easily; maybe there's helpers for this though?).  We don't seem to remove the indirection even with optimizations enabled (making me think that methods can be swapped at runtime).
> 
> We could try to diagnose in the frontend, but this very very quickly falls apart if there's any control flow involved.  ie. maybe we could diagnose:
> ```
> void foo(void) { dontcall(); }
> ```
> but this whole BackendDiagnostic relies on the backend to run optimizations; the frontend couldn't diagnose:
> ```
> void foo(void) { if (myRuntimeAssertion) dontcall(); }
> ```
> which is the predominate use case for this fn attr.
> 
> That said, the patch as is works for functions (not ObjC methods) in objective-c mode already; would you like a test for that?
> 
> Marking as Done, please reopen this thread if I'm mistaken in this analysis.
> That said, the patch as is works for functions (not ObjC methods) in objective-c mode already; would you like a test for that?

No thanks, I think the existing C coverage handles that.

> Marking as Done, please reopen this thread if I'm mistaken in this analysis.

Thanks for checking this out, I think it's a good reason to say "let's deal with this later (if ever)".

================
Comment at: clang/lib/CodeGen/CodeGenAction.cpp:783
+  else
+    Diags.Report(DiagID) << CalleeName << D.getMsgStr();
+}
----------------
nickdesaulniers wrote:
> aaron.ballman wrote:
> > nickdesaulniers wrote:
> > > nickdesaulniers wrote:
> > > > aaron.ballman wrote:
> > > > > nickdesaulniers wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > I think the usability in this situation is a concern -- only having a function name but no source location information is pretty tricky. Do you have a sense for how often we would hit this case?
> > > > > > I don't think we ever encounter it when building the kernel sources. Perhaps I should make this an assertion on `LocCookie.isValid()` instead?
> > > > > > 
> > > > > > While clang should always attach a `srcloc` MetaData node, other frontend might not, so it's optional.  But this TU is strictly Clang.
> > > > > I thought backend optimization passes would drop source location information with some regularity, so I'm a bit hesitant to go with an assertion unless it turns out I'm wrong there. However, if the backend shouldn't ever lose *this* source location information, then I think it's good to make it an assertion.
> > > > Perhaps debug info, but this SourceLocation is squirreled away in a Metadata node attached to the call site.
> > > > 
> > > > Ah, but via further testing of @arsenm 's point ("what about indirect calls?"), this case is tricky:
> > > > ```
> > > > // clang -O2 foo.c
> > > > __attribute__((error("oh no foo")))
> > > > void foo(void);
> > > > 
> > > > void (*bar)(void);
> > > > 
> > > > void baz(void) {
> > > >   bar = foo;
> > > >   bar();
> > > > }
> > > > ```
> > > > since we did not emit the Metadata node on the call, but then later during optimizations we replaced the call to `bar` with a call to `foo`. At that point, the front end did not emit a Metadata node with source location information, so we can't reconstruct the call site properly for the diagnostic. Hmm...I'll need to think more about this case.
> > > One thing I was thinking of to support @arsenm's case:
> > > 
> > > We probably generally could support diagnosing indirect calls if `-g` was used to emit `DILocation` MD nodes.  I'd need to change this up from a custom `!srcloc` node with one `unsigned` that is a `SourceLocation` to use the the `DILocation` tuple of `line`, `column`, and `scope` then change the frontend's `BackendDiagnosticConsumer` to reinstantiate a `SourceLocation` from those (if possible).
> > > 
> > > The only thing is: is it bad if we can only diagnose with `-g` for indirect calls?
> > > 
> > > Otherwise I don't see how we can support diagnosing indirect calls. GCC can.  We'd perhaps need to emit `!srcloc` or `DILocation` nodes for EVERY indirect call from the frontend, and add logic to merge these or something during optimizations when we turn indirect calls into direct calls.  And I'm sure that would cause lots of churn in the LLVM source tree / in the tests. I would create a parent revision to precommit that.  But I think that's maybe too overkill?
> > > 
> > > The current behavior is that we don't diagnose indirect calls.  I'm happy to add documentation, or create child or parent revisions trying to tackle that, but I think if we can't provide accurate debug info, it's perhaps better not to diagnose at all.
> > > 
> > > I could change that so that we still emit the diagnostic, but don't show _where_ the callsite was in the sources as part of the diagnostic.  But I think it's perhaps better to simply not warn in that case when we can't provide line info.
> > > 
> > > But this can go either way, so I appreciate others' guidance and thoughts here.
> > > Perhaps debug info, but this SourceLocation is squirreled away in a Metadata node attached to the call site.
> > 
> > I thought optimization passes frequently drop metadata nodes?
> > I thought optimization passes frequently drop metadata nodes?
> 
> Perhaps, but I've yet to see Metadata attached to a static `call` get dropped.  This patch emits such Metadata attached to `call` IR Instructions when the frontend sees a static call of callee whose Decl has these GNU C fn attrs.
> 
> Perhaps when we de-indirect an indirect call to a direct call we don't carry over the Metadata (unsure), but it's moot because the frontend didn't know to attach the `dontcall` IR Fn attr because the frontend _cant_ de-indirect the call.
> 
> So it's not a case that de-indirection in the backend drops the Metadata node (though perhaps that's also an issue), moreso that the frontend didn't know to attach the metadata in the first place because it only saw an indirect call.
> 
> We could emit such Metadata for _all_ indirect calls from the frontend (overkill), diagnose only when Debug Info is requested (not great), diagnose but not provide line no of indirect call site (not great), simply not diagnose (not great, but also unlikely IMO for there to be indirect calls of such attributed functions), or maybe something else I'm not thinking of.  The current implementation chooses to not diagnose.  Is that the best choice? Definitely a clothespin vote.  Is there something better?  I was hoping you all would tell me. :)
> The current implementation chooses to not diagnose. Is that the best choice?

I think it's a reasonable choice for the initial implementation. We can improve it later when we have a concrete use case. My primary concern was addressed though -- I'd rather have a false negative than a diagnostic with no source location information. The false negative is unfortunate, but IMO better than a frustrating diagnostic without source location information. The upside to the frustrating diagnostic is that we'd get a bug report about that (it's easier to miss the false negative and file a bug report about that). I suppose one way to address that is to use an assertion before the early return with a comment that the assertion is speculative rather than assertive. So in asserts builds we get a loud report and in other builds, we get the false negatives. WDYT?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106030/new/

https://reviews.llvm.org/D106030