[PATCH] D138939: [WIP][clang] adds a way to provide user-oriented reasons

Christopher Di Bella via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 2 16:44:43 PST 2022


cjdb added a comment.

In D138939#3967397 <https://reviews.llvm.org/D138939#3967397>, @tschuett wrote:

> I do not ask you to do anything! I just noticed that you add a lot of `FormatXXXDiagnostic` functions. An alternativ design is to have one `FormatDiagnostic` function with a mode parameter. Then you can decide whether to print legacy or user-oriented reasons.
>
> If next year you invent another diagnostic, you can extend the enum and the `FormatDiagnostic` function.

Makes sense to me! Thank you for clarifying.

In D138939#3967715 <https://reviews.llvm.org/D138939#3967715>, @dblaikie wrote:

> Sorry if this is derailing, but I wonder/worry about a few things here:
>
> 1. Compounding structured output with phraseology - it seems like it might be worthwhile for these to be separate issues

I agree with you. The reason I've compounded the two is to demonstrate how this is supposed to be used (otherwise it really feels like a nothing burger where I've made change for the sake of change).

> (doesn't mean the terminal output has to say exactly the same thing - as you've mentioned, we could have some fields that aren't rendered in some modes (eg: maybe a terminal only gets the summary, and not the reason - or perhaps you can ask for reasons to be provided if you don't mind the verbosity)). In the example case include here - describing the template parameter kind mismatch seems OK for Clang's current textual diagnostic experience & I don't think that would need to be limited to only the SARIF output?

@rsmith and I spent a fair amount of time chatting about giving Clang a verbose diagnostic mode, and we agreed that would have issues:

1. If you wanted to use unstructured text in brief mode 90% of the time and then change when you need to clarify something, build systems are going to need to rebuild everything.
2. Because scripts almost certainly exist that consume the current diagnostics, I'm reluctant to alter those in any meaningful way. I even note this in the RFC, where I explicitly state that there are no current plans to bin the existing model.

> 2. Could the new phraseology be included in a separate/parallel diagnostic file, essentially akin to a translation? Clang never did get support for multiple languages in its diagnostics, but maybe this is the time to do that work? And then have this new phrasing as the first "translation" of sorts?

Yes and no. For strict a rephrasing, this would be possible, but reasons aren't necessarily going to be just a rephrasing (the one in this patch is not). I intend for the reason section to be a far richer field that contains information that we've never previously provided before: a translation file can't help there. However, using parallel diagnostic files might not be such a bad idea. How do we ensure the two don't get out of sync?

> 3. I'm not sure I'd commit to calling the current diagnostic experience "legacy" just yet &

You mentioned "terminal diagnostics" below. I like that a lot more than the original "unstructured reason" and "traditional reason", so let's go with that for now.

> maybe this is my broadest feedback: It'd be great to avoid a two-system situation with the intent to merge things back in later.

I don't know how we can without breaking the promise of backwards compatibility.

> I think what I'd like to see (though I'm far from the decider in any of this) would be that the existing underlying structured diagnostics system could have a new emission system, that produces SARIF instead of text on a terminal - same underlying structured representation, but providing that more directly (in SARIF/JSON/etc) so that it's not lossy in the conversion to an output format - essentially a machine-readable mode for clang's diagnostic output. (to a file, to the terminal, beside the existing terminal output or instead of it, whatever folks prefer) - this would, ideally, initially require no changes to diagnostic API usage across clang.

I //think// you may be asking for what I mentioned here <https://reviews.llvm.org/D138939#3966973>? That's doable and probably wise: I just need to add `-fdiagnostics-format=sarif-compatibility` (flag spelling to be bikeshed). (@tschuett it seems that this flag may happen regardless of whether you intended for it!)

> Diagnostic quality improvements, like the template parameter kind mismatch can be committed independently as desired.

I'll coordinate with you next week to address my concerns here, but if we //can// achieve this, I'll be really happy. I'll be happier again if we can make it so both formats can have the change too :)

> Pretty much all of these features could be implemented somewhat in parallel, each feature would be of value independently to varying degrees, and we wouldn't end up with a deprecated/legacy/not ready yet situation (the only "not ready yet" part might be adding the new diagnostic phrasings - perhaps initially there'd be some way to fallback to the traditional diagnostics, so that the whole database didn't need to be translated wholesale - and once it's all translated and there's a lot of user feedback on the direction, we could consider changing the diagnostic database default, perhaps eventually deprecating the old database, and eventually removing it - without major API upheaval/renaming/addition/removal)

Mostly agreement with the above wherever possible.



================
Comment at: clang/lib/Basic/DiagnosticIDs.cpp:524
+
+    // FIXME(spaceship): default when C++20 becomes the default
+    bool operator==(const DiagDesc &Other) const {
----------------
dblaikie wrote:
> I don't think LLVM generally uses `(xxx)` in FIXMEs, or if we do it's probably only for bug numbers and maybe developer user names (though the latter's not ideal - people come and go, bug numbers are forever (kinda)) - not sure what the "spaceship" in there is meant to communicate (I'm aware of the C++20 spaceship operator, but I'd expect the thing in the `()` to be some list of known entities that we keep track of somewhere?)
> 
> Maybe `// FIXME: Use spaceship operator in C++20`?
`FIXME(spaceship)` is really just to make it very easy to find. I can make a bug for it.


================
Comment at: clang/lib/Basic/DiagnosticIDs.cpp:812-825
 namespace {
-  struct WarningOption {
-    uint16_t NameOffset;
-    uint16_t Members;
-    uint16_t SubGroups;
-    StringRef Documentation;
-
-    // String is stored with a pascal-style length byte.
-    StringRef getName() const {
-      return StringRef(DiagGroupNames + NameOffset + 1,
-                       DiagGroupNames[NameOffset]);
-    }
-  };
-}
+struct WarningOption {
+  uint16_t NameOffset;
+  uint16_t Members;
+  uint16_t SubGroups;
+  StringRef Documentation;
+
----------------
dblaikie wrote:
> Unrelated/accidental reformatting? (please commit separately if it's being committed)
I do what `git clang-format HEAD~` asks me to do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138939/new/

https://reviews.llvm.org/D138939



More information about the cfe-commits mailing list