[PATCH] D124726: Suggest typoed directives in preprocessor conditionals

Ken Matsui via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat May 7 02:03:41 PDT 2022


ken-matsui added inline comments.


================
Comment at: clang/include/clang/Basic/DiagnosticLexKinds.td:362-365
 
+def warn_pp_typo_directive : Warning<
+  "'#%0' directive not found, did you mean '#%1'?">,
+  InGroup<TypoedDirective>;
----------------
aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > Rather than adding a warning over the top of an existing error, I think we should modify `err_pp_invalid_directive` to have an optional "did you mean?" when we find a plausible typo to correct.
> > > 
> > > However, we do not issue that diagnostic when it's inside of a skipped conditional block, and that's what the crux of https://github.com/llvm/llvm-project/issues/51598 is about. As @rsmith pointed out in that issue (and I agree), compilers are free to support custom directives and those will validly appear inside of a conditional block that is skipped. We need to be careful not to diagnose those kinds of situations as an error. However, we still want to diagnose when the unknown directive is "sufficiently close" to another one which can control the conditional chain. e.g.,
> > > ```
> > > #fi FOO // error unknown directive, did you mean #if?
> > > #endfi // error unknown directive, did you mean #endif?
> > > 
> > > #if FOO
> > > #esle // diag: unknown directive, did you mean #else?
> > > #elfi // diag: unknown directive, did you mean #elif?
> > > #elfidef // diag: unknown directive, did you mean #elifdef
> > > #elinfdef // diag: unknown directive, did you mean #elifndef
> > > 
> > > #frobble // No diagnostic, not close enough to a conditional directive to warrant diagnosing
> > > #eerp // No diagnostic, not close enough to a conditional directive to warrant diagnosing
> > > 
> > > #endif
> > > ```
> > > Today, if `FOO` is defined to a nonzero value, you'll get diagnostics for all of those, but if `FOO` is not defined or is defined to 0, then there's no diagnostics. I think we want to consider directives that are *very likely* to be a typo (edit distance of 1, maybe 2?) for a conditional directive as a special case.
> > > 
> > > Currently, we only diagnose unknown directives as an error. I think for these special cased conditional directive diagnostics, we'll want to use a warning rather than an error in this circumstance (just in case it turns out to be a valid directive in a skipped conditional block). If we do go that route and make it a warning, I think the warning group should be `-Wunknown-directives` to mirror `-Wunknown-pragmas`, `-Wunknown-attributes`, etc and it should be defined to have the same text as the error case. e.g., 
> > > ```
> > > def err_pp_invalid_directive : Error<
> > >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > > >;
> > > def warn_pp_invalid_directive : Warning<
> > >   err_pp_invalid_directive.Text>, InGroup<DiagGroup<"unknown-directives">>;
> > > ```
> > > WDYT?
> > > 
> > > (These were my thoughts before seeing the rest of the patch. After reading the patch, it looks like we have pretty similar ideas here, which is great, but leaving the comment anyway in case you have further opinions.)
> > > Currently, we only diagnose unknown directives as an error. I think for these special cased conditional directive diagnostics, we'll want to use a warning rather than an error in this circumstance (just in case it turns out to be a valid directive in a skipped conditional block). If we do go that route and make it a warning, I think the warning group should be `-Wunknown-directives` to mirror `-Wunknown-pragmas`, `-Wunknown-attributes`, etc and it should be defined to have the same text as the error case. e.g., 
> > > ```
> > > def err_pp_invalid_directive : Error<
> > >   "invalid preprocessing directive%select{|; did you mean '#%1'?}0"
> > > >;
> > > def warn_pp_invalid_directive : Warning<
> > >   err_pp_invalid_directive.Text>, InGroup<DiagGroup<"unknown-directives">>;
> > > ```
> > > WDYT?
> > > 
> > > (These were my thoughts before seeing the rest of the patch. After reading the patch, it looks like we have pretty similar ideas here, which is great, but leaving the comment anyway in case you have further opinions.)
> > 
> > For now, I totally agree with deriving a new warning from `err_pp_invalid_directive`.
> > 
> > However, for future scalability, I think it would be great if we could split those diagnostics into Error & Warning and Help, for example. Rustc does split the diagnostics like the following, and I think this is quite clear. So, a bit apart from this patch, I speculate creating a diagnostic system that can split them would bring Clang diagnostics much more readable.
> > 
> > https://github.com/rust-lang/rust/blob/598d89bf142823b5d84e2eb0f0f9e418ee966a4b/src/test/ui/suggestions/suggest-trait-items.stderr
> > 
> > However, for future scalability, I think it would be great if we could split those diagnostics into Error & Warning and Help, for example. Rustc does split the diagnostics like the following, and I think this is quite clear. So, a bit apart from this patch, I speculate creating a diagnostic system that can split them would bring Clang diagnostics much more readable.
> 
> CC @cjdb for awareness. (He's working on a project for improving the way in which we communicate diagnostics to users.)
Thank you for CCing him!

>>! In D124726#3497284, @cjdb wrote:
> Thanks for working on this! 

@cjdb 

You’re welcome!

I currently think rustc's diagnostics like the following are clear to users rather than attaching a help to a warning like `unknown directive found, did you mean #Hoge?`.

https://github.com/rust-lang/rust/blob/598d89bf142823b5d84e2eb0f0f9e418ee966a4b/src/test/ui/suggestions/suggest-trait-items.stderr

So, I think it would be better if we could implement diagnostics to be able to emit info like rustc does.
I also implemented a feature to show line numbers in diagnostics, and it may be under review.

https://reviews.llvm.org/D125078

Could I have your opinion here?


================
Comment at: clang/lib/Lex/PPDirectives.cpp:441-449
+                                          const SourceLocation &endLoc) {
+  const std::array<std::string, 8> candidates{
+      "if", "ifdef", "ifndef", "elif", "elifdef", "elifndef", "else", "endif"};
+
+  if (const auto sugg =
+          tooling::levdistance::findSimilarStr(candidates, Directive)) {
+    CharSourceRange DirectiveRange =
----------------
aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > Mostly just cleaning up for coding conventions, but also, no need to use a `std::array` and we typically don't use local top-level `const` qualification.
> > Thank you!
> > 
> > Just wondering, but is there any reason not to use the `const` qualifier?
> > Just wondering, but is there any reason not to use the const qualifier?
> 
> We have some quite inconsistent const correctness in our interfaces, so declaring local objects as const sometimes requires viral changes to make things compile. (I think we've mostly excised the uses of `const_cast` that would cast away constness in situations which weren't guaranteed to be safe in.)
Oh, I see thank you. This is a too difficult problem when working with many contributors though...


================
Comment at: clang/test/Preprocessor/suggest-typoed-directive.c:10
+// expected-warning at +11 {{'#elfidef' directive not found, did you mean '#elifdef'?}}
+// expected-warning at +11 {{'#elfindef' directive not found, did you mean '#elifdef'?}}
+// expected-warning at +11 {{'#elsi' directive not found, did you mean '#else'?}}
----------------
aaron.ballman wrote:
> ken-matsui wrote:
> > aaron.ballman wrote:
> > > It's interesting that this one suggested `#elifdef` instead of `#elifndef` -- is there anything that can be done for that?
> > > 
> > > Also, one somewhat interesting question is whether we want to recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives only exist in the latest language standard, but Clang supports them as a conforming extension in all language modes. Given that this diagnostic is about typos, I think I'm okay suggesting the directives even in older language modes. That's as likely to be a correct suggestion as not, IMO.
> > > It's interesting that this one suggested `#elifdef` instead of `#elifndef` -- is there anything that can be done for that?
> > 
> > I found I have to use `std::min_element` instead of `std::max_element` because we are finding the nearest (most minimum distance) string. Meanwhile, `#elfindef` has 2 distance with both `#elifdef` and `#elifndef`. After replacing `std::max_element` with `std::min_element`, I could suggest `#elifndef` from `#elfinndef`.
> > 
> > > Also, one somewhat interesting question is whether we want to recommend `#elifdef` and `#elifndef` outside of C2x/C++2b mode. Those directives only exist in the latest language standard, but Clang supports them as a conforming extension in all language modes. Given that this diagnostic is about typos, I think I'm okay suggesting the directives even in older language modes. That's as likely to be a correct suggestion as not, IMO.
> > 
> > I agree with you because Clang implements those directives, and the suggested code will also be compilable. I personally think only not compilable suggestions should be avoided. (Or, we might place additional info for outside of C2x/C++2b mode like `this is a C2x/C++2b feature but compilable on Clang`?)
> > 
> > ---
> > 
> > According to the algorithm of `std::min_element`, we only get an iterator of the first element even if there is another element that has the same distance. So, `#elfindef` only suggests `#elifdef` in accordance with the order of `Candidates`, and I don't think it is beautiful to depend on the order of candidates. I would say that we can suggest all the same distance like the following, but I'm not sure this is preferable:
> > 
> > ```
> > #elfindef // diag: unknown directive, did you mean #elifdef or #elifndef?
> > ```
> > 
> > I agree with you because Clang implements those directives, and the suggested code will also be compilable. I personally think only not compilable suggestions should be avoided. (Or, we might place additional info for outside of C2x/C++2b mode like this is a C2x/C++2b feature but compilable on Clang?)
> 
> I may be changing my mind on this a bit. I now see we don't issue an extension warning when using `#elifdef` or `#elifndef` in older language modes. That means suggesting those will be correct but only for Clang, and the user won't have any other diagnostics to tell them about the portability issue. And those particular macros are reasonably likely to be used in a header where the user is trying to aim for portability. So I'm starting to think we should only suggest those two in C2x mode (and we should probably add a portability warning for #elifdef and #elifndef, so I filed: https://github.com/llvm/llvm-project/issues/55306)
> 
> > I would say that we can suggest all the same distance like the following, but I'm not sure this is preferable:
> 
> The way we typically handle this is to attach FixIt hints to a note instead of to the diagnostic. This way, automatic fixes aren't applied (because there are multiple choices to pick from) but the user is still able to apply whichever fix they want in an IDE or other tool. It might be worth trying that approach (e.g., if there's only one candidate, attach it to the warning, and if there are two or more, emit a warning without a "did you mean" in it and use a new note for the fixit. e.g.,
> ```
> #elfindef // expected-warning {{invalid preprocessing directive}} \
>              expected-note {{did you mean '#elifdef'?}} \
>              expected-note {{did you mean '#elifndef'?}}
> ```
> WDYT?
> I may be changing my mind on this a bit. I now see we don't issue an extension warning when using `#elifdef` or `#elifndef` in older language modes. That means suggesting those will be correct but only for Clang, and the user won't have any other diagnostics to tell them about the portability issue. And those particular macros are reasonably likely to be used in a header where the user is trying to aim for portability. So I'm starting to think we should only suggest those two in C2x mode (and we should probably add a portability warning for #elifdef and #elifndef, so I filed: https://github.com/llvm/llvm-project/issues/55306)
>

Certainly, it would be less confusing to the user to avoid suggestions regarding those two.
I'm going to fix my code to avoid suggesting them in not C2x mode.

Thank you for submitting the issue, I also would like to work on it.

> The way we typically handle this is to attach FixIt hints to a note instead of to the diagnostic. This way, automatic fixes aren't applied (because there are multiple choices to pick from) but the user is still able to apply whichever fix they want in an IDE or other tool. It might be worth trying that approach (e.g., if there's only one candidate, attach it to the warning, and if there are two or more, emit a warning without a "did you mean" in it and use a new note for the fixit. e.g.,
> ```
> #elfindef // expected-warning {{invalid preprocessing directive}} \
>              expected-note {{did you mean '#elifdef'?}} \
>              expected-note {{did you mean '#elifndef'?}}
> ```
> WDYT?

This is really cool, but don't you care about the redundancy of `did you mean` in terms of the llvm team culture?
If not, I will implement notes like the above.


================
Comment at: llvm/include/llvm/ADT/StringRef.h:267
+      for (StringRef C : Candidates) {
+        size_t CurDist = edit_distance(C, false);
+        if (CurDist <= MaxDist) {
----------------
It seems my previous comment that I posted on deleted file disappeared.

Could you please check the following link comment?

https://reviews.llvm.org/D124726#3493222


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124726/new/

https://reviews.llvm.org/D124726



More information about the cfe-commits mailing list