[PATCH] D104601: [Preprocessor] Implement -fnormalize-whitespace.

Michael Kruse via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Jul 18 22:05:19 PDT 2021


Meinersbur marked an inline comment as done.
Meinersbur added a comment.

In D104601#2877855 <https://reviews.llvm.org/D104601#2877855>, @aaron.ballman wrote:

> In D104601#2848366 <https://reviews.llvm.org/D104601#2848366>, @Meinersbur wrote:
>
>> In D104601#2847400 <https://reviews.llvm.org/D104601#2847400>, @aaron.ballman wrote:
>>
>>> ... would add that it's very common for implementers to ask developers to run their code through `-E` mode to submit preprocessed output in order to reproduce a customer issue with the compiler, and I worry that uses of this flag will have unintended consequences in that scenario.
>>
>> Why would one add `-fnormalize-whitespace` for this use case?
>
> I don't think they'd *add* it, I worry it's already used by their build system for reasons unknown and the developer replicates it when reporting the bug because they're not looking at what flags can be removed.

I made the flag an error if used without `-E`, so the build system cannot add it.

Flags such as `-P`, `-fuse-line-directives`, `-frewrite-imports` and `-frewrite-includes` are also silently ignored without `-E`. This requirement seems exclusive for `-felide-unnecessary-whitespace`. It also suggests that it is not a problematic scenario.

>>> The "very long line" example mentioned by @dblaikie is sort of along these lines (sorry for the bad pun).
>>
>> I can add forced line breaks (after/before 80 cols? configurable?) if requested.
>
> I don't think that's necessary just yet. If there's an issue in practice, it seems like we could address it once we have the real world use in front of us. However, It'd help my confidence if you were able to run this functionality over a large corpus of code to see if any issues do pop out, if that's plausible for you.

I ran it (using ccache) over llvm-project and the llvm-test-suite. Is there another large corpus do you have in mind?

> I'd like to see it diagnosed when used outside of a preprocessor invocation because it really serves no purpose there.

I made the flag an error if not used without `-E`.

> I also think we should rename it because "normalize whitespace" can be ambiguous depending on what context you're reading the words in.

Changed to your suggested `-felide-unnecessary-whitespace`, although I think name will not allow use to insert whitespace e.g. for keeping the line length down. If that's not required, I would prefer `-fminimize-whitespace` over -felide-unnecessary-whitespace`.



================
Comment at: clang/docs/ClangCommandLineReference.rst:2484
+-P option to normlize whitespace such that two files with only formatted
+changes are equal.
+
----------------
aaron.ballman wrote:
> Should this option be ignored when not performing a preprocessing action (and documented as such)?
I changed it to an error if used without `-E`.


================
Comment at: clang/include/clang/Driver/Options.td:1789
   PosFlag<SetTrue, [CC1Option], "Use #line in preprocessed output">, NegFlag<SetFalse>>;
+defm normalize_whitespace : BoolFOption<"normalize-whitespace",
+  PreprocessorOutputOpts<"NormalizeWhitespace">, DefaultFalse,
----------------
aaron.ballman wrote:
> I think I'd feel most comfortable if this didn't use "normalize" in the name. When I hear that, I think it's going to be doing something like converting all (perhaps funky Unicode) notions of whitespace into ASCII space characters. But this option doesn't actually do that -- it's removing as much whitespace from the preprocessed output as possible. Would it be better named as `-felide-unnecessary-whitespace` or something along those lines?
I don't think it is ambiguous what whitespace means for the preprocessor. Currently, `-E` already converts into `'\n'` and `'\n'` only.

I don't like the suggested name but still applied it to this patch. I'd prefer `-fminimize-whitespace`. However, the name would be wrong if we add additional whitespace such as line-breaks to avoid the long line problem, of we unconditionally always emit a space between tokens to not having to call `AvoidConcat` which would still serve the same goal.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104601/new/

https://reviews.llvm.org/D104601



More information about the cfe-commits mailing list