[PATCH] D35955: clang-format: Add preprocessor directive indentation

Wed Aug 16 01:20:44 PDT 2017

djasper added inline comments.

================
Comment at: unittests/Format/FormatTest.cpp:2281

 TEST_F(FormatTest, LayoutMacroDefinitionsStatementsSpanningBlocks) {
   verifyFormat("#define A \\\n"
----------------
mzeren-vmw wrote:
> mzeren-vmw wrote:
> > Experimenting with the patch locally I found that comment indentation is broken in some cases. Please add tests that cover comments. For example:
> > 
> > comment indented at code level as expected:
> >   void f() {
> >   #if 0
> >     // Comment
> >     code();
> >   #  define FOO 0
> >   #endif
> >   }
> > comment not indented at code level when there's a guard:
> >   #ifndef _SOMEFILE_H
> >   #define _SOMEFILE_H
> >   void f() {
> >   #if 0
> >   // Comment
> >     code();
> >   #  define FOO 0
> >   #endif
> >   }
> >   #endif
> > 
> > The `#define FOO 0` is required to trigger this behavior.
> Erik spent some time investigating issues with comment indentation. A couple of insights so far:
> 
> a) We need tests for wrapped macros (with trailing \ continuations). These need to include embedded comments.
> 
> b) It would be most natural to align comments with the following non-comment line. So a comment may be indented at "preprocessor level" or "code level". If indented at preprocessor level we have to handle the extra space introduced by the leading hash. This may require an extra bit per line to indicate whether the "aligned-to" line is preprocessor or code.
"#if 0" specifically might not be a good candidate for a test case as that should be treated like a comment. We should not try to format anything inside a "#if 0" and I'd be surprised if clang-format touched that at all. "#if A" or something is more appropriate.

================
Comment at: unittests/Format/FormatTest.cpp:2322-2331
+  // Test with include guards.
+  EXPECT_EQ("#ifndef _SOMEFILE_H\n"
+            "#define _SOMEFILE_H\n"
+            "code();\n"
+            "#endif",
+            format("#ifndef _SOMEFILE_H\n"
+                   "#define _SOMEFILE_H\n"
----------------
mzeren-vmw wrote:
> A DRY-er way to say this is:
>   auto WithGuard = "#ifndef _SOMEFILE_H\n"
>                    "#define _SOMEFILE_H\n"
>                    "code();\n"
>                    "#endif";
>   ASSERT_EQ(WithGuards, format(WithGuards, Style));
> 
I think _A... is actually not an identifier that is ok to use. Remove the leading "_".

================
Comment at: unittests/Format/FormatTest.cpp:2334
+  // after #ifndef.
+  EXPECT_EQ("#ifndef NOT_GUARD\n"
+            "#  define FOO\n"
----------------
mzeren-vmw wrote:
> 
> It would be nice to have a comment that explains why you cannot use verifyFormat.
Or even have a separate function that messes this up in some other way, not altering line breaks so that we can keep these tests short.

================
Comment at: unittests/Format/FormatTest.cpp:2405-2408
+  // Defect: We currently do not deal with cases where legitimate lines may be
+  // outside an include guard. Examples are #pragma once and
+  // #pragma GCC diagnostic, or anything else that does not change the meaning
+  // of the file if it's included multiple times.
----------------
euhlmann wrote:
> mzeren-vmw wrote:
> > We need to handle comments (like copyrights) before the include guard. There is an analogous problem with trailing blank lines, or trailing comments.
> > 
> > I think we need a small state machine: Init, Start, Open, Closed, NotGuard (terminal). `FoundIncludeGuardStart` should change from a bool to an enum to track this state. `PPMaybeIncludeGuard` can then drop it's "mabye". Whether or not it is null depends directly on the state. It does not determine state itself. While parsing, each line updates the state. If we get to `eof` in the Closed state then we have detected an include guard. ... Or similar logic....
> > 
> > Note that support for comments before the guard opens the door to support for #pragma once. It is tempting to add that, but that is a slippery slope. I would recommend leaving that as a defect that can be addressed later.
> @djasper @klimek
> Do you have any comments on this? I've begun to implement an enum/state machine but I'd like to know if you'd like me to implement a different way.
Most of the work here is done in the UnwrappedLineParser, which should skip over comments anyway. So yes, we should allow for these comments, but I don't think the logic needs to get much more complex because of that. I strongly doubt that a state machine is the right mechanism here.

And generally (without specific action to derive from that), I think we should try to err on the side of incorrectly finding header guards instead of incorrectly not recognizing a header guard. I think a not-indented #define somewhere is better than an incorrectly indented header guard.

https://reviews.llvm.org/D35955