[cfe-commits] [PATCH] -rewrite-includes

David Blaikie dblaikie at gmail.com
Wed Jun 6 11:53:45 PDT 2012


On Tue, Jun 5, 2012 at 4:54 PM, Matt Beaumont-Gay <matthewbg at google.com> wrote:
> Looks good, just a few style nits:
>
> +/// Count the raw \n characters in the \p Len characters from \p Pos.
>
> Does Doxygen do the right thing with "\n"?

Looks like it doesn't. We had one instances of this already & it does
render incorrectly in doxygen (
http://clang.llvm.org/doxygen/Lexer_8cpp.html#a6063f41740159ecf3e621f8cb4a86332
- that should read "(is either \n or \r)") & what I could find says
that it can be escaped with either "\\n" or "@\n" - I chose the former
& I'll fix Lexer.cpp (r158091) as well as this instance in
InclusionRewriter.cpp

> +  if (Pos+1 < FromFile.getBufferEnd() && Pos[1] == '\r')
>
> Whitespace around binary operators (and below).

Added.

> +  Line += CountNewLines(FromFile.getBufferStart() + WriteFrom,
> +    WriteTo - WriteFrom);
>
> Prevailing style is to indent the line continuation to the opening
> paren of the function call.

Indented

> +  const char *FileName = FromFile.getBufferIdentifier();
>
> Trailing whitespace.

Removed

> +  // `1' indicates the start of a new file
>
> The backticks in these comments keep throwing me off... maybe here
> (which is a ways below the link to the GNU preprocessor docs), add
> "Per the GNU docs:" or such.

Added "Per the GNU docs: " and changed the backticks to normal quotation marks.

> +            // keep the directive in, commented out
>
> Per the style guide: "When writing comments, write them as English
> prose, which means they should use proper capitalization, punctuation,
> etc."

Removed the comment as the code seemed self explanatory (The statement
"Comment out directive." was entirely redundant with the name of the
function and it seems clean that commenting it out doesn't remove it
entirely, just leaves it commented out).

Committed as r158093.

Thanks,
- David

>
> On Thu, May 17, 2012 at 1:12 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> I've updated this patch based on the feedback as follows:
>>
>> On Tue, Apr 3, 2012 at 5:32 PM, Matt Beaumont-Gay <matthewbg at google.com> wrote:
>>> Looking better.
>>>
>>> On Sat, Mar 31, 2012 at 14:32, Lubos Lunak <l.lunak at suse.cz> wrote:
>>>>> >  Clang does not even bother to perform repeated inclusion of the same
>>>>> > file if it somehow detects it's unnecessary, but the avoided
>>>>> > FileChanged() callback would mean -rewrite-includes would otherwise leave
>>>>> > some #include directives in the result that would be performed the next
>>>>> > time.
>>>>>
>>>>> Can you use the FileSkipped() callback to make this any more efficient?
>>>>
>>>> No. If you are asking about the quoted part above, then it's not about
>>>> efficiency, it's about correct/incorrect. If it was a question in general,
>>>> then it's more efficient (and probably also correct, given the action is
>>>> supposed to rewrite includes) to just comment out everything instead of
>>>> figuring out exactly what needs it or not.
>>>
>>> So, any #include directive should result in either a FileChanged() or
>>> FileSkipped() callback. Build up a vector with a record of what
>>> happened for each #include. Then, when you re-lex in Process(), just
>>> keep an iterator into the vector, and increment it every time you see
>>> another #include.
>>
>> I went a slightly different route, though I'm not wedded to it & could
>> be persuaded in favor of this.
>>
>> I took advantage of the FileSkipped callback to avoid what I think was
>> some undefined behavior (using the 'Id' and 'FileType' members of a
>> FileChange that had never been initialized because the FileChanged
>> callback never occured - this seemed to get lucky enough & bail out in
>> the validity check at the start of RewriteIncludes::Process, which
>> I've since replaced with an assert) - by removing the 'pending' entry
>> from the map whenever a FileSkipped callback is received.
>>
>>>>> A few new comments, in addition to the general style issues which are
>>>>> still unaddressed (doxygen, whitespace, etc):
>>>>
>>>>  I have already converted function/type comments into doxygen comments, and
>>>> there's no point in converting code comments. If you meant to say something
>>>> else with "Doxygen comments throughout", you didn't actually say it.
>>>
>>> Functions and methods should have doxyments explaining the purpose of
>>> the function and the meaning of each parameter. Class data members
>>> should also have (brief) doxyments.
>>
>> Doxycomments added.
>>
>>> A few more things, mostly nits:
>>>
>>> Throughout, be consistent about whether the * or & is next to the type
>>> or the variable. Prevalent style in Clang is to put it next to the
>>> variable.
>>
>> Consistified * and & as per LLVM coding style.
>>
>>> Comments should begin with a capital letter (and if they're complete
>>> sentences, end with a period.)
>>>
>>>  bool Process(FileID FileId, SrcMgr::CharacteristicKind Type);
>>>
>>> 'Type' is already pretty overloaded in Clang; probably should call
>>> this 'FileType.'
>>
>> Renamed 'Type' to 'FileType'.
>>
>>> ::IncludeRewriter::IncludeRewriter(Preprocessor &pp, raw_ostream &os,
>>>  bool lineMarkers)
>>>  : PP(pp), SM(PP.getSourceManager()), OS(os),
>>>  DisableLineMarkers(lineMarkers), LastInsertedFileChange(0) {
>>>
>>> The constructor parameter should probably be called
>>> 'DisableLineMarkers' so it's obvious that passing 'true' does not mean
>>> "yes, I want line markers."
>>
>> Corrected (& renamed to ShowLineMarkers (& actually implemented the
>> semantics - the flag was passed but unused) so as not to needlessly
>> invert the name/logic compared to the existing
>> PreprocessorOutputOptions) and added a test case for this flag.
>>
>>> Also, the continuation of the parameter list should be indented to the
>>> opening paren, similarly to most of the other method definitions;
>>> likewise for argument lists on function calls throughout. And, no need
>>> for the leading '::' here.
>>>
>>>      OS << " 3";
>>>
>>> These magic strings should have comments; bonus points for references
>>> to authoritative documentation.
>>
>> Commented with quotes and references to authoritative documentation.
>>
>>>
>>>  if (Reason == EnterFile) {
>>>
>>> Invert the condition and return early:
>>> http://llvm.org/docs/CodingStandards.html#hl_earlyexit
>>
>> Inverted the condition and returned early.
>>
>>>
>>>        FileChange& Ref = FileChanges[LastInsertedFileChange];
>>>
>>> Name this variable 'Change' or such.
>>
>> Variable removed in favor of alternative implementation.
>>
>>>
>>>  FileChangeMap ::const_iterator Find = FileChanges.find(Loc.getRawEncoding());
>>>
>>> 'I' (or 'Result' if you're feeling verbose).
>>
>> Variable renamed to 'I'.
>>
>>>
>>>  if(Find != FileChanges.end())
>>>
>>> Space between 'if' and '(':
>>> http://llvm.org/docs/CodingStandards.html#micro_spaceparen
>>
>> Spaces added before condition '(' across all uses.
>>
>>> /// Copies next yet written file content up (and not including writeEnd).
>>>
>>> This could probably be worded a little more clearly. Maybe "Writes
>>> bytes from FromFile, starting at \p NextToWrite and ending at \p
>>> WriteEnd - 1."
>>
>> Rephrased as suggested (& some renaming too).
>>
>>> void IncludeRewriter::OutputContentUpTo(unsigned WriteEnd,
>>>                                        unsigned& NextToWrite,
>>>                                        int& Lines, const char* EOL,
>>>                                        const MemoryBuffer* FromFile,
>>>                                        bool EnsureNewline) {
>>>
>>> Input parameters should come before output parameters.
>>
>> Not entirely possible (EnsureNewLine is an input parameter but has a
>> default argument) nor necessarily desirable (the begin/end range
>> described by "WriteFrom" and "WriteTo" makes sense together) & I don't
>> think we really have any policy (or consistency) on this in the
>> codebase - though I don't mind the idea as a very light guidance when
>> all else is equal.
>>
>>>  int Lines = 1; // current input file line number
>>>
>>> I'd call this just 'Line'.
>>
>> Renamed to 'Line'.
>>
>>>            // (clang sometimes optimizes and does not repeatedly include some
>>>            // files even though it should, so all includes need to be
>>>            // commented, otherwise valid directives would be left in)
>>>
>>> To be more precise: "Clang skips repeated includes of headers which
>>> have a multiple-inclusion guard macro, so..."
>>
>> Dropped the comment here, though the behavior is described elsewhere
>> (see the doxycomment for InclusionRewriter::FileSkipped, for example),
>> as this seems fairly self explanatory now. Added a little bit of this
>> comment down
>>
>>>      RawLex.setParsingPreprocessorDirective(false);
>>>      RawLex.SetCommentRetentionState(false);
>>>
>>> Why do we need to re-disable the comment retention state here?
>>
>> No reason that I could divine. Removed.
>>
>>> +void RewriteIncludesInInput(Preprocessor &PP, raw_ostream *OS, const
>>> PreprocessorOutputOptions &Opts);
>>>
>>> 80 columns
>>
>> Fixed.
>>
>>> +class Preprocessor::ResetMacroExpansionHelper
>>> +{
>>>
>>> Brace should be on the same line as the class name
>>
>> Fixed.
>>
>>> +  RewriteIncludesInInput(CI.getPreprocessor(), OS,
>>> CI.getPreprocessorOutputOpts());
>>>
>>> 80 columns
>>
>> Fixed.
>>
>>> +// RUN: %clang_cc1 -verify -rewrite-includes -DFIRST -I %S/Inputs %s -o %t
>>> +// RUN: cat %t | FileCheck -strict-whitespace %s
>>>
>>> No need to involve a tempfile here, just pass '-' for the clang output
>>> file and pipe directly to FileCheck. (And don't worry about making it
>>> fit in 80 columns.)
>>
>> Done.
>>
>>> +// CHECK: {{^}}// STARTCOMPARE{{$}}
>>> +// CHECK-NEXT: {{^}}#define A(a,b) a ## b{{$}}
>>>
>>> It might be clearer and more maintainable to interleave these
>>> CHECK-NEXTs with the actual input above. I don't feel strongly about
>>> this, though.
>>
>> Interleaving will get a bit confusing given the current way this test
>> is written to use CHECK-NEXTs - since the comments containing the
>> checks will show up in the preprocessed output & thus get in the way
>> of the checks.
>> I'm wondering whether it'd just be clearer to do an exact match
>> against a separate file, really...
>>
>>> Here's another test case (to be #include'd twice):
>>> #ifndef REWRITE_INCLUDES_7
>>> #define REWRITE_INCLUDES_7
>>> included_line7
>>> #endif
>>
>> Added.
>>
>> & I also renamed the IncludeRewriter to InclusionRewriter (since the
>> former could still be confused as a verb phrase) and renamed the
>> source file it was in to match (lib/Rewrite/InclusionRewriter.cpp)
>>
>> Thanks,
>> - David




More information about the cfe-commits mailing list