[cfe-commits] [PATCH] -rewrite-includes

Mon Mar 26 18:45:11 PDT 2012

On Fri, Mar 23, 2012 at 02:08, Lubos Lunak <l.lunak at suse.cz> wrote:
> On Friday 23 of March 2012, Matt Beaumont-Gay wrote:
>> A few general comments:
>
>> - No extra whitespace inside delimiters -- "if (foo)", not "if ( foo
>> )", and similarly for [] and <>.
>> - Get to know the naming rules in the style guide, and adjust
>> capitalization accordingly:
>> http://llvm.org/docs/CodingStandards.html#ll_naming
>
>  The thing is, I do know about that page and I tried to follow it, so
> apparently I can't do it on my own (and the fact that the codebase is
> inconsistent on this anyway and I find this style poorly readable is not
> helping).

>From the doc: "There are some conventions that are not uniformly
followed in the code base (e.g. the naming convention). This is
because they are relatively new, and a lot of code was written before
they were put in place." Sad but true.

Disregarding the state of any old code, newly added code should be
self-consistent.

> What is the command to automatically format the file to fit your
> requirements?

Some day we'll have such a thing... we actually have people working on
the infrastructure for such tools (see "[PATCH] Implements support to
run standalone tools" on cfe-commits).

>
>>   std::list< FileChange > FileChanges;
>>
>> Prefer std::vector -- no reason to use std::list here.
>
>  I've meanwhile confirmed that it should be actually std::map for noticeably
> better performance.

Does a DenseMap (#include "llvm/ADT/DenseMap.h") of SourceLocation to
pair<FileID, CharacteristicKind> improve performance any further?

>> - Lines should be shorter than 80 columns.
>
>>   for (std::list< FileChange >::const_iterator it
>>        = FileChanges.begin(), end = FileChanges.end();
>>
>> Funny line break here -- prefer breaking after the comma.
>
>  You need to pick your poison. C++ code generally does not fit into 80 columns
> and look nice at the same time.

This case happened to fit into 80 columns with the change I suggested.
It's a fairly common pattern in Clang and LLVM, e.g. (from a file I
happened to be looking at):
  for (SourceManager::fileinfo_iterator I = SM.fileinfo_begin(),
       E = SM.fileinfo_end(); I != E; ++I) {

>
>>   OutputContentsUpTo(SM.getFileOffset(DirectiveToken.getLocation())
>>     + DirectiveToken.getLength(), nextToWrite, FromFile);
>>
>> If you have to break a line on one side of a binary operator, prefer
>> to break after the operator rather than before.
>
>  In addition to the above, why should I do that? The operator makes it obvious
> why the previous line continues there.

The predominant style in Clang is to leave the operator on the
previous line. Reasonable people can disagree about which style is
more readable, but consistency trumps personal preference.

>
>> // It might be beneficial to have a switch that will remove contents of
>> lines // that are irrevelant for compile, i.e. comments. Comments are
>> actually // the majority of the resulting file and cleaning up such lines
>> would // significantly reduce the size of the resulting file without having
>> any // effect on any following usage (with the exception of human
>> inspection).
>>
>> The second sentence here is speculative. Remove the comment entirely,
>> or keep the first sentence with "FIXME:" at the front.
>
>  It is not speculative. There's also nothing broken to fix, it is a suggestion
> for an alternative mode that would have its advantages and disadvantages.

I'm sorry, I should have been more clear what I meant by
"speculative." There is a lot of code where the statement "Comments
are actually the majority of the resulting file" is not true, and
comments in Clang shouldn't suggest otherwise.

I also did not mean to imply that anything is broken here. "FIXME" is
a common shorthand for "Here is an idea for future work." Substitute
"TODO" if you'd prefer.

>
>>             // clang sometimes optimizes and does not repeatedly include
>> some // files even though it should,
>>
>> Please elaborate?
>
>  Clang does not even bother to perform repeated inclusion of the same file if
> it somehow detects it's unnecessary, but the avoided FileChanged() callback
> would mean -rewrite-includes would otherwise leave some #include directives
> in the result that would be performed the next time.

Can you use the FileSkipped() callback to make this any more efficient?

>
>>   // It would be even faster if the preprocessor could be switched to a
>> mode // where it would parse only preprocessor directives and comments,
>> nothing // else matters for parsing or processing.
>>
>> Technically true, I'm sure, but have you profiled it and found that it
>> would actually matter?
>
>  I have done some profiling, yes, and lexing is the majority of time. And
> scanning only for / and \n simply has to be faster. But that is a possible
> improvement for the future and I will not work on this now

OK, leave a TODO.

A few new comments, in addition to the general style issues which are
still unaddressed (doxygen, whitespace, etc):

std::map< unsigned, FileChange >

Might be worth introducing a typedef for this.

    if (!Extra.empty())
      OS.write(Extra.data(), Extra.size());

You can stream a StringRef into a raw_ostream.

void IncludeRewriter::InclusionDirective(SourceLocation HashLoc,
                                         const Token&,
                                         StringRef,
                                         bool,

Add '/* IsAngled */' so it's more clear why we're ignoring this parameter.

  if (pos + 1 < FromFile->getBufferEnd() && *pos == '\r')
    return "\n\r";

ITYM *(pos + 1) == '\r'? Similarly in the next condition.

  WriteLineInfo( FileName, 1, type, EOL, StringRef(" 1", 2 ));

StringRef has an implicit constructor from const char*.