[cfe-commits] [PATCH] Initial version of formatting library

Douglas Gregor dgregor at apple.com
Fri Nov 30 11:18:31 PST 2012


On Nov 29, 2012, at 1:16 PM, Daniel Jasper <djasper at google.com> wrote:

> 
>  Ping.
> 
>  Also, we are continuing the development on https://github.com/djasper/clang/tree/format, but I am currently not updating this patch in fear of prolonging the review even more. It has reached a stage where it starts improving my workflow thanks to a little vim integration I have hacked together ...

A few questions and comments on the patch, but IMO I'd rather see this go into the main Clang tree soon and get hacked on there, since it looks like we're moving in the right direction and it's obviously functionality we want in the core.

+/// \brief A character range of source code.
+struct CodeRange {
+  CodeRange(unsigned Offset, unsigned Length)
+    : Offset(Offset), Length(Length) {}
+
+  unsigned Offset;
+  unsigned Length;
+};

Why isn't this a SourceRange or CharSourceRange?

+/// \brief Reformats the given Ranges in the token stream coming out of \c Lex.
+///
+/// Ranges are extended to include full unwrapped lines.
+/// TODO(alexfh): Document what unwrapped lines are.
+tooling::Replacements reformat(const FormatStyle &Style, Lexer &Lex,
+                               SourceManager &SourceMgr,
+                               std::vector<CodeRange> Ranges);

As the main entry point, it seems that this really needs a more extensive comment. In general, please provide comments for the various new classes (FormatStyle, Formatter, UnwrappedLineFormatter, etc.), so people can navigate the code.

+  bool isIfForOrWhile(Token Tok) {
+    if (Tok.getKind() != tok::raw_identifier)
+      return false;
+    StringRef Data(SourceMgr.getCharacterData(Tok.getLocation()),
+                   Tok.getLength());
+    return Data == "for" || Data == "while" || Data == "if";
+  }

This is one of many places where we're doing string comparisons against keywords, which seems needlessly inefficient. How about using an IdentifierTable (with keywords filled in) to resolve identifiers to keywords, so you can actually check the tokens? That'll also be smarter w.r.t. identifiers that are keywords in some dialects but not others.

I wish there were some way to poke at this feature as a user to see, e.g., how it would reformat some given block of code. It would help people with an interest in the formatting work to see what works, what doesn't, etc.

	- Doug



More information about the cfe-commits mailing list