<div style="font-family: arial, helvetica, sans-serif; font-size: 10pt">On Fri, Nov 30, 2012 at 8:18 PM, Douglas Gregor <span dir="ltr">&lt;<a href="mailto:dgregor@apple.com" target="_blank">dgregor@apple.com</a>&gt;</span> wrote:<br>

<div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>

On Nov 29, 2012, at 1:16 PM, Daniel Jasper &lt;<a href="mailto:djasper@google.com">djasper@google.com</a>&gt; wrote:<br>

<br>

><br>

>  Ping.<br>

><br>

>  Also, we are continuing the development on <a href="https://github.com/djasper/clang/tree/format" target="_blank">https://github.com/djasper/clang/tree/format</a>, but I am currently not updating this patch for fear of prolonging the review even more. It has reached a stage where it starts improving my workflow thanks to a little vim integration I have hacked together ...<br>


<br>

</div>A few questions and comments on the patch, but IMO I'd rather see this go into the main Clang tree soon and get hacked on there, since it looks like we're moving in the right direction and it's obviously functionality we want in the core.<br>


<br>

+/// \brief A character range of source code.<br>

+struct CodeRange {<br>

+  CodeRange(unsigned Offset, unsigned Length)<br>

+    : Offset(Offset), Length(Length) {}<br>

+<br>

+  unsigned Offset;<br>

+  unsigned Length;<br>

+};<br>

<br>

Why isn't this a SourceRange or CharSourceRange?<br></blockquote><div><br></div><div>This would put the burden on the client to do the "Offset+Length -> SourceRange" translation. But you're right, in this case that might be the right trade-off...</div>
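<div><br></div><div>For reference, a minimal sketch of the arithmetic behind that client-side translation. CharRange and toCharRange are hypothetical stand-ins, not Clang types, and clamping to the buffer size is an assumption of this sketch. In Clang itself the end points would be SourceLocations obtained through the SourceManager; that step is left out here.</div>

```cpp
#include <cassert>
#include <cstddef>

// Same shape as the CodeRange in the patch.
struct CodeRange {
  CodeRange(unsigned Offset, unsigned Length)
      : Offset(Offset), Length(Length) {}

  unsigned Offset;
  unsigned Length;
};

// Hypothetical stand-in for a character range: half-open [Begin, End).
struct CharRange {
  unsigned Begin;
  unsigned End;
};

// Converts Offset+Length into a half-open range.
// Assumption: ranges that run past the buffer are clamped, not rejected.
CharRange toCharRange(CodeRange R, std::size_t BufferSize) {
  unsigned Size = static_cast<unsigned>(BufferSize);
  unsigned Begin = R.Offset < Size ? R.Offset : Size;
  unsigned Available = Size - Begin;
  unsigned Length = R.Length < Available ? R.Length : Available;
  return CharRange{Begin, Begin + Length};
}
```

<div>With a 10-character buffer, toCharRange(CodeRange(2, 3), 10) covers the characters at offsets 2, 3 and 4.</div>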

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

+/// \brief Reformats the given Ranges in the token stream coming out of \c Lex.<br>

+///<br>

+/// Ranges are extended to include full unwrapped lines.<br>

+/// TODO(alexfh): Document what unwrapped lines are.<br>

+tooling::Replacements reformat(const FormatStyle &Style, Lexer &Lex,<br>

+                               SourceManager &SourceMgr,<br>

+                               std::vector&lt;CodeRange&gt; Ranges);<br>

<br>

As the main entry point, it seems that this really needs a more extensive comment. In general, please provide comments for the various new classes (FormatStyle, Formatter, UnwrappedLineFormatter, etc.), so people can navigate the code.<br>


<br>

+  bool isIfForOrWhile(Token Tok) {<br>

+    if (Tok.getKind() != tok::raw_identifier)<br>

+      return false;<br>

+    StringRef Data(SourceMgr.getCharacterData(Tok.getLocation()),<br>

+                   Tok.getLength());<br>

+    return Data == "for" || Data == "while" || Data == "if";<br>

+  }<br>

<br>

This is one of many places where we're doing string comparisons against keywords, which seems needlessly inefficient. How about using an IdentifierTable (with keywords filled in) to resolve identifiers to keywords, so you can actually check the tokens? That'll also be smarter w.r.t. identifiers that are keywords in some dialects but not others.<br>
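<br>
A rough sketch of that idea, not tied to Clang's actual IdentifierTable: each raw identifier is resolved to a token kind once, and later checks compare kinds instead of strings. TokKind, KeywordTable and classify are hypothetical names used only for illustration, and only the three keywords from the snippet above are registered.<br>

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical token kinds; a real table would cover every keyword
// of the active language dialect.
enum class TokKind { Identifier, KwIf, KwFor, KwWhile };

// Maps identifier spellings to keyword kinds. Built once, queried per token.
class KeywordTable {
public:
  KeywordTable()
      : Keywords{{"if", TokKind::KwIf},
                 {"for", TokKind::KwFor},
                 {"while", TokKind::KwWhile}} {}

  // Returns the keyword kind for Spelling, or Identifier if it is not one.
  TokKind classify(const std::string &Spelling) const {
    auto It = Keywords.find(Spelling);
    return It == Keywords.end() ? TokKind::Identifier : It->second;
  }

private:
  std::unordered_map<std::string, TokKind> Keywords;
};

// The check from the patch, expressed on token kinds instead of strings.
inline bool isIfForOrWhile(TokKind Kind) {
  return Kind == TokKind::KwIf || Kind == TokKind::KwFor ||
         Kind == TokKind::KwWhile;
}
```

The string lookup then happens once per identifier, when the token is classified, and a dialect-specific table can simply omit keywords that the dialect does not have.<br>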


<br>

I wish there were some way to poke at this feature as a user to see, e.g., how it would reformat some given block of code. It would help people with an interest in the formatting work to see what works, what doesn't, etc.<br>

</blockquote><div><br></div><div>Once the first version is checked in, we'll next check a tool into clang-tools-extra, plus some nice vi integration with which you can run it over code snippets (or full files). Daniel has actually built that already, and the fact that it is already surprisingly useful has accelerated development here somewhat :)</div>

<div><br></div><div>Cheers,</div><div>/Manuel</div></div></div></div>