<div class="gmail_quote">On Wed, Jun 6, 2012 at 3:08 AM, Manuel Klimek <span dir="ltr"><<a href="mailto:klimek@google.com" target="_blank">klimek@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote"><div class="im">On Wed, Jun 6, 2012 at 12:04 PM, Chandler Carruth <span dir="ltr"><<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote"><div>On Wed, Jun 6, 2012 at 2:57 AM, Manuel Klimek <span dir="ltr"><<a href="mailto:klimek@google.com" target="_blank">klimek@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>Daniel & me have worked on a first proposal to provide an interface for clang formatting as part of the Tooling library.</div></blockquote><div><br></div></div><div>Please re-send the patch as an attachment? Email plays havoc w/ 80-columns. </div>
</div></blockquote><div><br></div></div><div>Done.</div></div></blockquote><div><br></div><div>Thanks!</div><div><br></div><div>I think it would help a lot to have a bit more code & context when evaluating this, even as a very early step.</div>
<div><br></div><div>My concrete suggestion would be this: implement a bozo-formatter. A completely trivial to implement, BS set of rules that only requires lexing. Then we can see the interface, a trivial implementation, and some test cases that use it. I think that'll make this much easier to review and really analyze. It also makes it much closer to something that could get checked in as a stepping stone.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_quote"><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="gmail_quote"><div><div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>Questions:</div><div>- please point out anything that might make this interface not viable for your own use cases</div>
<div>- are we missing the point somewhere?</div><div>- is Tooling/ the right place for this?</div></blockquote></div></div></div></blockquote></div></div></blockquote><div><br></div><div>I think Tooling is the right place because I think essentially every tool anyone writes will want to have *some* formatting applied to it. Leveraging this from an existing tool and composing cleanly w/ it will almost certainly be important.</div>
<div><br></div><div>A few specific comments that jump out at me:</div><div> </div><div>+// Various functions to configurably format source code.</div><div>+//</div><div>+// This supports two use cases:</div><div>+// - Format (ranges in) a given file.</div>
<div>+// - Reformat sources after automated refactoring.</div><div><br></div><div>I don't (yet) see how this is supported. I suspect my example request will fix...</div><div> </div><div>+//</div><div>+// Formatting is done on a per file basis.</div>
<div>+//</div><div>+// The formatting strategy is:</div><div>+// - lex the file content</div><div>+// - in case formatting decisions need information of the AST analysis, run</div><div>+// the analysis and annotate all tokens that are relevant for formatting</div>
<div> +// in the current file</div><div><br></div><div>How does this interact with the just-refactored use-case? Ideally, I would like to have the option of using the existing AST (when the refactor didn't change it in interesting ways, f.ex., it just renamed bits), or have it re-compute one.</div>
<div><br></div><div>I worry about requiring recomputing the AST after a refactor -- I may need to finish applying the refactor across multiple files before things start to compile again... Maybe the existing rewriter / refactoring APIs handle this though.</div>
<div> </div><div>+// - reformat based on the (annotated) tokens</div><div><br></div><div>Where are these tokens annotated? I don't see any real place for that in the interface below.</div><div> </div><div>+//</div><div>
+// To allow different strategies of handling the SourceManager the formatter </div><div>+// works on an interface AnalyzableSource, which provides access to the file</div><div>+// contents and, if available, allow to lazily run an analysis over the AST</div>
<div>+// when needed.</div><div><br></div><div>Should this be what provides access to the token stream?</div><div> </div><div>+/// \brief A character range of source code.</div><div>+struct CodeRange {</div><div><br></div>
<div>Why doesn't CharSourceRange work? This seems an odd entity to have stashed away inside of this library.</div><div><br></div><div>Why doesn't a pair of iterators into the buffer work?</div><div><br></div><div>
Hmm, below I see this is almost a user-input... Do you really want a pair of line+column coordinates?</div><div><br></div><div>+/// \brief Set of formatting options.</div><div>+struct FormatOptions {</div><div><br></div><div>
Perhaps I'm overly pessimistic, but I strongly expect to want most formatters to have their own custom *logic*, not just a set of options.</div><div><br></div><div>I had imagined more of a visitor system for formatting where there were methods that could be overridden for each formatting "event". These methods would be given access to the location, state from previous events, and be able to return information about line breaks, indentation, whatever was relevant for that event. They could also query for the current AST node, walk around the AST, navigate, etc.</div>
<div><br></div><div>These would then be supplemented by helper logic to handle common patterns and idioms. Those would form the basis of code reuse between formatting systems.</div><div><br></div><div>I'm fine not going completely crazy right off the bat, but I wonder whether it's worth trying to begin the design of a visitor-like logic-based interface for performing a particular format now... even if we end up with only one implementation and an options struct for a long time. (Honestly, I would expect all of them to take an options struct to supplement the logic anyways...)</div>
</div>