<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Nov 1, 2013 at 1:15 PM, Daniel Jasper <span dir="ltr"><<a href="mailto:djasper@google.com" target="_blank">djasper@google.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div style="font-family:arial,sans-serif;font-size:13px">
The context-free parser implemented in clang-format can understand (or can easily be extended to understand) the structure of basically all C-like languages. As a basic definition of C-like languages, take the “Influenced” section of C’s Wikipedia page [1].</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">At most, this includes: AMPL, AWK, csh, C++, C--, C#, Objective-C, BitC, D, Go, Rust, Java, JavaScript, Limbo, LPC, Perl, PHP, Pike, Processing, Seed7, Verilog. Today, clang-format already supports C, C++, Objective-C and Objective-C++. Starting from there, it seems almost trivial to extend support to JavaScript and Java, which contain only a small number of additional syntactic constructs and can be tokenized with Clang’s lexer. Eventually, we also imagine (and would love to see patches for) formatting C#, D, Go, Rust, and PHP, based on their similar syntax and active usage. Maybe there are others the community would be interested in seeing?</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">The benefit of using the same formatting tool for all these jobs is that many of clang-format’s advanced formatting algorithms (e.g. TeX-like analysis of the entire solution space) are immediately available to the other languages.</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">Concrete proposal:</div><div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">
Start with JavaScript. Syntactically, it is very close to C++, and several efforts to combine JavaScript and LLVM are already under way. Add an additional LanguageStandard (this flag already supports C++03 and C++11) to clang-format’s configuration and gate JavaScript-specific formatting decisions (e.g. indentation of JavaScript’s namespace-equivalent) on it.</div>
</div></blockquote><div><br></div><div>JavaScript's "namespace equivalent" is just an anonymous function, so I'm not sure how you intend to detect this lexically.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">To make clang-format more useful to the LLVM project itself, support for the TableGen language seems another worthy goal that can be achieved in the same way as JavaScript.</div>
</div></blockquote><div><br></div><div>I've been really wanting something like clang-format for tablegen. By happy coincidence, AFAIK the only situation where tablegen is lexically different from C++ (excluding keyword differences) is a special form of string literal that contains only C++ code. The delimiters for that string literal are `[{` and `}]`, which will be lexed and can be recognized, so clang-format could then recursively format the C++ code.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">Thoughts? Comments? Concerns with this direction?</div></div></blockquote><div><br></div><div>If preliminary experiments show that it is feasible to use Clang's lexer on lexically different languages, then I think this probably makes sense. It may just be easier (not to mention more correct) to be able to plug in different lexers, though.</div>
<div><br></div><div>-- Sean Silva<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">
<div style="font-family:arial,sans-serif;font-size:13px">
<br></div><div style="font-family:arial,sans-serif;font-size:13px">If this is in line with LLVM’s/Clang’s roadmap, we'll start working on the few features missing for formatting JavaScript, and we should have rudimentary support towards the end of the year.</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">[1] <a href="http://en.wikipedia.org/wiki/C_(programming_language)" target="_blank">http://en.wikipedia.org/wiki/C_(programming_language)</a></div>
</div>
<br>_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>
<br></blockquote></div><br></div></div>