[cfe-dev] RFC: Extend clang-format to support more/all C-like languages

Sean Silva silvas at purdue.edu
Sat Nov 2 21:14:17 PDT 2013


On Sat, Nov 2, 2013 at 1:42 PM, Joshua Cranmer 🐧 <Pidgeot18 at gmail.com> wrote:

> On 11/1/2013 12:15 PM, Daniel Jasper wrote:
>
>> Start with JavaScript. Syntactically, it is very close to C++ and there
>> are already different efforts going on to combine JavaScript and LLVM. Add
>> an additional LanguageStandard (this flag already supports C++03 and C++11)
>> in clang-format’s configuration and gate JavaScript-specific formatting
>> decisions (e.g. indentation of JavaScript’s namespace-equivalent) on it.
>>
>
> JavaScript has some extremely different syntax that is likely to play
> havoc with a straight C/C++ lexer. It also depends on exactly which variant
> of JavaScript you want to support--ES5? ES6? Mozilla's JS extensions?
> Support E4X as well?
> 1. Regular expressions. I don't recall off the top of my head, but I
> believe it boils down to "/ starts a regular expression if you're expecting
> an operand and is a division operator if you're not"--you'll need to do at
> least enough parsing to distinguish those two cases.
>

For reference, the lexical grammar is split down the middle between "/ means
division" and "/ means regex". Spec reference is <http://es5.github.io/#x7>;
the two goal symbols of the lexical grammar are InputElementDiv and
InputElementRegExp.
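
To make the split concrete, here is a rough sketch of how a lexer might pick
between the two goals by looking at the previous significant token. The
function name, the token shape ({kind, text}) and the token kinds are all
made up for illustration; this is not clang-format code, and the rule is an
approximation (for example, a "}" that closes a block, or the ")" that closes
an if-condition, can be followed by a regex, which this sketch gets wrong).

```javascript
// Hypothetical helper: returns true when "/" should be lexed with the
// InputElementRegExp goal, false when it should be lexed with
// InputElementDiv. Token shape and kinds are assumptions for this sketch.
const OPERAND_ENDERS = new Set([")", "]", "}"]);

function slashStartsRegex(prevToken) {
  // At the start of input an operand is expected, so "/" opens a regex.
  if (prevToken === null) return true;
  // After an identifier or a number we already have an operand: division.
  if (prevToken.kind === "identifier" || prevToken.kind === "number") {
    return false;
  }
  // A closing bracket usually ends an operand too (approximation).
  if (prevToken.kind === "punctuator" && OPERAND_ENDERS.has(prevToken.text)) {
    return false;
  }
  // After operators, "(", ",", or keywords such as `return`, an operand
  // is expected, so "/" opens a regex.
  return true;
}
```

With this sketch, the "/" in `a / b` is classified as division because the
previous token is an identifier, while the "/" in `x = /ab+c/` is classified
as a regex because the previous token is "=".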


> 2. Array comprehensions (ES 6/Mozilla JS 1.8.5 enhancements): [x for (x in
> obj)], [x for each (x in obj)], [x for (x of obj)]. The middle is not in ES
> 6 (it's actually a holdover from E4X that sticks around because it was
> introduced well before the for-of statement was, and found relatively
> widespread use in Mozilla which made the JS people keep it around when we
> killed E4X), and I don't recall if the generator form (without enclosing
> brackets) is in ES 6 or not.
> 3. Generators: function*() { yield y; yield* x; }. I don't even know
> recommended style guides for the star-variant, as I only just retrofitted
> my code to have them two or three days ago.
> 4. Object literals:
> var x = {
>   get y() { return z; },
>   x: 13,
>   q: function () { return this.x; }
> };
>

This is the kind of feature that I'm least concerned with, since it is
lexically identical to C++ (modulo a few keywords that will be interpreted
as identifiers).
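
As a small illustration of that point, a deliberately naive C-style tokenizer
(identifiers, integers, single-character punctuation) gets through a cut-down
version of the object literal above without any JavaScript-specific rules.
The tokenize function is a toy written for this message, not anything from
clang-format; note that `get` simply comes out as an ordinary identifier.

```javascript
// Toy tokenizer, assumed for illustration only: it knows nothing about
// JavaScript and only recognizes C-like identifiers, integers, and a few
// single-character punctuators. Whitespace is skipped implicitly.
function tokenize(source) {
  const pattern = /[A-Za-z_][A-Za-z0-9_]*|[0-9]+|[{}()\[\];:,.=]/g;
  return source.match(pattern) || [];
}

const tokens = tokenize("var x = { get y() { return z; }, x: 13 };");
// `get` is not special to this tokenizer: it is just another identifier.
const getIsPlainIdentifier = tokens.includes("get");
```

Deciding what `get y() { ... }` means is then a job for the (shallow) parser,
not for the lexer.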


> 5. Semicolon insertion. Newlines can become semicolons in the right
> circumstances (the worst misfeature in JS, IMHO).
>

This isn't a big deal. Basically there's a fixed set of lookahead tokens:
if the token after the newline is one of them, the statement continues;
otherwise it ends ("a semicolon is inserted"). (Of course, open parentheses
override this, but I'm sure clang-format already handles that case just
fine.)
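
A minimal sketch of that heuristic, under loud assumptions: the token set
below is illustrative and incomplete, the function name is invented, and the
spec's restricted productions (e.g. a newline right after `return`, or before
`++`/`--`) are not modeled at all. It only encodes the rule of thumb
described above, not the full automatic semicolon insertion rules.

```javascript
// Illustrative, incomplete set of tokens that let a statement continue
// across a newline when they begin the next line (an assumption, not the
// spec's list).
const CONTINUATION_TOKENS = new Set([
  "(", "[", "+", "-", "*", "/", ".", ",", "?", ":", "=", "&&", "||",
]);

// Hypothetical helper: decides whether a newline ends the current
// statement. `nextToken` is the text of the first token on the next line
// (null at end of input); `openParens` is the current parenthesis depth.
function statementEndsAtNewline(nextToken, openParens) {
  // Inside open parentheses a newline never ends the statement.
  if (openParens > 0) return false;
  // End of input always ends the statement.
  if (nextToken === null) return true;
  // Otherwise the statement ends unless the next token continues it.
  return !CONTINUATION_TOKENS.has(nextToken);
}
```

For example, a line followed by one starting with `var` ends the statement,
while a line followed by one starting with `(` does not, which is the classic
case where `a` on one line and `(b)` on the next is read as the call `a(b)`.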


> 6. Reading the ES6 draft, it supports \u in IdentifierNames just like Java
> does.
> 7. Some other operators are in play. === and !== have been brought up;
> there is also >>> and >>>= (like Java's operators), and ... (array spread
> operator).
> 8. Some ES6 features I haven't played with that may or may not have been
> added to some libraries yet: template strings, and there's also a =>-like
> notation for functions IIRC.
>
> Regular expression literal support will definitely need different lexing
> paths than C/C++, although (excluding template strings and some E4X
> literals--the former of which is too new to be widely supported and the
> latter of which has already been ripped out from the only major engine that
> supported it) I think it is otherwise close enough to reuse a lot of the lexing
> capabilities of C-family languages. Just be forewarned that the shallow
> parsing that needs to be done for JS is likely to be rather different from
> that done for C/C++, even if their lexing streams look more or less similar.


If this goes through, I would expect clang-format's lambda formatting to be
top-notch, since about half of all JavaScript code basically consists of
lambdas :)

-- Sean Silva


>
>
> --
> Joshua Cranmer
> Thunderbird and DXR developer
> Source code archæologist