[PATCH] D13765: clang-format: [JS] Handle string literals spanning character classes.

Sat Oct 17 22:10:46 PDT 2015

mprobst added inline comments.

================
Comment at: lib/Format/Format.cpp:860
@@ +859,3 @@
+    }
+    if (Prev) {
+      if (Prev->isOneOf(tok::plusplus, tok::minusminus)) {
----------------
djasper wrote:
> I understand and I actually find that hard to understand. I think a better solution might be to pull out a function for this logic. So that you just call:
> 
>   if (!canPrecedeRegexLiteral(Prev))
>     return;
> 
> and then
> 
>   bool canPrecedeRegeLiteral(const FormatToken *Prev) {
>     if (!Prev)
>       return true;
>     ..
>   }
> 
I see, that's indeed cleaner. Done.

================
Comment at: unittests/Format/FormatTestJS.cpp:609
@@ -603,1 +608,3 @@
+  // directly following a closing parenthesis.
+  verifyFormat("if (foo) / bar /.exec(baz);");
 }
----------------
djasper wrote:
> This case we could actually fix, right? Because we know that the ")" belongs to the if and if the following token isn't a comment or an open brace, it starts a new statement. Probably shouldn't be done in the same patch, but maybe add a FIXME.
Not sure how we'd do that in the general case.

Regular expression recognition needs to happen in the lexing phase (otherwise it'd greatly complicate everything later on). And in the lexing phase, it's rather complicated to track if the code is in an if statement, as of course the if condition can contain arbitrary code.

Normally the fix for this is having the parser call down into the lexer with the contextual information that at this location a slash introduces a regex token, but we cannot do that, as we are neither parsing here, nor do we have precise grammatical structure around.


http://reviews.llvm.org/D13765