[PATCH] D138861: [Clang] Implement CWG2640 Allow more characters in an n-char sequence

Corentin Jabot via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 9 00:43:53 PST 2022


cor3ntin added a comment.

Thanks Tom, I replied to your comments



================
Comment at: clang/lib/Lex/Lexer.cpp:3378-3379
 
-  if (LooseMatch)
+  // If no diagnostic has been emitted yet we do not want to recover here
+  // to make sure this function will be called again and a diagnostic emitted then.
+  if (LooseMatch && Diagnose)
----------------
tahonermann wrote:
> I'm having a hard time understanding what this comment is trying to convey. The comment response to Aaron was much more helpful to me. How about this suggested edit?
It's not that a diagnostic would be emitted twice, is that it would not be emitted at all.
By recovering - ie returning a loose match - we prevent the function to be called again on the same buffer.
So if during the first call Diagnose is false (because we were called from a tryXXXX function), then the compilation will continue assuming we were in fact able to parse an identifier and never informs us of an invalid name.


================
Comment at: clang/lib/Lex/Lexer.cpp:3386
+
+  if (!Result || CurPtr - StartPtr == (ptrdiff_t)(Buffer.size() + 4))
     StartPtr = CurPtr;
----------------
tahonermann wrote:
> This is a bit of a tangent, but under what circumstances would `CurPtr - StartPtr` not be equal to `Buffer.size() + 4`? Actually, I'm not sure that +4 is correct. It looks like `StartPtr` is expected to point to `N` at the beginning of the function, so the initial `\` is not included in the range. If `N` isn't seen, the function returns early. Likewise, if either of the `{` or `}` delimiters is not found, the function returns early.
> 
> I think the goal of this code is to skip the `getAndAdvanceChar()` machinery when the buffer is being claimed (no need to parse UCNs or trigraphs in the character name), but it looks like, particularly with this DR, that we might always be able to do that.
afaict, 4 is correct here because we are one past-the-end.
I do however agree that this whole pattern which is used a few times is probably unnecessary, i do think it would be good to investigate. Not in this patch though, imo


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138861/new/

https://reviews.llvm.org/D138861



More information about the cfe-commits mailing list