[PATCH] D138861: [Clang] Implement CWG2640 Allow more characters in an n-char sequence

Corentin Jabot via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 9 02:41:55 PST 2022


cor3ntin added a comment.

Thanks Tom, I replied to your comments



================
Comment at: clang/lib/Lex/Lexer.cpp:3386
+
+  if (!Result || CurPtr - StartPtr == (ptrdiff_t)(Buffer.size() + 4))
     StartPtr = CurPtr;
----------------
cor3ntin wrote:
> tahonermann wrote:
> > This is a bit of a tangent, but under what circumstances would `CurPtr - StartPtr` not be equal to `Buffer.size() + 4`? Actually, I'm not sure that +4 is correct. It looks like `StartPtr` is expected to point to `N` at the beginning of the function, so the initial `\` is not included in the range. If `N` isn't seen, the function returns early. Likewise, if either of the `{` or `}` delimiters is not found, the function returns early.
> > 
> > I think the goal of this code is to skip the `getAndAdvanceChar()` machinery when the buffer is being claimed (no need to parse UCNs or trigraphs in the character name), but it looks like, particularly with this DR, that we might always be able to do that.
> afaict, 4 is correct here because we are one past-the-end.
> I do however agree that this whole pattern which is used a few times is probably unnecessary, i do think it would be good to investigate. Not in this patch though, imo
I looked into it, I'm sure we could improve but not easily, `getAndAdvanceChar` does set some flags on the token in the presence of trigraphs/line splicing and we need those flags to be set, this is the reason we do need to call that method.
It's not super efficient but it's such an edge case... I'd rather not touch that now


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138861/new/

https://reviews.llvm.org/D138861



More information about the cfe-commits mailing list