[PATCH] D143142: [clang][lex] Enable Lexer to grow its buffer

David Rector via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 15 06:23:51 PST 2023


davrec added a comment.

Only had a chance to give it a once over, I will look through more closely later, def by this weekend.  Main thing is I think we shouldn't be exposing the buffer pointers after this change, i.e. no public function should return `const char *`, unless I'm missing something.  If that box is checked and performance cost is negligible I'd give this the thumbs up.



================
Comment at: clang/include/clang/Lex/Lexer.h:307
   /// Return the current location in the buffer.
-  const char *getBufferLocation() const { return BufferPtr; }
+  const char *getBufferLocation() const {
+    assert(BufferOffset <= BufferSize && "Invalid buffer state");
----------------
I think I'd like this to return `unsigned`; i.e. I think after this patch we should not even be publicly exposing buffer locations as pointers, IIUC.  A brief search for uses of `getBufferLocation()` (there aren't many) suggests this would be probably be fine and indeed would get rid of some unnecessary pointer arithmetic.  And indeed if anything really needs that `const char *` that might be a red flag to investigate further.


================
Comment at: clang/include/clang/Lex/Lexer.h:609
 
-  bool CheckUnicodeWhitespace(Token &Result, uint32_t C, const char *CurPtr);
+  bool CheckUnicodeWhitespace(Token &Result, uint32_t C, unsigned CurOffset);
 
----------------
FWIW it sucks that `uint32_t` is already sprinkled throughout the interface alongside `unsigned`, wish just one was used consistently, but that does not need to be addressed in this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143142/new/

https://reviews.llvm.org/D143142



More information about the cfe-commits mailing list