[PATCH] D37491: [Preamble] Fixed preamble breaking with BOM presence (and particularly, fluctuating BOM presence)

Ilya Biryukov via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 6 02:01:43 PDT 2017


ilya-biryukov removed a reviewer: cfe-commits.
ilya-biryukov added subscribers: cfe-commits, klimek, bkramer.
ilya-biryukov added a comment.

- How are various preprocessor offests (and `SourceLocation` offsets) are calculated? Do they account for `BOM` presence and ignore it?
- Are there potential problems we may run into because of the changing offsets? Could we add tests checking changing the offsets does not matter?
- Should we add checks that `BOM` was removed or added, but not changed? I would not expect preamble to be reusable "as is" if `BOM` (and therefore, input encoding) changed.



================
Comment at: include/clang/Frontend/PrecompiledPreamble.h:102
+  /// appears or disappears between parses).
+  void UpdateOffset(unsigned NewOffset);
+
----------------
Let's leave this class's interface immutable. It is used concurrently in clangd and having a mutable method like this would break the code.

Passing new `PreambleBounds` to `AddImplicitPreamble` and setting the offsets accordingly would do the trick, leave the interface immutable and make the fact that offsets might change more evident.


================
Comment at: include/clang/Frontend/PrecompiledPreamble.h:191
   bool PreambleEndsAtStartOfLine;
+  /// The start offset of the bounds used to build the preamble.
+  unsigned PreambleOffset;
----------------
Let's store original `PreambleBounds` instead of `PreambleEndsAtStartOfLine` and `PreambleOffset`.
It would make the code easier to read.


================
Comment at: include/clang/Lex/Lexer.h:50
+  /// \brief Start offset of the preamble, in bytes.
+  unsigned StartOffset;
+  /// \brief Size of the preamble in bytes.
----------------
Maybe pick a name that clearly states that it's a `BOM` size? 
Or add a comment indicating that it's a `BOM` offset.


================
Comment at: include/clang/Lex/Lexer.h:639
 
-  void SkipBytes(unsigned Bytes, bool StartOfLine);
+  void SetByteOffset(unsigned Offset, bool StartOfLine);
 
----------------
Maybe leave the old name? Doesn't `SkipBytes` captures the new semantics just as good?


================
Comment at: lib/Frontend/PrecompiledPreamble.cpp:195
 
 PreambleBounds clang::ComputePreambleBounds(const LangOptions &LangOpts,
                                             llvm::MemoryBuffer *Buffer,
----------------
Could you inline usages of this function and remove it?



================
Comment at: unittests/Frontend/PchPreambleTest.cpp:190
+
+  ASSERT_TRUE(ReparseAST(AST));
+  ASSERT_FALSE(AST->getDiagnostics().hasErrorOccurred());
----------------
We're not really testing that preamble was reused.
Maybe return a flag from `ASTUnit::Reparse` to indicate if preamble was reused and check it here?


https://reviews.llvm.org/D37491





More information about the cfe-commits mailing list