[PATCH] D41423: [Lex] Avoid out-of-bounds dereference in LexAngledStringLiteral.

Volodymyr Sapsai via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 10 14:57:34 PST 2018


vsapsai marked an inline comment as done.
vsapsai added inline comments.


================
Comment at: clang/lib/Lex/Lexer.cpp:2014-2015
+    // getAndAdvanceChar.
+    if (C == '\\')
+      C = getAndAdvanceChar(CurPtr, Result);
+
----------------
dexonsmith wrote:
> If `CurPtr` is already equal to `BufferEnd`, why is it safe to call `getAndAdvanceChar`?  Is `BufferEnd` dereferenceable?
`BufferEnd` is still dereferancable but we do rely on it storing null character. `CurPtr < BufferEnd` check was added in https://reviews.llvm.org/D9489 to handle `#include <\` (no new line in the end). Before that change the buffer overflow was happening because after `\` we read null character at line 2015 (`CurPtr` becomes `BufferEnd+1`) and then one more character at line 2026. `CurPtr < BufferEnd` makes sure we can read 2 bytes before starting this 2-character sequence. It works fine when each character is 1 byte but fails when it is more. In this case backslash and new line are read as 1 character and `CurPtr < BufferEnd` check is insufficient.

In my fix I read the character after backslash and then decide if can read the next one, so it doesn't matter how many bytes are in this character.


================
Comment at: clang/lib/Lex/Lexer.cpp:2026
+
+    if (C == 0) {
       NulCharacter = CurPtr-1;
----------------
dexonsmith wrote:
> Should this check still be skipped (in an `else if` of the `C == '\\'` check)?
I agree it is behaviour change. `NulCharacter` is used to warn if you have null character in the string and I think it is worth warning even if it is preceded by the backslash. Therefore I think this check shouldn't be skipped after `C == '\\'` check. In practice I don't know how you can create a file with null character in its name and use it in inclusion directive.


================
Comment at: clang/unittests/Lex/LexerTest.cpp:477
 TEST_F(LexerTest, AvoidPastEndOfStringDereference) {
-  std::vector<Token> LexedTokens = Lex("  //  \\\n");
-  EXPECT_TRUE(LexedTokens.empty());
+  EXPECT_TRUE(Lex("  //  \\\n").empty());
+  // rdar://problem/35572754
----------------
dexonsmith wrote:
> To minimize the diff, please separate this change into an NFC commit ahead of time.
I plan to nominate this fix for inclusion in 6.0.0 release and having one commit will be easier. It is not necessary to include NFC change in 6.0.0 release but it creates discrepancy that increases a chance of merge conflicts.

But I don't have strong opinion, just pointing out potential downsides with merging the change to other branches.


https://reviews.llvm.org/D41423





More information about the cfe-commits mailing list