[llvm] [MC] AsmLexer invalid read fix. (PR #154972)

Szymon Piotr Milczek via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 15 11:05:43 PDT 2025


================
@@ -120,6 +120,11 @@ AsmLexer::AsmLexer(const MCAsmInfo &MAI) : MAI(MAI) {
 
 void AsmLexer::setBuffer(StringRef Buf, const char *ptr,
                          bool EndStatementAtEOF) {
+  // Null terminator must be part of the actual buffer. It must reside at
+  // `Buf.end()`. It must be safe to dereference `Buf.end()`.
+  assert(*Buf.end() == '\0' &&
+         "Buffer provided to AsmLexer lacks null terminator.");
+
----------------
smilczek wrote:

I spent some time experimenting with buffer allocation to make sure the buffer allocated by llvm-mc doesn't include any additional memory (`WritableMemoryBuffer::getNewUninitMemBuffer` adds `BufAlign.value()` to `RealLen` instead of `BufAlign.value() - 1`, which means the buffer will always have at least 1 extra byte allocated).
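To illustrate the arithmetic behind that parenthetical: rounding an address up to an alignment `A` needs at most `A - 1` bytes of padding, so reserving a full `A` bytes always leaves at least one spare byte. The sketch below is generic and does not reproduce the actual `MemoryBuffer` code; `paddingToAlign` is a hypothetical helper introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): number of bytes needed to round
// Addr up to the next multiple of Align. The result is always in the range
// [0, Align - 1], which is why reserving Align bytes over-allocates by at
// least one byte.
constexpr std::uintptr_t paddingToAlign(std::uintptr_t Addr,
                                        std::uintptr_t Align) {
  return (Align - Addr % Align) % Align;
}
```

With a 16-byte alignment, the worst case is an address one past a boundary, which needs 15 bytes of padding; an already aligned address needs none.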

After a while I finally realized that AsmLexer always expects that the null terminator of the buffer is present and it is NOT included in the buffer's length. That means that `CurBuf.end()` will always point to `\0` and `CurBuf.end()` is always valid memory.
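A minimal sketch of that expectation, using `std::string_view` in place of `StringRef` so it compiles without LLVM headers. `hasTrailingNul` and `countUntilNul` are hypothetical names for this illustration, not AsmLexer functions; the scan loop only mimics a lexer that uses `\0` as its end-of-input sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical check mirroring the proposed assert: the byte at Buf.end()
// must be '\0'. Only call this when that byte is known to be readable.
inline bool hasTrailingNul(std::string_view Buf) {
  return *(Buf.data() + Buf.size()) == '\0';
}

// Hypothetical sentinel-driven scan: it never compares against an end
// pointer, so it is only correct if a '\0' sits right after the buffer.
inline std::size_t countUntilNul(const char *Ptr) {
  std::size_t N = 0;
  while (*Ptr++ != '\0')
    ++N;
  return N;
}
```

A `std::string` guarantees a terminator at `data() + size()` that is not counted in `size()`, so a view over a whole `std::string` satisfies the invariant, while a view over a prefix of it does not.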

Unfortunately, I couldn't find this expectation documented anywhere.
It also doesn't seem to be enforced anywhere in AsmLexer.
In contrast, it seems that AsmParser with LLLexer (not MCAsmParser) does enforce a null terminator.

I could patch AsmLexer to support non-null-terminated buffers, but is that a change we want?
Should AsmLexer also enforce the null terminator?
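For comparison, here is a rough sketch of what the first option could look like: every read is checked against an explicit end pointer, so `*End` is never dereferenced and no terminator is required. `BoundedCursor` and `makeCursor` are hypothetical and do not reflect AsmLexer's actual interface; this only illustrates the trade-off (one extra comparison per character versus a precondition on the buffer).

```cpp
#include <cassert>
#include <string_view>

// Hypothetical cursor, not part of LLVM: reads are bounded by End instead
// of relying on a '\0' sentinel stored after the buffer.
struct BoundedCursor {
  const char *Cur;
  const char *End;

  // Returns the next character, or -1 once the end is reached.
  // Never reads the byte at End.
  int next() {
    return Cur == End ? -1 : static_cast<unsigned char>(*Cur++);
  }
};

inline BoundedCursor makeCursor(std::string_view Buf) {
  return BoundedCursor{Buf.data(), Buf.data() + Buf.size()};
}
```

This works on a plain `char` array with no terminator, returning `-1` on every call after the last character.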

https://github.com/llvm/llvm-project/pull/154972

