[llvm] [Support] mmap when possible in getSTDIN. (PR #162013)

via llvm-commits llvm-commits at lists.llvm.org
Sat Oct 11 09:57:55 PDT 2025


aokblast wrote:

> > Currently, some programs rely on `*c == '\0'` to find the end of a buffer instead of using the provided size. In my experiments, this makes it impossible to mmap in all cases: some files do not end with '\0', so the buffer has to be copied and a '\0' appended manually. However, mmap would be a huge performance improvement when processing a huge file. Can I rely on the test-suite's results to determine which programs rely on `*c == '\0'`, and provide mmap acceleration for the other programs?
> 
> Without knowing the context, it sounds to me like a tool that ignores the provided size and looks for a trailing null terminator has a bug in it (what if the input contains null bytes other than at the end of the data?), so the tool should be updated. What tools rely on this behaviour?

I fixed the Windows issue in my local tree and discovered a subsystem that relies on checking for a trailing 0 without also checking the size. For reference, it is [here](https://github.com/llvm/llvm-project/blob/main/llvm/lib/AsmParser/LLLexer.cpp#L178).

After I fixed that in my local tree by moving the `== end()` check out of the switch statement, [this unittest](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/AsmParser/AsmParserTest.cpp#L43) fails. I think the test assumes that the AsmParser input must be null-terminated and expects an error when it is not.
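
To illustrate the difference between the two ways of detecting end of input, here is a minimal sketch. It is not the actual LLLexer code: `SentinelCursor`, `BoundedCursor`, `countChars`, and `kEOF` are hypothetical names made up for this example. The bounded variant compares against the end pointer before dereferencing, so it works on buffers without a trailing '\0' and does not stop at embedded null bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical end-of-input marker for this sketch (not an LLVM constant).
constexpr int kEOF = -1;

// Sentinel style: the first 0 byte is treated as end of input and the
// buffer size is never consulted. Only safe on null-terminated buffers.
struct SentinelCursor {
  const char *Cur;

  int next() {
    unsigned char C = static_cast<unsigned char>(*Cur);
    if (C == 0)
      return kEOF; // stops here even if more data follows the 0 byte
    ++Cur;
    return C;
  }
};

// Bounded style: the end-of-buffer check happens before the byte is read,
// so no terminator is required and embedded 0 bytes are ordinary data.
struct BoundedCursor {
  const char *Cur;
  const char *End;

  int next() {
    assert(Cur <= End && "cursor ran past the end of the buffer");
    if (Cur == End)
      return kEOF;
    return static_cast<unsigned char>(*Cur++);
  }
};

// Counts how many characters a cursor yields before reporting kEOF.
template <typename Cursor> std::size_t countChars(Cursor C) {
  std::size_t N = 0;
  while (C.next() != kEOF)
    ++N;
  return N;
}
```

With the five bytes `a b \0 c d` as input, the bounded cursor yields all five characters, while the sentinel cursor stops after the first two.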

Using git blame, I traced it to [this PR](https://reviews.llvm.org/D9883). I am not sure why the author modified MemoryBuffer instead of checking for the end of the buffer in LLLexer. I suggest that we remove this unit test and set the option that enables the null-termination check on MemoryBuffer to false.

https://github.com/llvm/llvm-project/pull/162013

