[llvm-dev] UB in MemoryBufferMMapFile

Chris Lattner via llvm-dev llvm-dev at lists.llvm.org
Thu Nov 17 21:29:46 PST 2016


On Nov 17, 2016, at 9:13 AM, Justin Bogner <mail at justinbogner.com> wrote:
>>> This is UB, since we're reading one past an allocated region, and I have
>>> an internal test case where I'm seeing it crash because that memory
>>> happens to be unreadable.
>> 
>> Well that is wrong.  It should mmap the file, and if it knows that the
>> file is an exact multiple of a page size, it should copy it into a
>> malloc’d buffer that is one byte larger, and put a zero there.
> 
> This doesn't seem to be working correctly, unless
> sys::Process::getPageSize() is returning the wrong thing for me.
> 
> I'm getting here from getOpenFileImpl() in MemoryBuffer.cpp, which is
> deciding it's safe to mmap (by calling shouldUseMMap, where the page
> size check you mention is). The file is 561152 bytes, and the page
> size reports 16384, so this is not a multiple of page size (0x89000 &
> 0x3fff == 0x1000). Yet, the memory one byte past the mmap is not
> readable.

That sounds like sys::Process::getPageSize() is broken or that there is a kernel bug.  mmap is defined in terms of page granularity.

> 
>>> There are three options here:
>>> 
>>> 1. Remove the assert - it isn't possible to write correctly.
>> 
>> This will break clang.  Clang “knows” that files end with a nul byte.
> 
> Strictly speaking, it won't "break" clang, it will just make a certain
> class of bugs harder to track down.

Clang’s lexer performance depends on this.  I agree it won’t break the functionality of clang, but regressing compile time perf isn’t acceptable either.

-Chris


More information about the llvm-dev mailing list