[PATCH] D30010: Improve the robustness of mmap

Pavel Labath via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 17 04:22:39 PST 2017


labath added a comment.

In https://reviews.llvm.org/D30010#679201, @zturner wrote:

> In https://reviews.llvm.org/D30010#679198, @ruiu wrote:
>
> > I wonder what this patch is trying to fix on Unix machines. mmap()'ing a file on NFS is not a special thing, and if something goes wrong with NFS, a process will hang whether a file is being used through read(2)/write(2) or mmap. What am I missing?
>
>
> If you just `read()` the entire contents, in order for the call to fail the NFS would have to go down during the very short period of time where you are reading the entire file contents into memory.  If you `mmap` the contents and then keep the mapping open for, say several hours, then if the NFS has a hiccup at any point during those several hours, a call to read could crash.  Any time you are dealing with a network file system you of course might encounter problems, I think this just minimizes the likelihood of those problems occuring in practice.
>
> The motivation for this change came when I tried to replace LLDB's mmap code with LLVM's.  But I was told that checking if the file is local is a requirement, because they've had bugs in the past where they crash all of Xcode because they try to mmap some debug info off of a network share and then some time later during debugging, the share goes down and `read` crashes all of Xcode.


The thing here is that the failure modes for read() are much saner than for mmap()ed memory. If the NFS share goes down *after* the read(), nothing happens. If it goes down *during* the read(), the syscall will just return -1, and that's an error you can easily handle.

In the mmap case, if the NFS share goes down after you have mmap()ed the memory, that memory will be silently removed from your address space, and next time you try to access it you will get a SIGBUS. It is still possible to recover from this situation, but it is much more tricky.

That said, local files are also not immune to the SIGBUS issue - the same thing can happen if someone does an truncate(2) on the mmap()ed file, but I guess those kinds of issues are much more rare in normal circumstances.



================
Comment at: llvm/lib/Support/Unix/Path.inc:68
 #include <sys/types.h>
-#if !defined(__APPLE__) && !defined(__OpenBSD__) && !defined(__ANDROID__)
+#if !defined(__APPLE__) && !defined(__OpenBSD__) && !defined(__FreeBSD__) &&   \
+    !defined(__ANDROID__)
----------------
krytarowski wrote:
> zturner wrote:
> > krytarowski wrote:
> > > Is this equivalent to `defined(__linux__) || defined(__NetBSD__)`? If so it might look easier to parse presented this in the way of what system is used for this.
> > If you scroll down the file about 30 lines you'll see this:
> > 
> > ```
> > #if defined(__FreeBSD__) || defined (__NetBSD__) || defined(__Bitrig__) || \
> >     defined(__OpenBSD__) || defined(__minix) || defined(__FreeBSD_kernel__) || \
> >     defined(__linux__) || defined(__CYGWIN__) || defined(__DragonFly__) || \
> >     defined(_AIX)
> > ```
> > 
> > I don't know the ins and outs of all these definitions, but I'm pretty sure at least some of them will fall into this `#if` branch.  And there could even be other ones that are not listed here but which define other platforms.  So I'm not entirely sure what the inverse of this conditional is.
> Yes I saw this switch and I was thinking how many OS'es are really used there. Other than that NetBSD platform part looks fine. I will test build it.
I guess the proper solution here would be to do a check at cmake-time for the appropriate symbols, and then branch here based on that.


https://reviews.llvm.org/D30010





More information about the llvm-commits mailing list