[llvm-dev] [pre-RFC] Data races in concurrent ThinLTO processes

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 26 20:54:15 PDT 2018


Le jeu. 22 mars 2018 à 16:46, Steven Wu <stevenwu at apple.com> a écrit :

> Hi Katya,
>
> Thanks for investigating this. Here are my thoughts inline.
>
> On Mar 22, 2018, at 1:32 AM, katya.romanova at sony.com wrote:
>
>
> Hello,
>
> I am sending the following proposal to discuss issues and solutions
> regarding data races in concurrent ThinLTO processes.
>
> This caught my attention when we encountered a race condition in ThinLTO
> with caching.
> While looking into how ThinLTO deals with cache file
> reads/writes/deletes, I spotted a number of problems: some of them are
> related to data races, others to file/resource sharing between different
> processes. I want to point out these problems and start a discussion
> about potential solutions. I would like to get your feedback.
>
>
> *Problem #1: In certain common situations, ‘rename’ can fail, causing a
> data race later on.*
> Look at the following code in the function ‘write’ in
> ThinLTOCodeGenerator.cpp
>
> std::error_code EC =
>     sys::fs::createTemporaryFile("Thin", "tmp.o", TempFD, TempFilename);
> if (EC) {
>   errs() << "Error: " << EC.message() << "\n";
>   report_fatal_error("ThinLTO: Can't get a temporary file");
> }
> {
>   raw_fd_ostream OS(TempFD, /* ShouldClose */ true);
>   OS << OutputBuffer.getBuffer();
> }
> // Rename to final destination (hopefully race condition won't matter here)
> EC = sys::fs::rename(TempFilename, EntryPath);
>
> Compared to a race-prone direct write of the buffer to the final file,
> this code avoids a data race by first writing the buffer to a temp file
> and then renaming that temp file to become the final file in the cache.
> After r315079, when ‘rename’ became more POSIX-compliant, this scheme
> guarantees atomicity of writing a file to the cache (except on Wine,
> where in some cases the non-atomic ::MoveFileExW is used).
>
> However, ‘rename’ might fail and return an error code in several
> different situations. If this happens, we fall back to a direct write to
> the file, which is neither atomic nor race-free (causing problem #3).
> An example where 'rename' can fail on both Windows and Unix-like systems,
> causing us to fall back to the non-atomic write, is problem #2.
>
> *Solution:*
> All solutions for problem #3 (see below) will take care of problem #1.
>
>
> I don't think there is a race here, but it can probably be optimized. I
> don't see LTOCodeGenerator fall back to a non-atomic write in the code.
> If the rename fails, it will just call report_fatal_error and exit. Maybe
> a better solution is to just leave the temp file and use it as the
> output. The file will be recompiled if needed again (it turns into a
> cache miss).
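>
> A minimal sketch of that suggestion (the helper name and surrounding
> control flow are assumptions, not the current ThinLTOCodeGenerator code):
>
> #include "llvm/Support/FileSystem.h"
> #include <string>
>
> // On rename failure, keep the temp file and return its path so the
> // caller consumes it directly instead of rewriting EntryPath
> // non-atomically.
> static std::string commitCacheEntry(const std::string &TempFilename,
>                                     const std::string &EntryPath) {
>   if (llvm::sys::fs::rename(TempFilename, EntryPath))
>     return TempFilename; // next build sees a cache miss for this entry
>   return EntryPath;
> }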
>
>
>
> *Problem #2 (Use of POSIX-compliant ‘rename’ function is not always
> suitable for ThinLTO)*
> We just encountered this issue at Sony. For ThinLTO, we use the function
> ‘rename’, which after r315079 doesn’t support renaming across different
> logical volumes on Windows.
> Let’s say a user’s %TEMP% directory is on volume C:\, but the caching
> directory they passed is located on volume D:\; then ‘rename’ fails.
> Since we don’t support renaming across different volumes after r315079,
> we then fall back to writing to the cache file directly (which is not
> atomic) and hit a data race.
> We anticipate that this will be a common situation for our users, many of
> whom are Windows “power users” and have multiple disks for different
> purposes.
>
> This problem doesn’t only affect Windows. The same issue might happen on
> Linux/macOS if the user’s $TEMP/$TMP directory is located on a different
> file system than the user-specified caching directory.
>
> *Solution #1 (not so good):*
> The patch below solves this problem on Windows; however, we lose the
> atomicity that ‘rename’ gained in r315079, which is not good.
>
> This patch assumes that the function 'rename' in Windows/Path.inc is
> expected to work when the source and destination are located on different
> logical volumes or different file systems. Note that the function
> ‘rename’ is not ThinLTO-specific and might have other consumers who want
> this behavior (which was the behavior before r315079). However, it seems
> that this assumption changed later, to make ‘rename’ more
> POSIX-compliant. In that case, we should add a comment to the function
> 'rename' so that its users know that renaming across different volumes is
> not supported by design. Or we could have two different functions: one
> POSIX-compliant, not allowing renames across different volumes, and
> another non-POSIX-compliant one that does (a hypothetical sketch
> follows).
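>
> A hypothetical shape for such a split (the second name is illustrative,
> not an existing LLVM API):
>
> #include "llvm/ADT/Twine.h"
> #include <system_error>
>
> namespace llvm {
> namespace sys {
> namespace fs {
> /// POSIX-compliant: atomic, but fails when From and To are on different
> /// volumes (the post-r315079 behavior).
> std::error_code rename(const Twine &From, const Twine &To);
> /// Non-POSIX variant: may degrade to copy+delete across volumes, so it
> /// is not atomic (the pre-r315079 behavior).
> std::error_code rename_allowing_copy(const Twine &From, const Twine &To);
> } // namespace fs
> } // namespace sys
> } // namespace llvm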
>
> Before r315079, the function 'rename' (in the case when the destination
> file isn't opened) boiled down to calling the function 'MoveFileExW' with
> the MOVEFILE_COPY_ALLOWED flag set, allowing files to be renamed across
> volumes.
> The current implementation of the function ‘rename’ (in
> 'rename_internal') essentially calls 'SetFileInformationByHandle'. This
> function is fast and atomic, so in a sense it's an improvement over the
> previously used 'MoveFileExW'. However, 'SetFileInformationByHandle'
> doesn't work when the source and destination are located on different
> volumes.
>
> My fix simply falls back to calling 'MoveFileExW' when
> 'SetFileInformationByHandle' returns the error 'ERROR_NOT_SAME_DEVICE'.
>
>
> +    // Wine doesn't support SetFileInformationByHandle in rename_internal.
> +    // Rename_internal doesn't work across different disks. In both of these
> +    // cases fall back to using MoveFileEx.
>      if (EC ==
> -        std::error_code(ERROR_CALL_NOT_IMPLEMENTED, std::system_category())) {
> -      // Wine doesn't support SetFileInformationByHandle in rename_internal.
> -      // Fall back to MoveFileEx.
> +        std::error_code(ERROR_CALL_NOT_IMPLEMENTED, std::system_category()) ||
> +        EC == std::error_code(ERROR_NOT_SAME_DEVICE, std::system_category())) {
>        SmallVector<wchar_t, MAX_PATH> WideFrom;
>        if (std::error_code EC2 = realPathFromHandle(FromHandle, WideFrom))
>          return EC2;
>        if (::MoveFileExW(WideFrom.begin(), WideTo.begin(),
> -                        MOVEFILE_REPLACE_EXISTING))
> +                        MOVEFILE_COPY_ALLOWED | MOVEFILE_REPLACE_EXISTING))
>          return std::error_code();
>        return mapWindowsError(GetLastError());
>      }
>
> Note that when the source and destination are located on different
> logical volumes, we effectively perform a copy followed by a delete,
> which is not atomic. Since copying to a different volume might be quite
> time-consuming, we also increase the probability that some other process
> starts renaming to the same destination file in the meantime.
>
> So, this fix for ‘rename’ is not good for ThinLTO (we should do something
> else in this case). ‘rename’ is a generic function with many different
> users, but unless renaming across volumes is needed by someone other than
> ThinLTO, this patch shouldn’t be accepted.
>
>
> *Solution #2 (better)*
> Here I only try to fix the ThinLTO issue, not the generic ‘rename’ issue
> as in solution #1:
>
> For ThinLTO+caching we don’t have to create temp files in the %TEMP% or
> %TMP% directories (or $TMP/$TEMP). We can create them in the same
> directory where the cached files are stored, or in a ‘temp’ subfolder
> within it (see the sketch after this list). With this approach:
> -  If the patch described in Solution #1 gets accepted (for the sake of
> other consumers), then from ThinLTO’s perspective ‘rename’ stays atomic.
> -  If the patch doesn’t get accepted, then rename won’t fail, since we
> only rename files within the same directory.
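>
> A minimal sketch of this approach, assuming the LLVM Support APIs below
> (the model string is illustrative):
>
> #include "llvm/ADT/SmallString.h"
> #include "llvm/ADT/Twine.h"
> #include "llvm/Support/FileSystem.h"
>
> static std::error_code createTempInCacheDir(const llvm::Twine &EntryPath,
>                                             int &TempFD,
>                                             llvm::SmallString<128> &TempPath) {
>   // "<EntryPath>.tmp.%%%%%%" lives in the cache directory itself, so the
>   // later sys::fs::rename never crosses a volume boundary and stays atomic.
>   return llvm::sys::fs::createUniqueFile(EntryPath + ".tmp.%%%%%%", TempFD,
>                                          TempPath);
> }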
>
> *Solution #3*
> All the solutions for problem #3 (see below) will take care of problem #2.
>
>
> I am surprised that we are not doing #2 already. Rename across volumes is
> not atomic, so it can cause issues.
>
>
>
> *Problem #3 (data race)*
> Look at the following code in ThinLTOCodeGenerator.cpp. This code gets
> executed if the ‘rename’ function fails (e.g., because of problem #1 or
> problem #2 described above).
>
> EC = sys::fs::rename(TempFilename, EntryPath);
> if (EC) {
>   sys::fs::remove(TempFilename);
>   raw_fd_ostream OS(EntryPath, EC, sys::fs::F_None);
>   if (EC)
>     report_fatal_error(Twine("Failed to open ") + EntryPath +
>                        " to save cached entry\n");
>   OS << OutputBuffer.getBuffer();  // Two ThinLTO processes can write to the
>                                    // same file here, causing a data race.
>
> Here, we have a data race problem. We actually stumbled across it while
> testing one of the games with ThinLTO+caching.
>
> We need to find a generic solution to the data race problem, while
> ensuring atomicity of writes to the cache files. I’m not a ThinLTO expert
> and I might not be aware of some ThinLTO constraints. Let me know which
> problems you see with the solutions below. Hopefully, this will trigger
> further discussion.
>
>
> *Solution #0 (it won’t require many modifications):*
> (a)   If ‘rename’ fails, do not write into the cache directly (we don’t
> want non-atomic writes!).
> (b)   If ‘rename’ fails, try to read from the cache:
> auto ReloadedBufferOrErr = CacheEntry.tryLoadingBuffer();
> (c)   If reading from the cache fails, simply use the file that was just
> compiled directly, bypassing the cache entirely (see the sketch below).
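>
> A minimal sketch of (a)-(c), reusing the names from the existing write
> path (the exact control flow here is an assumption):
>
> EC = sys::fs::rename(TempFilename, EntryPath);
> if (EC) {
>   sys::fs::remove(TempFilename);
>   // (a) Skip the non-atomic direct write entirely.
>   // (b) Another process may have populated the entry; try reading it.
>   auto ReloadedBufferOrErr = CacheEntry.tryLoadingBuffer();
>   if (!ReloadedBufferOrErr)
>     // (c) Bypass the cache and use the just-compiled OutputBuffer as is.
>     return;
>   OutputBuffer = std::move(*ReloadedBufferOrErr);
> }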
>
> This solution might cause cache misses (and hence recompilations), but at
> least it prevents data races when two ThinLTO processes write to the same
> file (which can produce an invalid cache!). Correctness is more important
> than optimization. It’s worth noting another shortcoming of this
> solution: unlike the generic solutions #1, #2 and #3 described below, it
> won’t solve problem #4.
>
> BTW, we should do something about Wine support in ‘rename’ (which is
> non-atomic there). I didn’t want to list it as a separate problem here,
> but it is a problem. I can file a separate bug if necessary.
>
>
> *Solution #1 (naïve generic solution):*
> (a) If a process wants to write to a cache file, it opens it with
> 'exclusive read/write' access (see the sketch after this list).
> (b) If a process wants to write to a cache file that is already opened
> ‘exclusively’ by some other process, we can assume that the cache file
> will be successfully created by the first process and simply return from
> the ‘write’ function. Different processes writing to a cache file of the
> same name are writing the same content, right?
> (c) If a process needs to read from a cache file, it might wait a bit
> while the file is open for 'exclusive write'. If within a certain period
> the cache file doesn’t get released by the writer, or it gets removed by
> the pruner, oh well, we have a cache miss. After all, the cache is just
> an acceleration. Correctness is more important.
> (d) If we are concerned about the data integrity of a cache file (e.g., a
> writer started to write to a file and something bad happened in the
> middle), we could do additional checks (I suspect this might already be
> implemented in ThinLTO). A writer could add a unique hash at the end of
> the cache file, or a CRC32 for each section of the file, as an indication
> that the write to this file was successful. A reader then checks that
> this unique hash or CRC32 checksum matches its expectations.
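>
> A POSIX-flavored sketch of (a)/(b), assuming the entry is created with
> O_EXCL (this is not the current LLVM code):
>
> #include <errno.h>
> #include <fcntl.h>
>
> // Returns the FD on success, or -1 if another process already owns (or
> // has already produced) the same cache entry, in which case we simply
> // skip the write, per (b).
> static int openCacheEntryExclusive(const char *EntryPath) {
>   int FD = ::open(EntryPath, O_CREAT | O_EXCL | O_WRONLY, 0644);
>   if (FD < 0 && errno == EEXIST)
>     return -1;
>   return FD;
> }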
>
> I'm sure there are good reasons why this "naive" solution wasn't
> implemented in the first place. If this solution is picked by the LLVM
> community, I could write a working prototype for it that we could apply
> to ThinLTO.
>
>
> *Solution #2 (better generic solution):*
> It looks like cache file sharing between several ThinLTO processes is a
> classic readers-writers problem.
> https://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem
>
> We could use read-write locks for global inter-process file sharing.
> https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock
>
> ThinLTO's writer (creator/deleter) must acquire the write lock before
> writing to a file and release it when it's done writing. ThinLTO’s reader
> must acquire the read lock before reading a file and release it when it's
> done reading. On Unix-like systems, read-write locks are part of POSIX
> threads (pthreads). There is an implementation of Slim reader/writer
> locks on Windows, but unfortunately it is in-process only (i.e., the
> locks can’t be shared among different processes), so it won’t work for
> us.
>
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa904937(v=vs.85).aspx
>
> However, on Windows, we could implement read-write locks ourselves, using
> a combination of a named mutex and a named semaphore (any unique global
> name could be used for creating the named mutex/semaphore; the simplest
> is the cache file name itself).
>
> Here is an example of an implementation:
>
> http://www.zachburlingame.com/2011/06/simple-reader-writer-lock-in-c-on-win32/
>
> If this solution is picked, I could write a working prototype for it that
> we could apply to ThinLTO.
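>
> A hedged sketch in the spirit of the linked article: an inter-process
> reader-writer lock built from a named mutex plus a named semaphore.
> MaxReaders and the naming scheme are illustrative assumptions.
>
> #include <windows.h>
> #include <string>
>
> class InterProcessRWLock {
>   HANDLE Mutex;     // serializes writers and gates new readers
>   HANDLE Semaphore; // counts available reader slots
>   static const LONG MaxReaders = 64;
>
> public:
>   explicit InterProcessRWLock(const std::wstring &Name)
>       : Mutex(::CreateMutexW(nullptr, FALSE, (Name + L".mtx").c_str())),
>         Semaphore(::CreateSemaphoreW(nullptr, MaxReaders, MaxReaders,
>                                      (Name + L".sem").c_str())) {}
>   ~InterProcessRWLock() {
>     ::CloseHandle(Semaphore);
>     ::CloseHandle(Mutex);
>   }
>
>   void lockRead() {
>     // Pass the writer gate, then take one reader slot.
>     ::WaitForSingleObject(Mutex, INFINITE);
>     ::WaitForSingleObject(Semaphore, INFINITE);
>     ::ReleaseMutex(Mutex);
>   }
>   void unlockRead() { ::ReleaseSemaphore(Semaphore, 1, nullptr); }
>
>   void lockWrite() {
>     // Hold the gate so no new reader enters, then drain every reader
>     // slot so all active readers have finished.
>     ::WaitForSingleObject(Mutex, INFINITE);
>     for (LONG I = 0; I < MaxReaders; ++I)
>       ::WaitForSingleObject(Semaphore, INFINITE);
>   }
>   void unlockWrite() {
>     ::ReleaseSemaphore(Semaphore, MaxReaders, nullptr);
>     ::ReleaseMutex(Mutex);
>   }
> };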
>
> *Solution #3 (hybrid generic solution)*
> Basically, use solution #2 as a base and integrate some of the
> optimizations and completeness checks from solution #1 (1.b and 1.d,
> respectively).
>
> *Problem #4 (cache misses that could have been avoided via
> synchronization)*
>
> With our current approach we might often race for cache file resources
> and, as a result, have cache misses.
> Of course, this is not as dangerous as the data race in problem #3, but
> it definitely lowers the efficiency of caching.
>
> (a) The reader checks that a file exists, but the pruner deletes it
> before the reader has a chance to read it. Cache miss.
> (b) The reader starts reading a file (block by block) and at this moment
> the pruner removes the file. The read might fail. This is OS-specific,
> but likely a cache miss.
> (c) If the writer renames a temp file onto a file that is currently being
> read by a reader, I don't expect anything good out of it (the behavior is
> OS-specific). In the best-case scenario the read fails; in the worst one
> the reader reads the wrong content. So it's a cache miss or a correctness
> issue.
> (d) This is not a very likely event, but what if two processes try to
> rename to the same file at the same time?
>
> *Solution:*
> Solutions #1, #2 and #3 for problem #3 will take care of problem #4.
>
>
> *(Potential) problem #5:*
> The implementation of the function ‘rename’ on Windows used by ThinLTO
> heavily depends on the function CreateFileW or, to be more exact, on its
> flag FILE_SHARE_DELETE being supported and working correctly on *all* the
> file systems that can be mounted on Windows. Is the FILE_SHARE_DELETE
> flag supported on all non-native Windows file systems that we care about?
> Is it supported on Wine? On FAT32?
>
> If the flag FILE_SHARE_DELETE is not supported, ‘rename’ fails, and we
> have problem #3 (a sketch of the dependency in question follows).
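>
> For illustration, this is the kind of open such code relies on; whether
> the requested sharing mode is honored depends on the file system (this
> probe is hypothetical, not code from Windows/Path.inc):
>
> #include <windows.h>
>
> // Try to open Path while permitting concurrent deletion/renaming. If the
> // underlying file system does not support FILE_SHARE_DELETE, operations
> // that depend on it (like the POSIX-style rename) cannot work reliably.
> static bool openWithShareDelete(const wchar_t *Path, HANDLE &H) {
>   H = ::CreateFileW(Path, GENERIC_READ,
>                     FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
>                     nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
>   return H != INVALID_HANDLE_VALUE;
> }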
>
> *Solution:*
> All the solutions for problem #3 will take care of problem #5.
>
>
> Please let me know what you think about the problems/solutions discussed
> here. Hopefully this pre-RFC shapes into an RFC soon.
>
>
> I am not really concerned with any of the problems except Problem #2.
> From my understanding, the LTO cache is for incremental builds only. For
> every file in the cache, there should be one single writer; otherwise, it
> is a hash collision and will result in a miscompile (or, most likely, a
> link error). The cached file will only be used when you rebuild the
> binary. Lots of the data races are in situations that are not supported.
>
> With its current design, you cannot share the LTO cache between two
> links. For example, if you are building llc and libLTO from LLVM and they
> share many of the same static archives with common LTO object files, you
> still need to use two different LTO caches.
>

Why do we need to use two different caches?
In my memory, a ThinLTO build of LLVM does use a single cache and shares
and reuses objects across binaries when possible.

Best,

-- 
Mehdi