[PATCH] D38570: Support: Rewrite Windows implementation of sys::fs::rename to be more POSIXy.

Peter Collingbourne via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 5 10:32:15 PDT 2017


On Wed, Oct 4, 2017 at 7:35 PM, Zachary Turner <zturner at google.com> wrote:

> Ahh, i see what you’re saying. Dest *becomes* Source for a short time.
>
> The main thing I’m concerned about with this new implementation is that it
> increases the complexity quite a bit, and in doing so makes the semantics
> harder to reason about. How many additional weird codepaths in the kernel
> is this tickling now? So there could be more (or at least different) race
> conditions.
>

That was a concern for me as well, but it is somewhat mitigated by the fact
that it is essentially a reimplementation of ReplaceFile, which is already
part of the Windows API, so I'd expect the kernel code paths to be
relatively well exercised. From MSDN: "An application can call ReplaceFile
instead of calling separate functions to save the data to a new file,
rename the original file using a temporary name, rename the new file to
have the same name as the original file, and delete the original file."
i.e. what my code is doing.

Would it be possible for the ThinLTO processes to map a shared memory
> region and synchronize on it instead of relying on flaky file system
> semantics?
>

Not easily. The cache needs to be persisted between linker invocations, so
I think there would need to be something on the file system. I suppose that
you could imagine mmap'ing a large sparse file into every process and
synchronizing on that, but there is also a mechanism for pruning old cache
entries in order to save disk space (which may run concurrently with
another process that is adding entries to the cache), so even if we did
have an mmap'ed file, we'd need to arrange for the pruner to release pages
to free storage, and to do so without races.

As long as you’re relying on the filesystem i don’t think any amount of
> complexity will save you from every race condition.
>
> I’m not *completely* opposed here, but I’m a little concerned about the
> additional complexity
>

My view is that any alternatives to the file system would come with its own
complexity (which my gut says would be higher than using the file system
directly) and in the end would have to interact with the file system at
some level in order to persist data.

Peter

On Wed, Oct 4, 2017 at 7:23 PM Peter Collingbourne <peter at pcc.me.uk> wrote:
>
>> On Wed, Oct 4, 2017 at 7:01 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> But the Source file is just a temporary file right that’s not (yet) part
>>> of the cache right? So why would another process be accessing the same
>>> temporary file instead of creating a different one?
>>>
>>
>> My point is that Source receives the name of Dest during the call to
>> SetFileInformationByHandle. So after SetFileInformationByHandle returns
>> there is a file with the name Dest that was opened (as Source) without
>> sharing, so concurrent accesses to Dest fail until the handle is closed.
>>
>> Is the Source file already part of the cache for some reason?
>>>
>>
>> No.
>>
>> Peter
>>
>>
>>> Btw, I have an (absolutely terrible) idea how to solve this “for real”,
>>> but I’ll save it until I understand the problem better
>>> On Wed, Oct 4, 2017 at 6:38 PM Peter Collingbourne via Phabricator <
>>> reviews at reviews.llvm.org> wrote:
>>>
>>>> pcc added a comment.
>>>>
>>>> In https://reviews.llvm.org/D38570#889021, @zturner wrote:
>>>>
>>>> > So if I understand the problem correctly, you don't have any issue
>>>> with `ReplaceFile`'s behavior on the destination, only the source.  So, for
>>>> example, you call `ReplaceFile`, it returns, and then some other process
>>>> tries to open the //source// file and fails.
>>>>
>>>>
>>>> Not quite. From the ThinLTO perspective I don't care about the
>>>> behaviour on the source file before it is renamed, because all source files
>>>> are temporary files whose names are unknown to other processes. The problem
>>>> is with the behaviour on the source file after it has been renamed to the
>>>> destination file. ThinLTO does this for each new cache file that it creates:
>>>>
>>>>   rename("thinlto-cache/temporary file name", "thinlto-cache/llvmcache-ABC123")
>>>> // ABC123 is a unique hash that is computed from the linker's input files
>>>> and command line options. There is a many-to-one mapping from hashes to
>>>> object files (modulo collisions, but that's unlikely enough not to matter).
>>>>
>>>> In parallel, another process will be doing this in order to try and
>>>> open an existing cache file:
>>>>
>>>>   open("thinlto-cache/llvmcache-ABC123") // This is expected to either
>>>> succeed (cache hit) or return "no such file or directory" (cache miss).
>>>>
>>>> Where this breaks is when `rename` is implemented in terms of
>>>> `ReplaceFile`. In pseudocode, here's what I think is going on in the
>>>> important part of `ReplaceFile`:
>>>>
>>>>   void ReplaceFile(char *Src, char *Dst) {
>>>>     HANDLE h = CreateFile(Src, no sharing); // (1) Src cannot be opened
>>>>     SetFileInformationByHandle(h, Rename, Dst); // (2) Now Dst cannot
>>>> be opened
>>>>     CloseFile(h); // (3) Now Dst can be opened
>>>>   }
>>>>
>>>> So if the `open` happens between 2 and 3, it will get a permission
>>>> denied error, which is not what the caller of `open` is expecting.
>>>>
>>>>
>>>> https://reviews.llvm.org/D38570
>>>>
>>>>
>>>>
>>>>
>>
>>
>> --
>> --
>> Peter
>>
>


-- 
-- 
Peter
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171005/015d047d/attachment.html>


More information about the llvm-commits mailing list