[PATCH] D114966: [clang][deps] Split stat and file content caches

Jan Svoboda via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 16 08:45:46 PST 2021


jansvoboda11 added inline comments.


================
Comment at: clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp:200-210
+          // FIXME: This read can fail.
+          // In that case, we should probably update `CacheEntry::MaybeStat`.
+          // However, that is concurrently read at the start of this function
+          // (`needsUpdate` -> `isReadable`), leading to a data race. If we try
+          // to avoid the data race by returning an `error_code` here (without
+          // updating the entry stat value), we can end up attempting to read the
+          // file again in the future, which breaks the consistency guarantees
----------------
dexonsmith wrote:
> dexonsmith wrote:
> > I'm not quite following this logic. I think it's safe (and important!) to modify MaybeStat if `read()` fails.
> > 
> > We're in a critical section that either creates and partially initializes the entry, or incrementally updates it.
> > 
> > In the "creates and partially initializes" case:
> > - All other workers will get nullptr for `Cache.getEntry()` and try to enter the critical section.
> > - We have just seen a successful MaybeStat value.
> > - `needsRead()` will be `true` since we have not read contents before. We will immediately try to read.
> > - `read()` should open the original contents and can safely:
> >     - on success, update the value for MaybeStat to match the observed size
> >     - on failure, drop the value and set the error for MaybeStat to the observed error
> > - When we leave the critical section, either:
> >     - MaybeStat stores an error; no thread will enter the critical section again
> >     - OriginalContents are initialized and `needsRead()` returns false
> > 
> > In the "incrementally updates" case:
> > - `needsRead()` returns false so `read()` will not be called
> > 
> The key property being the last bullet of the first case: that the "create and partially initializes" case guarantees that either `MaybeStat` stores an error (`isReadable()` is false) or `OriginalContents` is initialized (`needsRead()` returns false).
> 
> ----
> 
> I think I've found the part I was missing: this critical section is for a SharedCacheFileEntry (associated with a filename), but `OriginalContents` is a field on a CachedFileContents which could apply to other UIDs (and `SharedCache.getFileContents()` is actually "get or create"). Since this commit creates the UID map, it seems like the race might get worse here? (Not sure)
> 
> ----
> 
> Another problem: the UID can change between the first stat above (call to `getUnderlyingFS().status(Filename)`) and the one inside `read()` if another process is writing at the same time. We can't trust the UID mapping from the first `status()` call unless content already exists for that UID.
> 
> I think to avoid this race you need to delay creating "UID to content" map entry until there is the result of a successful `read()` to store.
> 
> I'll describe an algorithm that I think is fairly clean that handles this. I'm using different data structure names to avoid confusion since I've broken it down a little differently:
> - ReadResult: stat (for directories) OR error and uid (failed read) OR stat and content (and optional minimized content and pp ranges, and a way to update them atomically)
> - FilenameMap: map from Filename to `ReadResult*` (shared and sharded; mirrored locally in each worker)
> - UIDMap: map from UID to `ReadResult*` (shared and sharded; probably no local mirror)
> 
> And here's the algorithm:
> ```
> lang=c++
> // Top-level API: get the entry/result for some filename.
> ErrorOr<ReadResult &> getOrCreateResult(StringRef Filename) {
>   if (ReadResult *Result = lookupEntryForFilename(Filename))
>     return minimizeIfNecessary(*Result, ShouldMinimize);
>   if (ErrorOr<ReadResult &> Result = computeAndStoreResult(Filename))
>     return minimizeIfNecessary(*Result, ShouldMinimize);
>   else
>     return Result.getError();
> }
> // Compute and store an entry/result for some filename. Returned result
> // has in-sync stat+read info (assuming read was successful).
> ErrorOr<ReadResult &> computeAndStoreResult(StringRef Filename) {
>   ErrorOr<Status> Stat = UnderlyingFS->status(Filename);
>   if (!Stat)
>     return Stat.getError(); // Can't cache missing files.
>   if (ReadResult *Result = lookupEntryForUID(Stat->UID))
>     return storeFilenameEntry(Filename, *Result); // UID already known.
>   // UID not known. Compute a ReadResult.
>   //
>   // Unless this is a directory (where we don't need to go back to the FS),
>   // ignore existing 'Stat' because without an open file descriptor the UID
>   // could change.
>   Optional<ReadResult> Result;
>   if (Stat->isDirectory())
>     Result = ReadResult(*Stat);
>   else if (ErrorOr<ReadResult> MaybeResult = computeReadResult(Filename))
>     Result = std::move(*MaybeResult);
>   else
>     return MaybeResult.getError(); // File disappeared...
>   // Store the result. Cascade through UID then Filename. Each level could
>   // return a different result than it was passed in.
>   return storeEntryForFilenameOrReturnExisting(Filename,
>              storeEntryForUIDOrReturnExisting(std::move(*Result)));
> }
> // Lookup existing result in FilenameMap. No mutation. First checks local map
> // then falls back to the shared map (locks shard, lookup, unlocks, saves in
> // local map, returns).
> ReadResult *lookupEntryForFilename(StringRef Filename);
> // Lookup existing result in UIDMap. No mutation. No local map, just a shared
> // map (lock shard + lookup + return).
> ReadResult *lookupEntryForUID(UniqueID);
> // Compute read result using a single file descriptor.
> // - Return error if `open()` fails. Can't cache missing files.
> // - Else compute ReadResult: stat the open file descriptor and get a memory buffer from it.
> // Note: "Error" state if stat fails.
> // Note: "Error" state if stat succeeds and memory buffer does not open.
> // Note: if the memory buffer opens successfully, status updated with observed size.
> // Note: does not take a UID parameter since live FS could have changed.
> // Note: does not access or mutate UIDMap/FilenameMap/etc.
> ErrorOr<ReadResult> computeReadResult(StringRef Filename);
> // Compare-exchange. Pulls UID out of NewResult. Locks shard for UIDMap[UID]; checks for
> // existing result; if none, bump-ptr-allocates and stores NewResult; returns stored
> // result.
> ReadResult& storeEntryForUIDOrReturnExisting(ReadResult &&NewResult);
> // Compare-exchange. Locks shard for FilenameMap[Filename]; checks for existing result;
> // if none, stores parameter; unlocks; updates local map with stored result and returns
> // it.
> ReadResult& storeEntryForFilenameOrReturnExisting(StringRef Filename, ReadResult &);
> // If needed and missing, adds minimization info atomically. Note that Result
> // may store a cached read error, or a directory.
> ReadResult& minimizeIfNecessary(ReadResult& Result, bool ShouldMinimize);
> ```
> The only thing "lost" is that two workers might both compute a ReadResult for the same file (the slower one having the work dropped on the floor). I'm skeptical this will matter in practice. If some measurement says it does, the FilenameMap could map from Filename to `unique_ptr<pair<mutex,atomic<ReadResult*>>>` and `computeAndStoreResult()` could take a lock at the start and re-check the map... but IMO it's better to make this simple and optimize for the common case.
> 
> This does leave behind an extended critical section in `minimizeIfNecessary()` to avoid racing to minimize, implying there's a mutex in `ReadResult` for updating the MinimizedContents and PPRanges. I suspect minimization is fast enough (and racing workers rare and cheap enough) that the memory overhead of a per-ReadResult mutex is a bad tradeoff. A simple alternative would be to compute the minimized contents before the critical section, then take a "store-minimization" lock just for setting the values (the mutex could be sharded by `UID % 64` or something), but I'm less confident about that.
> 
> This also leaves behind the double-stat behaviour (before and after open) for new files. New files should be pretty rare in the depscanner so maybe this is fine; I've also observed cases where failed `open` is slower than failed `stat`, so for the common case of missing files (which don't get cached...) this might be best.
This patch (and D115346) were motivated by D114971, which prevents minimization of symlinks that point to files referenced by precompiled dependencies (e.g. a PCH).

When the dependency scanning worker disables minimization of a file referenced by a precompiled dependency, my idea was to immediately "canonicalize" the filename to a `UniqueID` (through stat) and use that when deciding whether to minimize its contents in `getOrCreateFileSystemEntry`. With that approach, the "filename -> `UniqueID`" map/cache acts as the authority for stat information. However, I can see how this breaks when the FS is volatile. Besides the issue outlined in your last comment, the current approach in D114971 prolongs the window between the initial stat (when disabling file minimization - "configure time") and the read ("query time"), increasing the chances of observing filesystem volatility.
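As a hypothetical illustration (a toy in-memory model, not code from the patch), the staleness problem with canonicalizing to `UniqueID` up front can be sketched like this: a UID captured at "configure time" no longer matches what a later stat observes once another process rewrites the file in between.

```cpp
// Toy model (invented names): a filename -> UID mapping captured at
// "configure time" can disagree with the UID observed at "query time"
// if the file is replaced in between, so a decision keyed on the first
// UID may apply to the wrong contents.
#include <cstdint>
#include <map>
#include <string>

struct FakeFS {
  std::map<std::string, uint64_t> UIDs; // filename -> current UID
  uint64_t stat(const std::string &Name) const { return UIDs.at(Name); }
};

// Returns true when the cached mapping went stale between the two stats.
bool mappingGoesStale() {
  FakeFS FS;
  FS.UIDs["a.h"] = 1;
  uint64_t ConfigureUID = FS.stat("a.h"); // "configure time" canonicalization
  FS.UIDs["a.h"] = 2;                     // concurrent writer replaces the file
  uint64_t QueryUID = FS.stat("a.h");     // "query time" observation
  return ConfigureUID != QueryUID;
}
```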

I think your approach makes a lot of sense if we want to be really defensive against a volatile FS. Making sure we don't have to re-stat or re-read files that were already stat-ed during "configuration" or a previous "query" would be nice. Unfortunately, I think that means we need to actually **read** all input files of precompiled dependencies in D114971. Just stat-ing such files is no longer an option, since that would get us back to square one if we later need to read them (they might have changed).
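A toy sketch (invented names, not the patch's data structures) of why reading up front helps: once contents are captured in the cache, later queries are served from the cached buffer and never observe subsequent changes on disk.

```cpp
// Toy model: contents read on first query are cached; a later change on
// disk is not observed by subsequent queries, which keeps stat/read
// results internally consistent for the rest of the scan.
#include <map>
#include <string>

struct FakeFS {
  std::map<std::string, std::string> Files;
  std::string read(const std::string &Name) const { return Files.at(Name); }
};

struct ContentCache {
  std::map<std::string, std::string> Cached; // filename -> captured contents
  // Reads from the FS only on the first query; later queries hit the cache.
  const std::string &getOrRead(const std::string &Name, const FakeFS &FS) {
    auto It = Cached.find(Name);
    if (It == Cached.end())
      It = Cached.emplace(Name, FS.read(Name)).first;
    return It->second;
  }
};

// Demonstrates that a change on disk after the first read is not observed.
bool secondQueryIgnoresDiskChange() {
  FakeFS FS;
  FS.Files["pch_input.h"] = "original";
  ContentCache Cache;
  std::string First = Cache.getOrRead("pch_input.h", FS);
  FS.Files["pch_input.h"] = "changed"; // file mutates after the first read
  std::string Second = Cache.getOrRead("pch_input.h", FS);
  return First == "original" && Second == "original";
}
```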

I have implemented your idea locally, and will update this patch tomorrow.
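For reference, the "store or return existing" compare-exchange step from the algorithm above might look roughly like this (a simplified single-shard sketch with invented types, not the actual implementation):

```cpp
// Simplified sketch: lock the shard, return the existing entry if one is
// already stored for this UID, otherwise store the newly computed one.
// The loser of a race has its result dropped in favor of the stored entry.
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>

struct ReadResult {
  uint64_t UID;
  std::string Contents;
};

struct UIDMapShard {
  std::mutex Mu;
  std::map<uint64_t, std::unique_ptr<ReadResult>> Entries;

  // Stores NewResult under its UID unless an entry already exists; either
  // way, returns a reference to the entry that ends up in the map.
  ReadResult &storeOrReturnExisting(ReadResult &&NewResult) {
    std::lock_guard<std::mutex> Lock(Mu);
    auto It = Entries.find(NewResult.UID);
    if (It == Entries.end())
      It = Entries.emplace(NewResult.UID,
                           std::make_unique<ReadResult>(std::move(NewResult)))
               .first;
    return *It->second;
  }
};
```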


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114966/new/

https://reviews.llvm.org/D114966
