[PATCH] D136651: [Clang] Give Clang the ability to use a shared stat cache

Ben Langmuir via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 16 12:36:51 PST 2022


benlangmuir requested changes to this revision.
benlangmuir added inline comments.
This revision now requires changes to proceed.


================
Comment at: clang/include/clang/Driver/Options.td:3374
+def ivfsstatcache : JoinedOrSeparate<["-"], "ivfsstatcache">, Group<clang_i_Group>, Flags<[CC1Option]>,
+  HelpText<"Use the stat data cached in file instead of doing filesystemsyscalls. See clang-stat-cache utility.">;
 def ivfsoverlay : JoinedOrSeparate<["-"], "ivfsoverlay">, Group<clang_i_Group>, Flags<[CC1Option]>,
----------------
"filesystemsyscalls" -> "filesystem syscalls"


================
Comment at: clang/lib/Frontend/CompilerInvocation.cpp:5117
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> Buffer =
+        llvm::MemoryBuffer::getFile(File);
+    if (!Buffer) {
----------------
We should use `BaseFS->getBufferForFile` here in case the BaseFS is not the real filesystem. This matches how vfsoverlay works below and could avoid re-reading this file repeatedly during scanning where we have a caching filesystem at the base.


================
Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:176
+  if (EC) {
+    llvm::errs() << "fstat failed\n";
+    return false;
----------------
This could use more description of the error.


================
Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:291
+  IsCaseSensitive =
+      ::pathconf(TargetDirectory.c_str(), _PC_CASE_SENSITIVE) == 1;
+#endif
----------------
Is this pathconf extension Darwin-only?


================
Comment at: clang/tools/clang-stat-cache/clang-stat-cache.cpp:300
+  auto endTime = llvm::TimeRecord::getCurrentTime();
+  endTime -= startTime;
+
----------------
Nit: move subtraction to the print or create a new "duration" variable instead of mutating endTime. Same below for writeStatCache.


================
Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1113
+/// A ProxyFileSystem using cached information for status() rather than going to
+/// the real filesystem.
+///
----------------
s/real/underlying/ - it could be wrapping any filesystem in principle.


================
Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1116
+/// When dealing with a huge tree of (mostly) immutable filesystem content
+/// like and SDK, it can be very costly to ask the underlying filesystem for
+/// `stat` data. Even when caching the `stat`s internally, having many
----------------
typo: "and SDK" -> "an SDK"


================
Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1141
+  static Expected<IntrusiveRefCntPtr<StatCacheFileSystem>>
+  create(std::unique_ptr<llvm::MemoryBuffer> &&CacheBuffer,
+         StringRef CacheFilename, IntrusiveRefCntPtr<FileSystem> FS);
----------------
Nit: It's unusal to see `unique_ptr &&` instead of just `unique_ptr`.  Is this needed?  Same for the constructor.


================
Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1142
+  create(std::unique_ptr<llvm::MemoryBuffer> &&CacheBuffer,
+         StringRef CacheFilename, IntrusiveRefCntPtr<FileSystem> FS);
+
----------------
Is `CacheFilename` different from `CacheBuffer->getBufferIdentifier()`?


================
Comment at: llvm/include/llvm/Support/VirtualFileSystem.h:1167
+    StatCacheWriter(StringRef BaseDir, bool IsCaseSensitive,
+                    uint64_t ValidityToken = 0);
+    ~StatCacheWriter();
----------------
Currently we are not adding the SDK base directory itself to the cache, which causes `status(BaseDir)` to fail.  I think we should add a status parameter to the constructor of `StatCacheWriter` to force it to be included.  This case also needs a test.


================
Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2816
+
+class StatCacheFileSystem::StatCacheLookupInfo {
+public:
----------------
Does this depend on other non-public code from this file, or could we split it into StatCacheFileSystem.cpp?  Obviously there is a lot of code here already, but would be nice to break the trend if it's easy to do so.


================
Comment at: llvm/lib/Support/VirtualFileSystem.cpp:2907
+//                             // whether the cache is up-to-date.
+//    char     BaseDir[N];     // Zero terminated path to the base directory
+//    < OnDiskHashtable Data > // Data for the has table. The keys are the
----------------
I think this comment is missing 'Version', which was added later.


================
Comment at: llvm/lib/Support/VirtualFileSystem.cpp:3050
+  // Move back to right after the Magic to insert BucketOffset
+  Out.seek(4);
+  Out.write((const char *)&BucketOffset, sizeof(BucketOffset));
----------------
Nit: could we use pwrite here and then change this method to use the more-general `raw_pwrite_stream` instead of being fd_ostream-specific?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136651/new/

https://reviews.llvm.org/D136651



More information about the cfe-commits mailing list