[PATCH] D121712: [clangd] Track time spent in filesystem ops during preamble builds

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 16 04:31:59 PDT 2022


kadircet added a comment.

Thanks, LG in general. I suppose the extra timing overhead should be negligible, considering the preamble build times but would be nice to see some numbers if you have any.
Apart from that I've raised the concern around multiple preamble builds finishing in parallel and poisoning the first_build metric, wanted to raise it here again to make sure it isn't lost in details.



================
Comment at: clang-tools-extra/clangd/FS.cpp:128
+  TimerFile(std::shared_ptr<WallTimer> Timer, std::unique_ptr<File> InnerFile)
+      : Timer(Timer), InnerFile(std::move(InnerFile)) {}
+
----------------
nit: Timer(std::move(Timer))


================
Comment at: clang-tools-extra/clangd/FS.cpp:141
+  }
+  std::error_code close() override { return InnerFile->close(); }
+
----------------
i guess this is negligible but, timing `close` operations too shouldn't hurt?


================
Comment at: clang-tools-extra/clangd/FS.h:16
 #include "llvm/ADT/StringMap.h"
+#include "llvm/Support/Timer.h"
 #include "llvm/Support/VirtualFileSystem.h"
----------------
i don't think we need this one


================
Comment at: clang-tools-extra/clangd/FS.h:71
 
+/// Used for tracking time spent in FS operations. Like llvm::Timer, but only
+/// tracks wall time, which is much cheaper.
----------------
I don't think mentioning `FS` here is worth it. Can we rather emphasize on this being an accumulating timer?


================
Comment at: clang-tools-extra/clangd/FS.h:71
 
+/// Used for tracking time spent in FS operations. Like llvm::Timer, but only
+/// tracks wall time, which is much cheaper.
----------------
kadircet wrote:
> I don't think mentioning `FS` here is worth it. Can we rather emphasize on this being an accumulating timer?
what about making all of this an implementation detail of `Preamble.cpp` ? we can lift it up once we have other places that we want to make use of.


================
Comment at: clang-tools-extra/clangd/FS.h:73
+/// tracks wall time, which is much cheaper.
+/// Since this is a generic timer, We may want to move this to support/ if we
+/// find a use case outside of FS time tracking.
----------------
maybe put a `FIXME:`


================
Comment at: clang-tools-extra/clangd/FS.h:78
+  WallTimer();
+  // [Re-]Start the timer.
+  void startTime();
----------------
nit: we use `///` for public comments.


================
Comment at: clang-tools-extra/clangd/FS.h:82
+  void stopTime();
+  // Return total time, in seconds.
+  double getTime();
----------------
what about returning an `int` with `ms` granularity ?


================
Comment at: clang-tools-extra/clangd/FS.h:86
+private:
+  std::chrono::steady_clock::duration TotalTime;
+  std::chrono::steady_clock::time_point StartTime;
----------------
what about just storing double/int ?


================
Comment at: clang-tools-extra/clangd/FS.h:91
+IntrusiveRefCntPtr<llvm::vfs::FileSystem>
+getTimeTrackingFS(std::shared_ptr<WallTimer> Timer,
+                  IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS);
----------------
i feel like this deserves the `FS` specific comment mentioned above. maybe something like:
```
This will record all time spent on IO operations in \p Timer.
```


================
Comment at: clang-tools-extra/clangd/FS.h:91
+IntrusiveRefCntPtr<llvm::vfs::FileSystem>
+getTimeTrackingFS(std::shared_ptr<WallTimer> Timer,
+                  IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS);
----------------
kadircet wrote:
> i feel like this deserves the `FS` specific comment mentioned above. maybe something like:
> ```
> This will record all time spent on IO operations in \p Timer.
> ```
i don't think we ever want concurrent access to Timer, i.e. startTime should never be called without a matching call to stopTime first. passing it in as a shared_ptr somehow gives the feeling that it might be shared across multiple objects, which might do whatever they want with the object.
maybe just pass in a reference ?


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:379
   auto StatCache = std::make_unique<PreambleFileStatusCache>(AbsFileName);
+  auto FSTimer = std::make_shared<WallTimer>();
+  auto TimedFS = getTimeTrackingFS(FSTimer, StatCache->getProducingFS(VFS));
----------------
what about only conditionally doing these when `Stats` is not null?


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:382
+
+  llvm::Timer PreambleTimer;
+  PreambleTimer.startTimer();
----------------
why use an expensive `llvm::Timer` if all we care is walltime?


================
Comment at: clang-tools-extra/clangd/Preamble.cpp:385
   auto BuiltPreamble = PrecompiledPreamble::Build(
-      CI, ContentsBuffer.get(), Bounds, *PreambleDiagsEngine,
-      StatCache->getProducingFS(VFS),
+      CI, ContentsBuffer.get(), Bounds, *PreambleDiagsEngine, TimedFS,
       std::make_shared<PCHContainerOperations>(), StoreInMemory, CapturedInfo);
----------------
it might be worth leaving some comments around this `TimedFS` being exposed to outside through PreambleCallbacks. We provide access to Preprocessor, which keeps the underlying FileManager alive.
Today all of this happens in serial hence all is fine, but we were discussing the possibility of performing the indexing and preamble serialization in parallel, it might result in some surprising race conditions if we do that.


================
Comment at: clang-tools-extra/clangd/Preamble.h:82
+struct PreambleBuildStats {
+  // Total wall time it took to build preamble, in seconds.
+  double TotalBuildTime;
----------------
nit: again `///` (i see that fields of PreambleData doesn't follow the style here, but it's the outlier)


================
Comment at: clang-tools-extra/clangd/Preamble.h:95
               const ParseInputs &Inputs, bool StoreInMemory,
+              PreambleBuildStats *Stats,
               PreambleParsedCallback PreambleCallback);
----------------
nit: maybe make this the last parameter and default to `nullptr` to get rid of changes in tests.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:100
 
+// Tracks latency (in seconds) of FS operations done during a preamble build.
+// build_type allows to split by expected VFS cache state (cold on first
----------------
can we move all of these into an anonymous namespace instead?


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:109
+// preamble build time. build_type is same as above.
+constexpr trace::Metric PreambleBuildFilesystemLatencyRatio(
+    "preamble_fs_latency_ratio", trace::Metric::Distribution, "build_type");
----------------
what about just tracking the total build time here? we can get the ratio afterwards.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:113
+static void reportPreambleBuild(const PreambleBuildStats &Stats,
+                                const bool IsFirstPreamble) {
+  // The first preamble we build in a clangd instance gets a different metric
----------------
nit: llvm style doesn't really use `const` if the parameter is being copied anyway


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:121
+                  : (IsFirstPreamble ? "first_build_for_file" : "rebuild");
+  FirstReport = false;
+
----------------
this can still be called in parallel as we have multiple preamble threads. moreover this also makes "first_build" tracking somewhat tricky. do you think this metric would still be useful in the presence of such noise? e.g. a respawning clangd instance might receive addDocument requests for all the previously open files and the first preamble to build might've benefited a lot from cached file IO that was done by other preamble builds.

i suppose we can have something like following to at least address multiple preamble threads accessing `FirstReport` issue.
```
static bool ReportedFirstBuild = [&]{
  PreambleBuildFilesystemLatency.record(Stats.FileSystemTime, "first_build"); // I think it's fine to report twice here, as we've both performed a first build for a file, and it was the first build of clangd instance.
  return true;
}();
llvm::StringLiteral Label = IsFirstPreamble ? "first_build_for_file" : "rebuild";
PreambleBuildFilesystemLatency.record(Stats.FileSystemTime, Label);
if (Stats.TotalBuildTime > 0) // Avoid division by zero.
    PreambleBuildFilesystemLatencyRatio.record(
        Stats.FileSystemTime / Stats.TotalBuildTime, Label);
```


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1008
+  PreambleBuildStats Stats;
+  const bool IsFirstPreamble = !LatestBuild;
   LatestBuild = clang::clangd::buildPreamble(
----------------
nit: drop the const per style.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1015
       });
   if (LatestBuild && isReliable(LatestBuild->CompileCommand))
     HeaderIncluders.update(FileName, LatestBuild->Includes.allHeaders());
----------------
nit:
```
if(!LatestBuild)
  return;
reportPreambleBuild(Stats, IsFirstPreamble);
if(isReliable(LatestBuild->CompileCommand))
  HeaderIncluders.update(FileName, LatestBuild->Includes.allHeaders());
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121712/new/

https://reviews.llvm.org/D121712



More information about the cfe-commits mailing list