[PATCH] D137534: [C++20] [Modules] [ClangScanDeps] Allow clang-scan-deps to without specified compilation database in P1689 (3/4)

Ben Boeckel via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jan 9 10:32:54 PST 2023


ben.boeckel added a comment.

In D137534#4037064 <https://reviews.llvm.org/D137534#4037064>, @jansvoboda11 wrote:

> Another thing to be aware of is that the scanner is tuned for scanning multiple TUs. Single `clang-scan-deps` invocation maintains a shared in-memory cache of the filesystem between its threads for all TUs it's given. This means invoking `clang-scan-deps` once for each TU is not as efficient as it could be. At Apple, we only use `clang-scan-deps` for the in-tree tests. In production, we actually wrap the C++ interface and expose a libclang API that is able to take advantage of caching to improve performance. To be honest, I'm surprised `clang-scan-deps` is being integrated into build systems as-is, especially without utilizing the cache. How's the performance looking for larger projects?

I originally investigated (and ended up lost trying to implement) something like GCC's approach, where P1689 information is extracted via `-E -fdep-file=p1689.json -fdep-output=module.o -fdep-format=p1689` (which is "abusing" `-E`, but works). Using `clang-scan-deps` is where we ended up after a more knowledgeable LLVM developer, who knew what needed to be done, took over.
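
For readers who have not seen the format, a P1689 file is a JSON document that lists, per rule, the output it describes and the modules that output provides and requires. Below is a minimal sketch of reading one, assuming the field names used in the P1689R5 paper (`rules`, `primary-output`, `provides`, `requires`, `logical-name`). The sample document and the `summarize` helper are hypothetical and are not output captured from GCC or `clang-scan-deps`.

```python
import json

# Hypothetical P1689-style document; field names follow the P1689R5 paper,
# the file and module names are invented for illustration.
SAMPLE = """
{
  "version": 1,
  "revision": 0,
  "rules": [
    {
      "primary-output": "module.o",
      "provides": [{"logical-name": "glom", "is-interface": true}],
      "requires": [{"logical-name": "frabnitz"}]
    }
  ]
}
"""


def summarize(text):
    """Map each rule's primary output to the modules it provides and requires."""
    doc = json.loads(text)
    summary = {}
    for rule in doc.get("rules", []):
        out = rule.get("primary-output")
        provides = [p["logical-name"] for p in rule.get("provides", [])]
        requires = [r["logical-name"] for r in rule.get("requires", [])]
        summary[out] = {"provides": provides, "requires": requires}
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A build system would use the `requires` lists to order compilations so that each module interface is built before its importers.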

>> - because it is a rule that can itself read extra files that affect the scanning; this is the `-MF`-style output so that make/ninja can know "oh, frabnitz.h changed, it can affect the scan results in glom.ddi, so I will rescan")
>
> I see. So the P1689 output is the primary scanner output, but you're also relying on emitting `.d` files to track the actual FS dependencies.

Yes, that's exactly it. It'd be great if *every* command had `-MF`-style information for more accurate builds, but I'll roll that rock up another hill some other day.
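
To make the role of the `-MF`-style output concrete, here is a rough sketch of how a build tool might use a Make-style depfile to decide whether a scan result is stale. This is not how make or ninja are implemented; `parse_depfile` and `needs_rescan` are hypothetical helpers, and the parser assumes a single rule whose paths contain no spaces or colons.

```python
import os


def parse_depfile(text):
    """Parse a single-rule Make-style depfile into (target, [dependencies])."""
    joined = text.replace("\\\n", " ")  # fold backslash-newline continuations
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


def needs_rescan(target, deps, mtime=os.path.getmtime):
    """Return True if the target is missing or any dependency is newer than it."""
    try:
        target_time = mtime(target)
    except OSError:
        return True  # no previous scan result on disk
    for dep in deps:
        try:
            if mtime(dep) > target_time:
                return True
        except OSError:
            return True  # a listed dependency vanished: rescan to be safe
    return False


if __name__ == "__main__":
    target, deps = parse_depfile("glom.ddi: glom.cpp \\\n  frabnitz.h\n")
    print(target, deps, needs_rescan(target, deps))
```

With the depfile above, a change to `frabnitz.h` makes it newer than `glom.ddi`, so the scan for that TU is rerun while other scan results are left alone.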

>> The object can be obtained from the `-o` on the command line, but the rest is "lying" if it is extracted from the clang command line and not given to `clang-scan-deps` directly.
>
> I see your point. But since `clang-scan-deps` is built around reusing the same FS cache for scanning multiple TUs, you'd need to specify these arguments (that we currently extract from Clang command line) for all of those TUs. That's not very convenient, neither through the command line nor via a separate config file.

While batch scanning is probably better for one-shot (basically, CI) builds, I suspect the excess work it does during development/incremental builds will cause it to "lose" over a long enough time span.



https://reviews.llvm.org/D137534


