[clangd-dev] [cfe-dev] RFC: prototype of clang-scan-deps, faster dependency scanning tool for explicit modules and clangd

Thu Nov 8 02:13:24 PST 2018

+clangd-dev at lists.llvm.org <clangd-dev at lists.llvm.org>

On Tue, Nov 6, 2018, 19:23 David Blaikie via cfe-dev <cfe-dev at lists.llvm.org>
wrote:

> Thanks for sending this out!
>
> Yeah, I'm super interested in how (future standard) C++ modules will
> interact with build systems, as it's unlikely to be feasible to use an
> implicit compilation model, in part because of the code generation/linkage
> requirements. You could put everything from a C++ module definition in
> comdats, etc., as is done for headers today (rather than in separate object
> files), but that's not really how it's meant to work.
>
> All models boil down to something like this: the build system has some
> explicit knowledge (through library dependencies within a project) and
> has to do some discovery before executing any compilation steps. That
> discovery serves two purposes:
>
>    - Finding external dependencies: the standard library (if/once
>    modularized and used as such) and other external libraries written
>    using modules.
>    - Reducing internal dependencies: not all code in one library depends
>    on all the libraries that library depends on, so by discovering the
>    specific modular imports used in a given module, that module may be
>    able to be built sooner (when only some of its library's dependencies
>    have been built, because it only needs that subset).
>
> Then, ideally, the build system passes around the compilation
> inputs/outputs rather than relying on the compiler to discover them itself
> in a cache directory or the like.
>
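
To make the "built sooner" point concrete, here is a minimal sketch of how
discovered imports could drive scheduling. It is not taken from the prototype:
the module names and the build_waves helper are hypothetical, and it assumes
Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter

def build_waves(imports):
    """Group modules into waves; a wave only needs the waves before it.

    `imports` maps each module name to the set of modules it imports.
    """
    sorter = TopologicalSorter(imports)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything buildable right now
        waves.append(ready)
        sorter.done(*ready)
    return waves

# Hypothetical project: library "app" links against both "net" and "ui",
# but the scan found that the module app.log only imports "net".
discovered_imports = {
    "gfx": set(),
    "net": set(),
    "ui": {"gfx"},
    "app.log": {"net"},
    "app.main": {"net", "ui"},
}

for number, wave in enumerate(build_waves(discovered_imports), start=1):
    print(number, wave)
```

With library-level knowledge alone, app.log would wait for "ui"; with the
per-module imports it lands in the same wave as "ui" instead of after it.
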
> You mentioned a few performance metrics:
>
>    - Up to 10x speedup in non-modular dependency scanning: what do you mean
>    by non-modular dependency scanning? (What's the non-modular part in
>    contrast to?)
>    - 4x when run on the first 1000 files in Clang's compilation database,
>    compared to clang -Eonly: so this is running the whole tool, including
>    generating the trimmed preprocessed files and then reading those to
>    discover the header module dependencies, compared to running -Eonly and
>    then scanning those files? And the output is currently in what form?
>    .d-like files?
>
> You mention relying on the compilation database for discovering the files
> to run over - is this the long term goal/design, or a current stepping
> stone? I was about to say that seems circular (thinking that the
> compiler/compilation phase generates the compilation database) but then
> realized/remembered that it's the build system that generates that, not the
> compiler, so you can have/use/run over the compilation database before
> compilation has begun. Sounds good. So the build system would have to have
> a phase that runs after generating the compilation database that runs this
> tool, then adds the module compilations produced by this tool to the list
> of commands it will execute (& probably also adds them back into the
> compilation database, too, really).
>
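
As a rough illustration of the "build system phase" described above, the
sketch below reads a JSON compilation database and picks the first N entries
to hand to a scanner. It only relies on the documented "directory", "file",
"command" and "arguments" keys of the JSON format; the load_commands helper
and the sample entries are made up for the example, and nothing here reflects
the prototype's actual interface.

```python
import json
import shlex

def load_commands(database_text, limit=None):
    """Return (directory, file, argv) tuples from a JSON compilation database.

    Entries may carry either an "arguments" list or a single "command"
    string; the string form is split with shell-like rules.
    """
    entries = json.loads(database_text)
    jobs = []
    for entry in entries[:limit]:
        argv = entry.get("arguments") or shlex.split(entry["command"])
        jobs.append((entry["directory"], entry["file"], argv))
    return jobs

# Hypothetical two-entry database, one entry per supported form.
sample_database = json.dumps([
    {"directory": "/build", "file": "a.cpp",
     "command": "clang++ -Iinclude -c a.cpp"},
    {"directory": "/build", "file": "b.cpp",
     "arguments": ["clang++", "-c", "b.cpp"]},
])

for directory, source, argv in load_commands(sample_database, limit=1000):
    print(directory, source, argv)
```

Because the database is written by the build system before any compilation
starts, a step like this can run ahead of the real compile commands.
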
> So, as you mentioned (maybe in the phab review), the format of the output
> of this tool is still unknown, but the input is currently a compilation
> database (the classic JSON one, I assume, though if the tool uses the
> compilation database access APIs, other sources implemented in that API
> could be used). Cool cool.
>
> Thanks again!
>
> - Dave
>
> On Tue, Oct 16, 2018 at 6:53 PM Alex L <arphaman at gmail.com> wrote:
>
>> Hi,
>>
>>
>> Bruno (CCed), Duncan (CCed) and I have been exploring whether we can migrate
>> some of our clients to explicit modules. As part of this work Duncan and I
>> developed a new prototype dependency scanning service tool
>> (clang-scan-deps) that computes the set of file dependencies for a
>> particular compiler invocation using some optimizations that are outlined
>> below. This tool makes the non-modular dependency scanning up to 10 times
>> faster for particular workloads (e.g. llc target, 1542 C++ files) on one of
>> our machines, when compared to parallel invocations of clang with -Eonly.
>> We are still in the early stages of proper modules support, but our initial
>> crude prototype can get up to a 4x speedup when run on the first 1000 files
>> from clang’s compilation database for a build of LLVM with modules turned on.
>>
>>
>> We still run the full Clang preprocessor. Here’s what we do to reduce its
>> workload:
>>
>>    - Minimize sources by stripping away unused tokens. We keep only the
>>    interesting PP directives (#define, #if, #include, etc.), i.e. those that
>>    might impact the set of dependencies.
>>    - Assume the filesystem is immutable for one run of the service, and
>>    cache the files and their minimized contents in memory in a global cache.
>>    - Skip over excluded preprocessor ranges by bumping up the buffer
>>    pointer in the lexer instead of lexing the skipped tokens.
>>
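
The first two bullets can be sketched in a few lines. This is a line-based
toy, not the prototype's minimizer (which works on tokens inside Clang): the
minimize function, the MinimizedFileCache class and the directive list are
assumptions made for illustration, and the toy does not handle directives
hidden in comments or raw strings.

```python
import re

# Directives that can affect which files get included (assumed list).
DIRECTIVE = re.compile(
    r"^\s*#\s*(define|undef|if|ifdef|ifndef|elif|else|endif|"
    r"include|include_next|import|pragma)\b"
)

def minimize(source):
    """Keep only preprocessor directive lines and their continuations."""
    kept = []
    continuing = False
    for line in source.splitlines():
        if continuing or DIRECTIVE.match(line):
            kept.append(line)
            # A trailing backslash extends the directive onto the next line.
            continuing = line.rstrip().endswith("\\")
    return "\n".join(kept)

class MinimizedFileCache:
    """In-memory cache that assumes files do not change during one run."""

    def __init__(self, read_file):
        self._read_file = read_file  # callable: path -> file contents
        self._minimized = {}

    def get(self, path):
        if path not in self._minimized:
            self._minimized[path] = minimize(self._read_file(path))
        return self._minimized[path]
```

Each file is read and minimized once per run; every later translation unit
that includes it reuses the cached minimized text.
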
>>
>> We intend to upstream this service in the upcoming months. We also would
>> like to integrate this service into Clangd as part of our migration to
>> Clangd to help us determine a good compilation command for a header file
>> from a set of known compilation invocations.
>>
>>
>> I posted a very rough WIP patch on Phabricator (
>> https://reviews.llvm.org/D53354). It’s based on LLVM checkout r343343.
>> Please take a look if you’re interested.
>>
>> Duncan, Bruno and I will be at the LLVM dev meeting. We are interested in
>> discussing this prototype and collecting feedback from anyone who might be
>> interested in this work.
>>
>>
>> Thanks,
>>
>> Alex
>>