[cfe-dev] LibTooling performance windows question

Radu Angelescu via cfe-dev cfe-dev at lists.llvm.org
Mon May 25 23:24:00 PDT 2020


Hello,
It seems the problem with multi-threading was that the tool class was
changing the working directory. After calling the appropriate
function, tool.setRestoreWorkingDir(false); (which actually has a comment
about this), it now works. (Replying to my own mail so anybody looking
into the same problem can see the fix.)
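For anyone searching later, a minimal sketch of where the call goes (this is not the full fixer: `db` and `sources` stand in for a real compilation database and file list, IncludeFinderAction is the action from my earlier mail below, and it must be linked against clangTooling):

```cpp
#include "clang/Tooling/Tooling.h"

#include <string>
#include <vector>

using namespace clang::tooling;

// `db` and `sources` are placeholders for your compilation database and
// file batch; IncludeFinderAction is the preprocessor-only action from
// the quoted mail below.
void runBatch(const CompilationDatabase &db,
              const std::vector<std::string> &sources) {
    ClangTool tool(db, sources);
    // By default the tool switches into each compile command's directory
    // and restores the old working directory afterwards -- a process-wide
    // side effect that races when tools run on several threads.
    // Disabling the restore fixed the errors for me.
    tool.setRestoreWorkingDir(false);
    tool.run(newFrontendActionFactory<IncludeFinderAction>().get());
}
```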
To make the parsing faster, I will probably investigate the
DependencyScanner and this clang example:
https://github.com/llvm-mirror/clang/blob/master/tools/clang-scan-deps/ClangScanDeps.cpp
.
If anybody has a simpler idea, feel free to let me know.
Attaching the video that put me on the right path (for anybody who may
be reading this email later):
https://www.youtube.com/watch?v=Ptr6e4CVTd4 .

Cheers,
Radu


On Sat, 23 May 2020 at 08:39, Radu Angelescu <
raduangelescu at raduangelescu.com> wrote:

> Hello,
>
> First, I want to congratulate the developers for a beautiful piece of
> software. Not only is it fast, it is also written beautifully, and the
> code is incredibly readable (it even has good comments :D ).
>
> The second point of this email: I am trying to write an automatic include
> fixer with LibTooling. Everything seems to work fine right now, but I am
> running it on a project where I need to parse around 2000 files (with
> deeply nested includes), and it takes a long time, so I am trying to make
> it faster.
> My current code uses only one preprocessor action (it is kind of simple).
> It uses only the InclusionDirective callback and will never need more
> (like the AST and such). When running this on a single thread, the fixer
> takes 20 minutes. To improve that time, I tried feeding the compilation
> files in batches to multiple threads:
>
>     // Guard against c_num_threads > file count, which would make
>     // batch_size zero and the loop below never advance.
>     size_t batch_size =
>         std::max<size_t>(1, allInterestingSources.size() / c_num_threads);
>     std::vector<std::vector<std::string>> batches;
>     for (size_t i = 0; i < allInterestingSources.size(); i += batch_size) {
>         auto last = std::min(allInterestingSources.size(), i + batch_size);
>         batches.emplace_back(allInterestingSources.begin() + i,
>                              allInterestingSources.begin() + last);
>     }
>     auto start = std::chrono::high_resolution_clock::now();
> #pragma omp parallel for num_threads(c_num_threads)
>     for (int i = 0; i < static_cast<int>(batches.size()); i++)
>     {
>         RefactoringTool tool(db, batches[i]);
>         tool.run(newFrontendActionFactory<IncludeFinderAction>().get());
>     }
> So basically I create a RefactoringTool for each batch and run it. The
> time for this looked promising (around 5 minutes), but: when
> c_num_threads is 1, everything works as it should, it just takes 20
> minutes. When I set it to something like 10-16 threads, the tool gives
> errors about opening files that are included from other files. (I think
> the tool opens some included files exclusively.)
>
> *Important info: I am using Windows, and some of the files I am parsing
> are read-only.*
>
> *My debugging attempt:* From what I saw in the Path.inc file, all files
> seem to be opened with the FILE_SHARE... attribute, but I don't know if
> I am missing some other implementation detail or something more deeply
> Windows-related.
>
> *My questions*:
> - Do you know why I am hitting this multithreaded file-open issue?
> - Are there any tips and tricks for making this even faster? (Maybe
> skipping some compiler steps, as I only need the preprocessor ones...
> maybe it already does that. I am a beginner with LibTooling and need
> some advice.)
>
> *Note:* I am using a freshly compiled version: git clone
> https://github.com/llvm/llvm-project.git
>
> Thanks,
> Radu
>
