[PATCH] D30691: [analyzer] Support for naive cross translational unit analysis

Thu Jun 15 11:19:28 PDT 2017

zaks.anna added a comment.

> -(Anna) Scan-build-py integration of the functionality is nearly finished (see https://github.com/rizsotto/scan-build/issues/83) (--ctu switch performs both analysis phases at once). This I think could go in a different patch, but until we could keep the ctu-build.py and ctu-analyze.py scripts. Do you agree?

It's important to bring this patch into the LLVM repo so that it becomes part of the clang/llvm project and is used. The whole point of adding CTU integration to scan-build-py is to make sure that there is a single tool that all/most users could use; adding the patch to a fork does not accomplish that goal. Also, I am not a fan of developing on downstream branches and that is against the LLVM Developer policy due to all the reasons described here: http://www.llvm.org/docs/DeveloperPolicy.html#incremental-development. This development style leads to fragmentation of the community and the project. Unfortunately, we often see cases where large patches developed out of tree never make it in as a result of not following this policy and it would be great to avoid this in the future.

> This I think could go in a different patch, but until we could keep the ctu-build.py and ctu-analyze.py scripts. Do you agree?

It would be best to just add the scan-build-py support to the tree, especially, since the new scrips are not tested.

> -(Anna) Dumping the ASTs to the disk. We tried a version where, ASTs are not dumped in the 1st phase, but is recreated each time a function definition is needed from an external TU. It works fine, but the analysis time went up by 20-30% on open source C projects.

I am curious which optimization strategies you considered. An idea that @NoQ came up with was to serialize only the most used translation units. Another idea is to choose the TUs that a particular file has most dependencies on and only inline functions from those.

What mode would you prefer? Would you pay the 20%-30% in speed but reduce the huge disk consumption? That might be a good option, especially, if you have not exhausted the ideas on how to optimize.

> Is it OK to add this functionality in a next patch? Or should we it as an optional feature right now?

This depends on what the plan for going forward is. Specifically, if we do not need the serialization mode, you could remove that from this patch and add the new mode. If you think the serialization mode is essential going forward, we could have the other mode in a separate patch. (It would be useful to split out the serialization mode work into a separate patch so that we could revert it later on if we see that the other mode is better.)

I see some changes to the compiler proper, such as ASTImporter, ASTContext, SourceManager. We should split those changes into a separate patch and ask for review from people working on those components. You can point back to this patch, which would contain the analyzer related changes, and state that the other patch is blocking this work.

https://reviews.llvm.org/D30691