<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div dir="auto" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Hello,</div><div class=""><br class=""></div><div class="">Our goals for the code coverage BoF (10/19) were to find areas where we can improve the coverage tooling, and to learn more about how coverage is used. I'd like to thank all of the attendees for their input and for making the BoF productive. Special thanks to Mandeep Grang, who volunteered as a mic runner at the last minute.</div><div class=""><br class=""></div><div class="">In this email I'll share my (rough) notes and outline some future plans. Please feel free to ask for clarifications or to add your own notes.</div><div class=""><br class=""></div><div class="">Here are the slides from the BoF:</div><div class=""><a href="https://docs.google.com/presentation/d/e/2PACX-1vS-rV02j1zhPq9Y6AtcUkbZW2c7Q5YYuQ6FPxN-aYiKwrw6c8DU3zW_RYeJlWPMZ5-S6hgz_CIcL8Gd/pub?start=false&loop=false&delayms=3000&slide=id.p" class="">https://docs.google.com/presentation/d/e/2PACX-1vS-rV02j1zhPq9Y6AtcUkbZW2c7Q5YYuQ6FPxN-aYiKwrw6c8DU3zW_RYeJlWPMZ5-S6hgz_CIcL8Gd/pub?start=false&loop=false&delayms=3000&slide=id.p</a></div><div class=""><br class=""></div><div class="">1. The header problem</div><div class=""><br class=""></div><div class="">Coverage instrumentation overhead is roughly quadratic in the number of translation units in a project. The problem is that coverage mappings for template instantiations and static inline functions from headers are pulled into every TU. 
This bloats the profile metadata sections (which can slow down profile I/O), results in large binaries, and causes long link times (or link failures).</div><div class=""><br class=""></div><div class="">We could solve this problem by maintaining an external coverage database and discarding duplicate coverage mappings from the DB. Another idea is to emit coverage mappings to a side file and unique them when generating coverage reports. Both ideas require changes to the build workflow.</div><div class=""><br class=""></div><div class="">A third option is to emit named coverage mappings with linkonce_odr linkage (for languages with an ODR). This would be a format-breaking change but it wouldn't affect the build workflow. My plan is to try to evaluate this idea in the coming week.</div><div class=""><br class=""></div><div class="">2. HTML report quality</div><div class=""><br class=""></div><div class="">There seems to be widespread interest in improving the quality of coverage reports. We need volunteers to work on this and would love your help! Here are some desired features:</div><div class=""><br class=""></div><div class="">* Search and filtering for coverage summaries</div><div class="">* Collapsing parts of a coverage summary by subdirectory</div><div class="">* Automatically generating a top 10 list of code regions which need better coverage</div><div class="">* Searching via complex queries (e.g., 'give me uncovered regions in covered lines', or 'give me uncovered regions after a call')</div><div class="">* Generating coverage deltas between two profiles, and identifying coverage regressions in a patch/commit</div><div class="">* Simplified tracking of coverage trends over time</div><div class=""><br class=""></div><div class="">There is some consensus that this functionality should not be built on top of the existing llvm-cov C++ codebase. 
It might be better to develop these features in a language more amenable to rapid prototyping and interoperation with popular web application frameworks (perhaps Python). To facilitate this, llvm-cov gained support for exporting all of its data to JSON (see <a href="https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-cov/CoverageExporterJson.cpp" class="">CoverageExporterJson.cpp</a>). If you are interested in working on these features, I would be happy to work with you on design issues and on code review.</div><div class=""><br class=""></div><div class="">3. Optimizing profile counter placement</div><div class=""><br class=""></div><div class="">From Eli's notes:</div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class=""><div class="">I remember we also spent some time discussing the counter intrinsics, and whether we could produce a different set of intrinsics in the frontend, and produce the counters later in the pipeline to avoid duplicate counters. I didn't completely follow that discussion; I haven't spent much time looking at the counter intrinsics or how they're lowered. </div></div></blockquote><br class=""></div><div class=""><div class="">Just to recap: the frontend emits calls to the llvm.instrprof.increment intrinsic to implement counter updates. Each increment intrinsic is passed a function name and a counter index (there's a mapping between AST nodes and counter indices). The intrinsics are lowered in the InstrProfiling pass. During lowering, an array of uint64_t counters is created for each function, and the intrinsic calls are replaced by a load-add-store pattern.</div><div class=""><br class=""></div><div class="">Frontend counter updates can look highly redundant because of inlining. It's common to see single basic blocks with tens of distinct counter updates, most of which are redundant. 
One potential solution is to create a minimal set of profile counter updates after the inliner runs, and to map these counters back to AST nodes (<a href="https://bugs.llvm.org/show_bug.cgi?id=33500" class="">https://bugs.llvm.org/show_bug.cgi?id=33500</a>). This is the most promising approach we know of to cut down on counter updates, but I don't have a precise idea of how it would work. Here's a rough sketch of a solution:</div><div class=""><br class=""></div><div class="">* Have the frontend emit 'virtual' llvm.instrprof.increment intrinsics. These will eventually be discarded during lowering.</div><div class="">* Run an early inlining step, then run the IR PGO pass.</div><div class="">* In the lowering step, emit a section into the object which describes how to map the real counter updates to the virtual ones. I don't have a clear idea of how to build or encode this mapping.</div><div class="">* Teach llvm-profdata how to reconstruct an indexed profile which the frontend can understand (i.e., map the real counters back to the virtual ones). llvm-profdata would need to inspect the mapping section in the binary to accomplish this.</div></div><div class=""><br class=""></div><div class="">4. Optimizing profile counter updates</div><div class=""><br class=""></div><div class="">We had a few different suggestions to speed up profile counter updates:</div><div class=""><br class=""></div><div class="">* Make function counter arrays linkonce_odr when possible. This is similar to the solution from the first section ("The header problem"). I'll try to evaluate this idea in the coming week.</div><div class="">* Enable register promotion for counter updates which occur within loops. David Li has already done the work to enable this for IR PGO.</div><div class="">* Investigate the number of relocations emitted for counter updates. 
It might be cheaper to load the address of the function counter array once and index into it, instead of indexing into the global on each update.</div><div class="">* Use 32-bit counters. This would cut the size of the counters section in half and speed up profile I/O.</div><div class="">* Use 1-bit counters. This could be useful for those who are only interested in binary coverage. IMO there are other ideas we should try before compromising on report accuracy.</div><div class="">* Use saturating counters. IMO this isn't likely to be a win in common cases, but could increase compile time and code size.</div><div class=""><br class=""></div><div class="">5. Using coverage interactively while hacking on llvm</div><div class=""><br class=""></div><div class="">During the BoF I mentioned that it can be really useful to see coverage reporting interactively, as you're working on a patch. Here's a hacky way to do this:</div><div class=""><br class=""></div><div class="">* Build your code as you normally would (say, "ninja opt")</div><div class="">* Change the files you're interested in</div><div class="">* cd to your build directory and export CCC_OVERRIDE_OPTIONS="+-fcoverage-mapping +-fprofile-instr-generate=/tmp/opt_%m.profraw"</div><div class="">* Rebuild ("ninja opt" again). This will enable coverage instrumentation, but only for the files you've affected with your changes.</div><div class="">* Run a one-liner to generate a coverage report (<a href="http://clang.llvm.org/docs/SourceBasedCodeCoverage.html#creating-coverage-reports" class="">http://clang.llvm.org/docs/SourceBasedCodeCoverage.html#creating-coverage-reports</a>)</div><div class=""><br class=""></div><div class="">I like this approach because it means I don't have to maintain a separate, coverage-enabled build tree. It's an easy way to check that your patches have decent test coverage. 
If I want to disable coverage reporting I just need to unset CCC_OVERRIDE_OPTIONS and recompile.</div><div class=""><br class=""></div><div class="">6. C APIs for libCoverage</div><div class=""><br class=""></div><div class="">We didn't get a chance to discuss this in detail during the BoF, but I would like to upstream some C APIs to surface functionality from libCoverage. This will make it easier for IDEs and editors to display coverage information "in-line", right next to source code. Here's what that might look like:</div><div class=""><br class=""></div><div class=""><a href="https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/testing_with_xcode/chapters/07-code_coverage.html" class="">https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/testing_with_xcode/chapters/07-code_coverage.html</a></div><div class=""><br class=""></div><div class="">If anyone has concerns about adding in these APIs, please let me know!</div><div class=""><br class=""></div><div class="">7. Making use of debug info</div><div class=""><br class=""></div><div class="">From Eli's notes:</div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class=""><div class="">It seemed like we got a lot of questions related to why we aren't using debug info. :) It might be possible to come up with some sort of hybrid which trades off runtime overhead for lower resolution, without completely throwing away regions like gcov does. But it would be a big project, and the end result would still have a lot of the same problems as actual gcov in terms of the optimizer destroying necessary info.</div></div></blockquote><br class=""></div><div class="">To add to this: I think there are a lot of unanswered questions here. It's unclear how clang would decide to use debug info instead of regions, or how the different types of coverage counters would interact. 
I'm not very optimistic about this.</div><div class=""><br class=""></div><div class="">thanks,</div><div class="">vedant</div></div></body></html>