[PATCH] D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage

Tue May 11 16:45:25 PDT 2021

vsk added a subscriber: alanphipps.
vsk added a comment.

> Just so I'm clear, are you saying that every Counter and Expression generated is guaranteed to be added to the coverage map (so there are no index gaps), even if the counter is part of an unreachable (and therefore eliminated) branch/block?

Yes, that's traditionally how we've done it. For example, in clang, we walk the function AST, prepare all the requisite counters & expressions, and emit an immutable coverage mapping into the llvm IR all before optimization begins:

  // CodeGenPGO::assignRegionCounters
  mapRegionCounters(D);
  if (CGM.getCodeGenOpts().CoverageMapping)
    emitCounterRegionMapping(D);

> And, if so, I assume it's OK for a dead branch/block to be eliminated (so the instrprof.increment call for that branch's counter is never added.) But even if the code to increment that counter is not in the code, the coverage map includes the counter. Right?

Yes, exactly. So long as a single instrprof.increment call within a function survives optimization, the profile will contain a full set of region counts for the function. If no instrprof.increment calls survive optimization, llvm behaves as if each counter referenced by the function's coverage mapping is 0.

> I recently added support for reporting code regions for dead blocks, but I am replacing them with Zero counters. If I'm understanding you correctly, then I may be able to just retain the original counter id and/or expression and add them to the map instead of converting them to Zero counters, and therefore meet your expectations.

That sounds right. The expectation is that either the profile will contain a 0 count for the dead region/block, or (if the profile record is missing) llvm will infer that the entire function is dead. But this doesn't require any changes to the coverage mapping itself.

> And the assertion error message needs to explain the problem (something like, "non-contiguous counter IDs are not allowed")

Right, I might suggest `assert(getMaxCounterID(...) < Record.MappingRegions.size() && "Didn't expect to see more regions-with-counters than regions")`.

> Also, I want to confirm that it is OK to have counters that are incremented (via instrprof.increment intrinsic) that are not associated with a code region.

Ah, interesting! I think that is OK. However it does mean that the assert written above is broken, and that we'd want llvm to become more permissive (along the line of @pirama's patch).

(An aside: maybe this is only of historical interest, but I suspect it's somewhat of an accident that this works at all. As far as I know, it was always assumed that each counter was associated to a specific ::CodeRegion.)

I don't think dispensing with the assert is a big loss. The alternative would be to have the FE create a special CounterMappingRegion::EdgeRegion every time it creates a counter that otherwise wouldn't have a region; this would be wasteful unless the extra regions would actually be used.

> I'm not sure what clang or swiftc does, but in the Rust FE, I sometimes need to count edges between nodes in the CFG (branches taken, between two blocks) since the target block can be executed from multiple incoming edges

Clang recently picked up branch coverage support via (among other format changes) CoverageMappingRegion::BranchRegion. @alanphipps is the expert on this. Prior to Alan's change, clang simply didn't support branch/edge coverage.

> Edge counters exist so they can be operands in expressions, and since they don't have a code region, they would not be referenced directly by a code region in a report; but the expression might, and that's we we need them.

Thanks for sharing the example. For context, in clang, when there's a block of code that has multiple entry points, we either assign a fresh counter to that block or come up with a workable counter expression purely in terms of other block counts. That's not to say it's not useful or valid to use edge counters instead, though.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101780/new/

https://reviews.llvm.org/D101780