[PATCH] D138846: MC/DC in LLVM Source-Based Code Coverage: LLVM back-end and compiler-rt

Tue Nov 29 15:13:10 PST 2022

ellis added a comment.

In D138846#3958331 <https://reviews.llvm.org/D138846#3958331>, @alanphipps wrote:

> Hi @ellis, thank you for looking at this deeper!  I'll update based on your comments below, but I want to respond to your primary question first:
>
>> Please correct me if I'm wrong, but my understanding is that we have a single bitvector per expression (or function?) to track coverage. And that these bitvectors are at most 64 bits long. Is the number of these bitvectors per function known at compile time? I think yes because you are allocating the `IPSK_bitmap` section at compile time.
>
> Yes, a bitvector (or bitmap) is allocated per expression for each function, and this is known at compile time. Each one can be at minimum 8 bits and 64bits max, although the max value could be extended in the future.
>
>> If this is the case then I would suggest extending the `IPSK_cnts` rather than creating a new `IPSK_bitmap` section. There is lots of logic to track the location of the `IPSK_cnts` and correlate its values to the correct functions (this is the main feature of Debug Info Correlation). Instead, we could allocate more data for function counters and say the last `N` bytes are for the MC/DC bitvectors. The lowered code would know where to write the data because they are given the address and the raw profile readers would know when to expect this data at the end of the counters because they know `N`.
>
> I totally understand the desire to avoid modifying the raw header and the additional complexity, and so I'm glad you honed in on it so we could explore how best to simplify this.
>
> I did consider extending `IPSK_cnts`, and what you describe ought to work //functionally// just fine, but the reason I opted not to do that was because of considerations for efficient memory utilization at link time (I expect MC/DC will be getting the most utilization for embedded, where this is most important).  With the alignment of `IPSK_cnts` being driven by the size of the counters (up to 64bits, as you know, and for good reason), even 1 byte of additional MC/DC data at the end of a function's `IPSK_cnts` would yield large gaps during linker aggregation of these sections across functions due to alignment constraints.  With one of the goals behind MC/DC being to avoid unnecessary memory consumption, it didn't make sense philosophically to have anything related to //memory efficient// bitmap bits be impacted by the alignment and size of the counters, which, at least conceptually, is an altogether different thing.  Keeping the bitmaps in a separate, 1-byte aligned section `IPSK_bits` yields the most granularity and flexibility.  I tried to abstract a lot of the overlap for how the sections are created as much as possible.
>
> With respect to modifying the raw header, I agree there is reluctance to changing this, and if there is a way to derive the needed information without having to do this, I would definitely consider it.

Yes, putting everything in the `IPSK_cnts` section would lead to more size overhead. In that case, I'm fine with the new `IPSK_bitmap` section. Thanks for clarifying!

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138846/new/

https://reviews.llvm.org/D138846