[llvm-dev] RFC: EfficiencySanitizer

Derek Bruening via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 18 11:02:46 PDT 2016

On Mon, Apr 18, 2016 at 1:36 PM, Craig, Ben via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> On 4/17/2016 4:46 PM, Derek Bruening via llvm-dev wrote:
> *Cache fragmentation*: this tool gathers data structure field hotness
> information, looking for data layout optimization opportunities by grouping
> hot fields together to avoid data cache fragmentation.  Future enhancements
> may add field affinity information if it can be computed with low enough
> overhead.
> I can vaguely imagine how this data would be acquired, but I'm
> more interested in what analysis is provided by the tool, and how this
> information would be presented to a user.  Would it be a flat list of
> classes, sorted by number of accesses, with each field annotated by number
> of accesses?  Or is there some other kind of presentation planned?  Maybe
> some kind of weighting for classes with frequent cache misses?

The sorting/filtering metric will include disparity between fields: hot
fields interleaved with cold fields are what it's looking for, with a total
access count high enough to matter.  Yes, it would present to the user the
field layout with annotations for access count as you suggest.
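
As a rough illustration (not the tool's actual metric or output format), a
disparity score of that kind could be computed roughly like this, where the
function name and the 10% hotness cutoff are purely hypothetical:

```python
# Hypothetical sketch of a hot/cold disparity metric for one struct's
# fields, in declaration order.  A struct scores high when hot fields
# are interleaved with cold ones and the total access count is large
# enough to matter.

def fragmentation_score(field_counts):
    """field_counts: per-field access counts in layout order."""
    total = sum(field_counts)
    if total == 0:
        return 0
    cutoff = 0.1 * max(field_counts)   # "hot" = within 10x of the hottest field
    hot = [c >= cutoff for c in field_counts]
    # Each hot<->cold transition in layout order is a fragmentation point.
    transitions = sum(1 for a, b in zip(hot, hot[1:]) if a != b)
    return transitions * total
```

Under that scoring, an interleaved layout such as [1000, 1, 1000, 1] ranks
above a well-grouped [1000, 1000, 1, 1] even though the totals are identical.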

> *Working set measurement*: this tool measures the data working set size of
> an application at each snapshot during execution.  It can help to
> understand phased behavior as well as providing basic direction for further
> effort by the developer: e.g., knowing whether the working set is close to
> fitting in current L3 caches or is many times larger can help determine
> where to spend effort.
> I think my questions here are basically the reverse of my prior
> questions.  I can imagine the presentation ( a graph with time on the X
> axis, working set measurement on the Y axis, with some markers highlighting
> key execution points).  I'm not sure how the data collection works though,
> or even really what is being measured.  Are you planning on counting the
> number of data bytes / data cache lines used during each time period?  For
> the purposes of this tool, when is data brought into the working set and
> when is data evicted from the working set?

The tool records which data cache lines were touched at least once during a
snapshot (basically just setting a shadow memory bit for each load/store).
The metadata is cleared after each snapshot is recorded so that the next
snapshot starts with a blank slate.  Snapshots can be combined via logical
OR as the execution time grows, to adaptively handle varying total execution
time.
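
In sketch form (the names here are assumed, and one bit per 64-byte cache
line is an illustrative granularity, not necessarily the tool's):

```python
# Minimal sketch of the shadow-bit scheme described above: mark each
# cache line on every load/store, count and clear the marks at each
# snapshot, and merge snapshots with a logical OR as intervals grow.

LINE = 64  # assumed cache-line size in bytes

class WorkingSetShadow:
    def __init__(self):
        self.bits = set()              # indices of touched cache lines

    def touch(self, addr):
        """Called (conceptually) on every load/store."""
        self.bits.add(addr // LINE)

    def snapshot(self):
        """Return (working-set size in lines, the marks), then reset."""
        size = len(self.bits)
        marks = frozenset(self.bits)
        self.bits.clear()              # blank slate for the next period
        return size, marks

def merge(a, b):
    """Combine two snapshots via logical OR (set union of marks)."""
    return a | b
```

Two adjacent snapshots merged this way give the working set of the combined
interval, which is what lets the sampling period coarsen adaptively.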