[llvm-dev] RFC: EfficiencySanitizer
Derek Bruening via llvm-dev
llvm-dev at lists.llvm.org
Mon Apr 18 11:02:46 PDT 2016
On Mon, Apr 18, 2016 at 1:36 PM, Craig, Ben via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On 4/17/2016 4:46 PM, Derek Bruening via llvm-dev wrote:
> *Cache fragmentation*: this tool gathers data structure field hotness
> information, looking for data layout optimization opportunities by grouping
> hot fields together to avoid data cache fragmentation. Future enhancements
> may add field affinity information if it can be computed with low enough
> overhead.
> I can vaguely imagine how this data would be acquired, but I'm
> more interested in what analysis is provided by the tool, and how this
> information would be presented to a user. Would it be a flat list of
> classes, sorted by number of accesses, with each field annotated by number
> of accesses? Or is there some other kind of presentation planned? Maybe
> some kind of weighting for classes with frequent cache misses?
The sorting/filtering metric will include disparity between fields: hot
fields interleaved with cold fields are what it's looking for, with a total
access count high enough to matter. Yes, it would present to the user the
field layout with annotations for access count as you suggest.
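To make the layout issue concrete, here is a toy sketch (not the tool's output; all type and field names are hypothetical) of the kind of disparity the tool would flag: two hot fields separated by cold data land on different cache lines, while grouping them lets both share one line.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record: two hot fields interleaved with cold blobs.
struct Interleaved {
  long hot_a;       // hot: touched every iteration
  char cold_a[56];  // cold: rarely touched
  long hot_b;       // hot
  char cold_b[56];  // cold
};

// Same record with hot fields grouped together.
struct Grouped {
  long hot_a;       // hot fields adjacent, sharing one cache line
  long hot_b;
  char cold_a[56];
  char cold_b[56];
};

// Distinct 64-byte cache lines spanned by two hot fields, computed
// from their byte offsets within the struct.
inline size_t hotLineCount(size_t offA, size_t offB) {
  return (offA / 64 == offB / 64) ? 1 : 2;
}
```

With these layouts, the hot fields of `Interleaved` sit at offsets 0 and 64 (two lines per object), while in `Grouped` they sit at offsets 0 and 8 (one line), so a hot loop over an array of `Grouped` touches half as many lines.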
> *Working set measurement*: this tool measures the data working set size of
> an application at each snapshot during execution. It can help to
> understand phased behavior as well as providing basic direction for further
> effort by the developer: e.g., knowing whether the working set is close to
> fitting in current L3 caches or is many times larger can help determine
> where to spend effort.
> I think my questions here are basically the reverse of my prior
> questions. I can imagine the presentation ( a graph with time on the X
> axis, working set measurement on the Y axis, with some markers highlighting
> key execution points). I'm not sure how the data collection works though,
> or even really what is being measured. Are you planning on counting the
> number of data bytes / data cache lines used during each time period? For
> the purposes of this tool, when is data brought into the working set and
> when is data evicted from the working set?
The tool records which data cache lines were touched at least once during a
snapshot (basically just setting a shadow memory bit for each load/store).
The metadata is cleared after each snapshot is recorded so that the next
snapshot starts with a blank slate. Snapshots can be combined via logical
OR as the execution time grows, to adaptively handle varying total execution
times.
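The scheme above can be sketched in a few lines. This is a simplified model, not the actual instrumentation (the class name, the toy 16-bit line space, and the fixed 64-byte line size are assumptions): one shadow bit per cache line, set on every load/store; a snapshot counts the set bits, OR-combines them into an accumulator, and clears the shadow for the next interval.

```cpp
#include <bitset>
#include <cstdint>

// Sketch of a shadow-bitmap working-set counter (hypothetical names).
class WorkingSet {
public:
  static constexpr size_t kLineBits = 6;     // 64-byte cache lines
  static constexpr size_t kLines = 1 << 16;  // toy address space

  // Called for each load/store: set the bit for the touched line.
  void onAccess(uintptr_t addr) { shadow.set((addr >> kLineBits) % kLines); }

  // Record a snapshot: return the number of lines touched during this
  // interval, OR the bits into a coarser accumulator, and clear the
  // shadow so the next snapshot starts with a blank slate.
  size_t takeSnapshot(std::bitset<kLines> &accum) {
    size_t touched = shadow.count();
    accum |= shadow;  // combine snapshots via logical OR
    shadow.reset();
    return touched;
  }

private:
  std::bitset<kLines> shadow;
};
```

Accesses to bytes 0 and 8 set the same line bit, so four accesses at addresses 0, 8, 64, and 128 yield a snapshot of 3 lines; a later re-access of line 0 counts again in the next interval, while the OR-combined accumulator still reports 3 distinct lines overall.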