<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Apr 18, 2016 at 1:36 PM, Craig, Ben via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class=""><br>
<div>On 4/17/2016 4:46 PM, Derek Bruening
via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
>> *Cache fragmentation*: this tool gathers data structure field hotness
>> information, looking for data layout optimization opportunities by
>> grouping hot fields together to avoid data cache fragmentation.
>> Future enhancements may add field affinity information if it can be
>> computed with low enough overhead.
>
> I can vaguely imagine how this data would be acquired, but I'm more
> interested in what analysis the tool provides and how that information
> would be presented to the user. Would it be a flat list of classes,
> sorted by number of accesses, with each field annotated by its access
> count? Or is some other kind of presentation planned? Maybe some kind
> of weighting for classes with frequent cache misses?

The sorting/filtering metric will include the disparity between fields:
hot fields interleaved with cold fields are what the tool looks for,
with a total access count high enough to matter. Yes, it would present
the field layout to the user with access-count annotations on each
field, as you suggest.
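
To make that concrete, here is a hypothetical example of the kind of
layout the report is meant to flag; the struct, field names, and access
counts below are invented for illustration:

  #include <cstdint>

  // A layout the report would flag: hot fields (high access counts in
  // the profile) interleaved with cold ones, so every 64-byte cache
  // line holding a Node is mostly cold bytes.
  struct Node {
    int key;                  // hot:  ~10M accesses
    char debug_name[48];      // cold: ~100 accesses
    Node* next;               // hot:  ~10M accesses
    std::int64_t create_time; // cold: ~100 accesses
  };

  // The regrouping the annotations point toward: hot fields first so
  // they share a cache line; cold fields pushed to the end (or split
  // out into a separate side structure).
  struct NodeRegrouped {
    int key;                  // hot
    NodeRegrouped* next;      // hot
    char debug_name[48];      // cold
    std::int64_t create_time; // cold
  };

With the hot fields adjacent, touching key pulls next into the same
cache line rather than 48 bytes of rarely-read debug data.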

>> *Working set measurement*: this tool measures the data working set
>> size of an application at each snapshot during execution. It can help
>> in understanding phased behavior as well as provide basic direction
>> for further effort by the developer: e.g., knowing whether the
>> working set is close to fitting in current L3 caches or is many times
>> larger can help determine where to spend effort.

> I think my questions here are basically the reverse of my prior
> questions. I can imagine the presentation (a graph with time on the X
> axis, working set size on the Y axis, with some markers highlighting
> key execution points). I'm not sure how the data collection works,
> though, or even really what is being measured. Are you planning on
> counting the number of data bytes / data cache lines used during each
> time period? For the purposes of this tool, when is data brought into
> the working set and when is data evicted from it?

The tool records which data cache lines were touched at least once
during a snapshot (basically just setting a shadow memory bit for each
load and store). The metadata is cleared after each snapshot is
recorded so that the next snapshot starts with a blank slate. Snapshots
can be combined via logical OR as execution time grows, to adaptively
handle varying total execution times.
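
A minimal sketch of that bookkeeping, assuming one shadow bit per
64-byte cache line and a small fixed-size table for brevity (a real
tool would use shadow memory covering the whole address space, with the
hook inserted by compiler instrumentation):

  #include <bitset>
  #include <cstddef>
  #include <cstdint>
  #include <cstdio>

  constexpr std::size_t kLineShift = 6;       // 64-byte cache lines
  constexpr std::size_t kNumLines  = 1 << 20; // toy table: covers 64 MiB

  // One shadow bit per cache line: touched at least once this snapshot.
  std::bitset<kNumLines> gShadow;

  // Instrumentation hook: called for every load and store.
  void OnMemoryAccess(std::uintptr_t addr) {
    gShadow.set((addr >> kLineShift) % kNumLines);
  }

  // Snapshot boundary: report the working set, fold the bits into an
  // accumulated snapshot via logical OR (so snapshots can be merged
  // adaptively as execution time grows), then clear for a blank slate.
  void TakeSnapshot(std::bitset<kNumLines>& accumulated) {
    std::printf("working set: %zu lines (%zu bytes)\n",
                gShadow.count(), gShadow.count() << kLineShift);
    accumulated |= gShadow;
    gShadow.reset();
  }

  int main() {
    std::bitset<kNumLines> accumulated;
    int x = 42;
    OnMemoryAccess(reinterpret_cast<std::uintptr_t>(&x));
    TakeSnapshot(accumulated);
  }

Comparing the reported byte count against a typical L3 size gives the
"fits in cache or is many times larger" signal described above.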