<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Apr 18, 2016 at 1:36 PM, Craig, Ben via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class=""><br>
<div>On 4/17/2016 4:46 PM, Derek Bruening
via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
>> *Cache fragmentation*: this tool gathers data structure field hotness
>> information, looking for data layout optimization opportunities by
>> grouping hot fields together to avoid data cache fragmentation.
>> Future enhancements may add field affinity information if it can be
>> computed with low enough overhead.
>
> I can vaguely imagine how this data would be acquired, but I'm more
> interested in what analysis the tool provides and how that information
> would be presented to the user. Would it be a flat list of classes,
> sorted by number of accesses, with each field annotated by its access
> count? Or is some other kind of presentation planned? Maybe some kind
> of weighting for classes with frequent cache misses?

The sorting/filtering metric will include the disparity between fields:
hot fields interleaved with cold fields are what the tool looks for,
with a total access count high enough to matter. Yes, it would present
the field layout to the user with access-count annotations on each
field, as you suggest.
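
To make that concrete, here is a hypothetical example of the kind of
layout the report is meant to flag; the struct, field names, and access
counts below are invented for illustration:

  #include <cstdint>

  // A layout the report would flag: hot fields (high access counts in
  // the profile) interleaved with cold ones, so every 64-byte cache
  // line holding a Node is mostly cold bytes.
  struct Node {
    int key;                  // hot:  ~10M accesses
    char debug_name[48];      // cold: ~100 accesses
    Node* next;               // hot:  ~10M accesses
    std::int64_t create_time; // cold: ~100 accesses
  };

  // The regrouping the annotations point toward: hot fields first so
  // they share a cache line; cold fields pushed to the end (or split
  // out into a separate side structure).
  struct NodeRegrouped {
    int key;                  // hot
    NodeRegrouped* next;      // hot
    char debug_name[48];      // cold
    std::int64_t create_time; // cold
  };

With the hot fields adjacent, touching key pulls next into the same
cache line rather than 48 bytes of rarely-read debug data.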

>> *Working set measurement*: this tool measures the data working set
>> size of an application at each snapshot during execution. It can help
>> in understanding phased behavior as well as provide basic direction
>> for further effort by the developer: e.g., knowing whether the
>> working set is close to fitting in current L3 caches or is many times
>> larger can help determine where to spend effort.

> I think my questions here are basically the reverse of my prior
> questions. I can imagine the presentation (a graph with time on the X
> axis, working set size on the Y axis, with some markers highlighting
> key execution points). I'm not sure how the data collection works,
> though, or even really what is being measured. Are you planning on
> counting the number of data bytes / data cache lines used during each
> time period? For the purposes of this tool, when is data brought into
> the working set and when is data evicted from it?

The tool records which data cache lines were touched at least once
during a snapshot (basically just setting a shadow memory bit for each
load and store). The metadata is cleared after each snapshot is
recorded so that the next snapshot starts with a blank slate. Snapshots
can be combined via logical OR as execution time grows, to adaptively
handle varying total execution times.
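
A minimal sketch of that bookkeeping, assuming one shadow bit per
64-byte cache line and a small fixed-size table for brevity (a real
tool would use shadow memory covering the whole address space, with the
hook inserted by compiler instrumentation):

  #include <bitset>
  #include <cstddef>
  #include <cstdint>
  #include <cstdio>

  constexpr std::size_t kLineShift = 6;       // 64-byte cache lines
  constexpr std::size_t kNumLines  = 1 << 20; // toy table: covers 64 MiB

  // One shadow bit per cache line: touched at least once this snapshot.
  std::bitset<kNumLines> gShadow;

  // Instrumentation hook: called for every load and store.
  void OnMemoryAccess(std::uintptr_t addr) {
    gShadow.set((addr >> kLineShift) % kNumLines);
  }

  // Snapshot boundary: report the working set, fold the bits into an
  // accumulated snapshot via logical OR (so snapshots can be merged
  // adaptively as execution time grows), then clear for a blank slate.
  void TakeSnapshot(std::bitset<kNumLines>& accumulated) {
    std::printf("working set: %zu lines (%zu bytes)\n",
                gShadow.count(), gShadow.count() << kLineShift);
    accumulated |= gShadow;
    gShadow.reset();
  }

  int main() {
    std::bitset<kNumLines> accumulated;
    int x = 42;
    OnMemoryAccess(reinterpret_cast<std::uintptr_t>(&x));
    TakeSnapshot(accumulated);
  }

Comparing the reported byte count against a typical L3 size gives the
"fits in cache or is many times larger" signal described above.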