[llvm-dev] RFC: EfficiencySanitizer

Craig, Ben via llvm-dev llvm-dev at lists.llvm.org
Mon Apr 18 10:36:27 PDT 2016

This sounds interesting.  I've got a couple of questions about the cache 
fragmentation tool and the working set measurement tool.

On 4/17/2016 4:46 PM, Derek Bruening via llvm-dev wrote:
> TL;DR: We plan to build a suite of compiler-based dynamic 
> instrumentation tools for analyzing targeted performance problems.  
> These tools will all live under a new "EfficiencySanitizer" (or 
> "esan") sanitizer umbrella, as they will share significant portions of 
> their implementations.
> ====================
> Motivation
> ====================
> Our goal is to build a suite of dynamic instrumentation tools for 
> analyzing particular performance problems that are difficult to 
> evaluate using other profiling methods.  Modern hardware performance 
> counters provide insight into where time is spent and when 
> micro-architectural events such as cache misses are occurring, but 
> they are of limited effectiveness for contextual analysis: it is not 
> easy to answer *why* a cache miss occurred.
> Examples of tools that we have planned include: identifying wasted or 
> redundant computation, identifying cache fragmentation, and measuring 
> working sets.  See more details on these below.
> ====================
> Approach
> ====================
> We believe that tools with overhead beyond about 5x are simply too 
> heavyweight to easily apply to large, industrial-sized applications 
> running real-world workloads. Our goal is for our tools to gather 
> useful information with overhead less than 5x, and ideally closer to 
> 3x, to facilitate deployment.  We would prefer to trade off accuracy 
> and build a less-accurate tool below our overhead ceiling than to 
> build a high-accuracy but slow tool.  We hope to hit a sweet spot of 
> tools that gather trace-based contextual information not feasible with 
> pure sampling yet are still practical to deploy.
> In a similar vein, we would prefer a targeted tool that analyzes one 
> particular aspect of performance with low overhead than a more general 
> tool that can answer more questions but has high overhead.
> Dynamic binary instrumentation is one option for these types of tools, 
> but typically compiler-based instrumentation provides better 
> performance, and we intend to focus only on analyzing applications for 
> which source code is available. Studying instruction cache behavior 
> with compiler instrumentation can be challenging, however, so we plan 
> to at least initially focus on data performance.
> Many of our planned tools target specific performance issues with data 
> accesses.  They employ the technique of *shadow memory* to store 
> metadata about application data references, using the compiler to 
> instrument loads and stores with code to update the shadow memory.  A 
> companion runtime library intercepts libc calls if necessary to update 
> shadow memory on non-application data references.  The runtime library 
> also intercepts heap allocations and other key events in order to 
> perform its analyses.  This is all very similar to how existing 
> sanitizers such as AddressSanitizer, ThreadSanitizer, MemorySanitizer, 
> etc. operate today.
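
As a rough illustration of the shadow-memory idea described above, here is a toy model that keeps one metadata slot per 64-byte unit of application memory. The granularity, the dict-backed store, and every name in it (`ShadowMemory`, `LINE_SHIFT`, `on_access`) are assumptions made for the sketch; the RFC does not specify esan's actual shadow layout.

```python
# Toy model of shadow memory: one metadata slot per 64-byte line.
# LINE_SHIFT and the dict-backed store are illustrative assumptions,
# not the layout proposed for esan.
LINE_SHIFT = 6  # 2**6 = 64-byte granularity (assumed)


class ShadowMemory:
    def __init__(self):
        self._slots = {}  # shadow index -> number of accesses seen

    @staticmethod
    def shadow_index(addr):
        """Map an application address to its shadow slot."""
        return addr >> LINE_SHIFT

    def on_access(self, addr, size=1):
        """What an instrumented load or store would call."""
        first = self.shadow_index(addr)
        last = self.shadow_index(addr + size - 1)
        # An access that straddles a line boundary updates both slots.
        for idx in range(first, last + 1):
            self._slots[idx] = self._slots.get(idx, 0) + 1

    def count(self, addr):
        """Metadata lookup for the line containing addr."""
        return self._slots.get(self.shadow_index(addr), 0)
```

A real sanitizer would presumably compute the shadow address with a fixed shift and offset into a reserved region rather than a hash lookup; the dict only keeps the sketch short.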
> ====================
> Example Tools
> ====================
> We have several initial tools that we plan to build.  These are not 
> necessarily novel ideas on their own: some of these have already been 
> explored in academia.  The idea is to create practical, low-overhead, 
> robust, and publicly available versions of these tools.
> *Cache fragmentation*: this tool gathers data structure field hotness 
> information, looking for data layout optimization opportunities by 
> grouping hot fields together to avoid data cache fragmentation.  
> Future enhancements may add field affinity information if it can be 
> computed with low enough overhead.
I can vaguely imagine how this data would be acquired, but I'm 
more interested in what analysis is provided by the tool, and how this 
information would be presented to a user.  Would it be a flat list of 
classes, sorted by number of accesses, with each field annotated by 
number of accesses?  Or is there some other kind of presentation 
planned?  Maybe some kind of weighting for classes with frequent cache 
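
To make the presentation question concrete, the flat-list option could look roughly like the sketch below: structs sorted by total accesses, each field annotated with its own count. The input format (one `(struct, field)` pair per access) and the function names are hypothetical; nothing in the RFC says the tool would report this way.

```python
from collections import defaultdict


def build_report(accesses):
    """accesses: iterable of (struct_name, field_name), one pair per access.

    Returns a list of (struct, total, [(field, count), ...]) rows,
    hottest struct first and hottest field first within each struct.
    """
    per_struct = defaultdict(lambda: defaultdict(int))
    for struct, field in accesses:
        per_struct[struct][field] += 1

    report = []
    for struct, fields in per_struct.items():
        total = sum(fields.values())
        # Sort by descending count, then by name for a stable order.
        ranked = sorted(fields.items(), key=lambda kv: (-kv[1], kv[0]))
        report.append((struct, total, ranked))
    report.sort(key=lambda row: (-row[1], row[0]))
    return report


def format_report(report):
    """Render the rows as an indented plain-text list."""
    lines = []
    for struct, total, ranked in report:
        lines.append(f"{struct}: {total} accesses")
        for field, n in ranked:
            lines.append(f"  {field}: {n} ({100 * n // total}%)")
    return "\n".join(lines)
```

A per-struct weighting (for example by miss counts, if those were available) would only change the sort key in `build_report`.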

> *Working set measurement*: this tool measures the data working set 
> size of an application at each snapshot during execution.  It can help 
> to understand phased behavior as well as providing basic direction for 
> further effort by the developer: e.g., knowing whether the working set 
> is close to fitting in current L3 caches or is many times larger can 
> help determine where to spend effort.
I think my questions here are basically the reverse of my prior 
questions.  I can imagine the presentation (a graph with time on the X 
axis, working set measurement on the Y axis, with some markers 
highlighting key execution points).  I'm not sure how the data 
collection works though, or even really what is being measured.  Are you 
planning on counting the number of data bytes / data cache lines used 
during each time period?  For the purposes of this tool, when is data 
brought into the working set and when is data evicted from the working set?
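
One possible reading of "working set at each snapshot", sketched below, is the number of distinct cache lines touched during each fixed-length interval. Under that reading a line enters the working set on its first touch in an interval and leaves when the interval ends. The 64-byte line size, the `(timestamp, address)` trace format, and the function name are assumptions for illustration, not the proposed esan design.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


def working_set_per_snapshot(trace, snapshot_len):
    """trace: list of (timestamp, address) with non-negative integers.

    Returns one entry per interval of length snapshot_len: the number
    of distinct cache lines touched during that interval.
    """
    if not trace:
        return []
    lines_by_interval = {}
    for ts, addr in trace:
        interval = ts // snapshot_len
        lines_by_interval.setdefault(interval, set()).add(addr // LINE_SIZE)
    last = max(lines_by_interval)
    # Intervals with no accesses report a working set of zero lines.
    return [len(lines_by_interval.get(i, ())) for i in range(last + 1)]
```

Multiplying each entry by `LINE_SIZE` gives bytes, which is the quantity one would compare against an L3 capacity.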

> *Dead store detection*: this tool identifies dead stores 
> (write-after-write patterns with no intervening read) as well as 
> redundant stores (writes of the same value already in memory).  Xref 
> the Deadspy paper from CGO 2012.
> *Single-reference*: this tool identifies data cache lines brought in 
> but only read once.  These could be candidates for non-temporal loads.
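
The dead-store and redundant-store definitions above can be checked against a simple trace model. The sketch below keeps per-address state (last value written, and the last write not yet read), which plays the role of shadow metadata. The trace format and the function name are hypothetical, and this is not the Deadspy algorithm or an esan implementation.

```python
def find_store_issues(trace):
    """trace: iterable of (op, addr, value); op is 'R' or 'W'.

    The value is ignored for reads. Returns (dead, redundant):
      dead      - indices of stores overwritten before any read
      redundant - indices of stores that wrote the value already present
    """
    memory = {}  # addr -> last value written
    unread = {}  # addr -> index of the last write not yet read
    dead, redundant = [], []
    for i, (op, addr, value) in enumerate(trace):
        if op == "R":
            unread.pop(addr, None)  # the pending write has now been used
            continue
        if addr in memory and memory[addr] == value:
            redundant.append(i)
        if addr in unread:
            # Write-after-write with no intervening read: the earlier
            # store is the dead one.
            dead.append(unread[addr])
        memory[addr] = value
        unread[addr] = i
    return dead, redundant
```

A store still pending when the trace ends is not reported, since the model cannot tell whether it would have been read later.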
> ====================
> EfficiencySanitizer
> ====================
> We are proposing the name EfficiencySanitizer, or "esan" for short, to 
> refer to this suite of dynamic instrumentation tools for improving 
> program efficiency.  As we have a number of different tools that share 
> quite a bit of their implementation we plan to consider them sub-tools 
> under the EfficiencySanitizer umbrella, rather than adding a whole 
> bunch of separate instrumentation and runtime library components.
> While these tools are not addressing correctness issues like other 
> sanitizers, they will be sharing a lot of the existing sanitizer 
> runtime library support code.  Furthermore, users are already familiar 
> with the sanitizer brand, and it seems better to extend that concept 
> rather than add some new term.

Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
