[llvm-dev] RFC: EfficiencySanitizer

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 20 10:57:03 PDT 2016


On Tue, Apr 19, 2016 at 11:14 PM, Adam Nemet via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Interesting idea!  I understand how the bookkeeping in the tool is similar
> to some of the sanitizers but I am wondering whether that is really the
> best developer’s work-flow for such a tool.
>
> I could imagine that some of the opportunities discovered by the tool
> could be optimized automatically by the compiler (e.g. temporal loads, sw
> prefetching, partitioning the heap) so feeding this information back to the
> compiler could be highly useful.  I am wondering whether the PGO model is
> closer to what we want at the end.  The problem can also be thought of as a
> natural extension of PGO.  Besides instrumenting branches and indirect
> calls, it adds instrumentation for loads and stores.
>
> We have internally been discussing ways to use PGO for optimization
> diagnostics (a continuation of Tyler’s work, see
> http://blog.llvm.org/2014/11/loop-vectorization-diagnostics-and.html).
> The idea is to help the developer to focus in on opportunities in hot code.
> It seems that the diagnostics provided by your tools could be emitted
> directly by the analyses in LLVM during the profile-use phase.
>

You are right that some of the information is already available with PGO +
static analysis -- one example is field affinity data.  Instruction working
set data is also 'roughly' available.

However, I think Esan can potentially do a better job, be easier to use, and
serve as a centralized place for getting such information.

David


>
> Adam
>
> On Apr 17, 2016, at 2:46 PM, Derek Bruening via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> TL;DR: We plan to build a suite of compiler-based dynamic instrumentation
> tools for analyzing targeted performance problems.  These tools will all
> live under a new "EfficiencySanitizer" (or "esan") sanitizer umbrella, as
> they will share significant portions of their implementations.
>
> ====================
> Motivation
> ====================
>
> Our goal is to build a suite of dynamic instrumentation tools for
> analyzing particular performance problems that are difficult to evaluate
> using other profiling methods.  Modern hardware performance counters
> provide insight into where time is spent and when micro-architectural
> events such as cache misses are occurring, but they are of limited
> effectiveness for contextual analysis: it is not easy to answer *why* a
> cache miss occurred.
>
> Examples of tools that we have planned include: identifying wasted or
> redundant computation, identifying cache fragmentation, and measuring
> working sets.  See more details on these below.
>
> ====================
> Approach
> ====================
>
> We believe that tools with overhead beyond about 5x are simply too
> heavyweight to easily apply to large, industrial-sized applications running
> real-world workloads.  Our goal is for our tools to gather useful
> information with overhead less than 5x, and ideally closer to 3x, to
> facilitate deployment.  We would rather trade off accuracy and build a
> less-accurate tool below our overhead ceiling than build a high-accuracy
> but slow tool.  We hope to hit a sweet spot of tools that gather
> trace-based contextual information not feasible with pure sampling yet are
> still practical to deploy.
>
> In a similar vein, we would prefer a targeted tool that analyzes one
> particular aspect of performance with low overhead to a more general tool
> that can answer more questions but has high overhead.
>
> Dynamic binary instrumentation is one option for these types of tools, but
> typically compiler-based instrumentation provides better performance, and
> we intend to focus only on analyzing applications for which source code is
> available.  Studying instruction cache behavior with compiler
> instrumentation can be challenging, however, so we plan to at least
> initially focus on data performance.
>
> Many of our planned tools target specific performance issues with data
> accesses.  They employ the technique of *shadow memory* to store metadata
> about application data references, using the compiler to instrument loads
> and stores with code to update the shadow memory.  A companion runtime
> library intercepts libc calls if necessary to update shadow memory on
> non-application data references.  The runtime library also intercepts heap
> allocations and other key events in order to perform its analyses.  This is
> all very similar to how existing sanitizers such as AddressSanitizer,
> ThreadSanitizer, MemorySanitizer, etc. operate today.
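
A minimal sketch of the shadow-memory idea described above, written as a toy
Python model rather than real instrumentation.  The 64-byte granularity, the
class name, and the on_access hook are all assumptions made for illustration;
the RFC does not specify a shadow layout.

```python
CACHE_LINE_SHIFT = 6  # assumption: 64-byte cache lines


class ShadowMemory:
    """Toy shadow map: one metadata entry per application cache line."""

    def __init__(self):
        # Sparse map from cache-line index to a metadata value.
        self._shadow = {}

    @staticmethod
    def line_of(addr):
        # Map an application address to its cache-line index.
        return addr >> CACHE_LINE_SHIFT

    def on_access(self, addr, size):
        # Hypothetical hook that an instrumented load or store would call.
        first = self.line_of(addr)
        last = self.line_of(addr + size - 1)
        for line in range(first, last + 1):
            self._shadow[line] = 1  # mark the line as touched

    def touched_lines(self):
        return len(self._shadow)


sm = ShadowMemory()
sm.on_access(0x1000, 8)   # falls entirely within one line
sm.on_access(0x103C, 8)   # straddles two adjacent lines
```

A real tool would presumably use a fixed address transformation into a
reserved shadow region instead of a dictionary, as the existing sanitizers do.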
>
> ====================
> Example Tools
> ====================
>
> We have several initial tools that we plan to build.  These are not
> necessarily novel ideas on their own: some of these have already been
> explored in academia.  The idea is to create practical, low-overhead,
> robust, and publicly available versions of these tools.
>
> *Cache fragmentation*: this tool gathers data structure field hotness
> information, looking for data layout optimization opportunities by grouping
> hot fields together to avoid data cache fragmentation.  Future enhancements
> may add field affinity information if it can be computed with low enough
> overhead.
>
> *Working set measurement*: this tool measures the data working set size of
> an application at each snapshot during execution.  It can help to
> understand phased behavior as well as providing basic direction for further
> effort by the developer: e.g., knowing whether the working set is close to
> fitting in current L3 caches or is many times larger can help determine
> where to spend effort.
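
As a rough illustration of the working set measurement above, the sketch below
counts distinct cache lines touched between snapshots of an access trace.  The
trace format, the 64-byte line size, and the fixed snapshot interval are
assumptions for the example, not part of the proposal.

```python
LINE_SIZE = 64  # assumption: 64-byte cache lines


def working_set_sizes(accesses, snapshot_every):
    """Return the bytes touched in each snapshot window.

    accesses is an iterable of (address, size) pairs; a snapshot is taken
    every snapshot_every accesses.
    """
    sizes = []
    touched = set()
    for i, (addr, size) in enumerate(accesses, start=1):
        first = addr // LINE_SIZE
        last = (addr + size - 1) // LINE_SIZE
        touched.update(range(first, last + 1))
        if i % snapshot_every == 0:
            sizes.append(len(touched) * LINE_SIZE)
            touched.clear()
    if touched:
        # Report the final, partial window as well.
        sizes.append(len(touched) * LINE_SIZE)
    return sizes


trace = [(0, 4), (8, 4), (64, 4), (4096, 4)]
sizes = working_set_sizes(trace, snapshot_every=2)
```

Comparing the per-snapshot sizes against the cache capacity gives the kind of
"fits in L3 or not" signal the paragraph above mentions.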
>
> *Dead store detection*: this tool identifies dead stores
> (write-after-write patterns with no intervening read) as well as redundant
> stores (writes of the same value already in memory).  Xref the Deadspy
> paper from CGO 2012.
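
The two store categories above can be sketched as a small state machine over a
memory trace.  This is only a toy model under assumed inputs (a list of read
and write events at whole-address granularity); it is not the Deadspy
algorithm or the planned esan implementation.

```python
def classify_stores(trace):
    """Count dead and redundant stores in a trace.

    trace is an iterable of ("r", addr) or ("w", addr, value) tuples.
    A store is counted as dead when a later write hits the same address
    with no intervening read, and as redundant when it writes the value
    already in memory.  The two counts are tracked independently.
    """
    memory = {}            # addr -> last value written
    unread_write = set()   # addrs whose latest write has not been read yet
    dead = 0
    redundant = 0
    for op in trace:
        if op[0] == "r":
            unread_write.discard(op[1])
            continue
        _, addr, value = op
        if addr in memory and memory[addr] == value:
            redundant += 1
        if addr in unread_write:
            dead += 1  # the previous write was overwritten before any read
        memory[addr] = value
        unread_write.add(addr)
    return dead, redundant


trace = [("w", 0x10, 1), ("w", 0x10, 2), ("r", 0x10), ("w", 0x10, 2)]
dead, redundant = classify_stores(trace)
```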
>
> *Single-reference*: this tool identifies data cache lines brought in but
> only read once.  These could be candidates for non-temporal loads.
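
A simplified sketch of the single-reference idea: count reads per cache line
and report the lines read exactly once.  It assumes a 64-byte line and ignores
eviction entirely, so it only approximates "brought in but only read once";
the function name and trace format are made up for the example.

```python
from collections import Counter


def single_reference_lines(read_addrs, line_size=64):
    """Return the sorted cache-line indices that were read exactly once."""
    counts = Counter(addr // line_size for addr in read_addrs)
    return sorted(line for line, n in counts.items() if n == 1)


lines = single_reference_lines([0, 8, 64, 128, 130])
```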
>
> ====================
> EfficiencySanitizer
> ====================
>
> We are proposing the name EfficiencySanitizer, or "esan" for short, to
> refer to this suite of dynamic instrumentation tools for improving program
> efficiency.  As we have a number of different tools that share quite a bit
> of their implementation, we plan to consider them sub-tools under the
> EfficiencySanitizer umbrella, rather than adding a whole bunch of separate
> instrumentation and runtime library components.
>
> While these tools are not addressing correctness issues like other
> sanitizers, they will be sharing a lot of the existing sanitizer runtime
> library support code.  Furthermore, users are already familiar with the
> sanitizer brand, and it seems better to extend that concept rather than add
> some new term.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>