[llvm-dev] RFC: EfficiencySanitizer
Xinliang David Li via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 20 11:13:14 PDT 2016
On Wed, Apr 20, 2016 at 11:00 AM, Adam Nemet <anemet at apple.com> wrote:
>
> On Apr 20, 2016, at 10:42 AM, Xinliang David Li via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
> On Tue, Apr 19, 2016 at 11:19 PM, Sean Silva via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Some of this data might be interesting for profile guidance. Are there
>> any plans there?
>>
>>
> Esan instrumentation is geared toward application-level tuning by
> developers -- the data collected here are not quite 'actionable' by the
> compiler directly. For instance, struct field reordering needs whole-program
> analysis and can be very tricky to do for C++ code with complex
> inheritance (e.g., the best base class field order may depend on the
> context of the inheritance). Fancy struct layout changes such as peeling,
> splitting/outlining, field inlining, etc. also require very good address
> escape analysis. Dead store detection can be used indirectly by the
> compiler -- the compiler certainly cannot use the information to prove
> statically that the stores are dead, but a compiler developer can use this
> tool to find such cases and figure out missing optimizations in the compiler.
>
>
> The compiler can also use the dead store information directly. If you look
> at the dead stores in hmmer in the DeadSpy paper (section 5.2), we now
> eliminate them with LLVM after http://reviews.llvm.org/D16712 because the loop
> is multi-versioned to disambiguate may-aliasing accesses. Essentially,
> any time we ask the user to add restrict, we have a chance to do that
> automatically with versioning, given enough confidence based on the profile
> information.
>
By 'directly', I mean that the compiler takes the feedback data and
directly eliminates the 'dead' stores it reports. Compiler changes or
user annotations are still needed in order for the compiler to prove
statically that the store is dead. That is why I said it is used
'indirectly' by the compiler (developers).
>
> Working set size data is also not quite usable by the compiler.
>
> Esan's design tradeoffs are also not the same as PGO. The former allows
> more overhead and is less restricted -- it can go deeper.
>
>
> Well, but those can just be different profiling levels that the user
> specifies up front.
>
By PGO, what I meant is the default PGO mode. The key point is that Esan
does not have to follow the same design considerations.
David
>
> Adam
>
>
> David
>
>
>
>> -- Sean Silva
>>
>> On Sun, Apr 17, 2016 at 2:46 PM, Derek Bruening via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> TL;DR: We plan to build a suite of compiler-based dynamic
>>> instrumentation tools for analyzing targeted performance problems. These
>>> tools will all live under a new "EfficiencySanitizer" (or "esan") sanitizer
>>> umbrella, as they will share significant portions of their implementations.
>>>
>>> ====================
>>> Motivation
>>> ====================
>>>
>>> Our goal is to build a suite of dynamic instrumentation tools for
>>> analyzing particular performance problems that are difficult to evaluate
>>> using other profiling methods. Modern hardware performance counters
>>> provide insight into where time is spent and when micro-architectural
>>> events such as cache misses are occurring, but they are of limited
>>> effectiveness for contextual analysis: it is not easy to answer *why* a
>>> cache miss occurred.
>>>
>>> Examples of tools that we have planned include: identifying wasted or
>>> redundant computation, identifying cache fragmentation, and measuring
>>> working sets. See more details on these below.
>>>
>>> ====================
>>> Approach
>>> ====================
>>>
>>> We believe that tools with overhead beyond about 5x are simply too
>>> heavyweight to easily apply to large, industrial-sized applications running
>>> real-world workloads. Our goal is for our tools to gather useful
>>> information with overhead less than 5x, and ideally closer to 3x, to
>>> facilitate deployment. We would rather trade off accuracy and build a
>>> less-accurate tool below our overhead ceiling than build a high-accuracy
>>> but slow tool. We hope to hit a sweet spot of tools that gather
>>> trace-based contextual information not feasible with pure sampling yet are
>>> still practical to deploy.
>>>
>>> In a similar vein, we would prefer a targeted tool that analyzes one
>>> particular aspect of performance with low overhead over a more general tool
>>> that can answer more questions but has high overhead.
>>>
>>> Dynamic binary instrumentation is one option for these types of tools,
>>> but typically compiler-based instrumentation provides better performance,
>>> and we intend to focus only on analyzing applications for which source code
>>> is available. Studying instruction cache behavior with compiler
>>> instrumentation can be challenging, however, so we plan to at least
>>> initially focus on data performance.
>>>
>>> Many of our planned tools target specific performance issues with data
>>> accesses. They employ the technique of *shadow memory* to store metadata
>>> about application data references, using the compiler to instrument loads
>>> and stores with code to update the shadow memory. A companion runtime
>>> library intercepts libc calls if necessary to update shadow memory on
>>> non-application data references. The runtime library also intercepts heap
>>> allocations and other key events in order to perform its analyses. This is
>>> all very similar to how existing sanitizers such as AddressSanitizer,
>>> ThreadSanitizer, MemorySanitizer, etc. operate today.
>>>
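The shadow memory technique described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not esan's implementation: the 64-byte cache line size, the `ShadowMemory` class, and the `on_access` hook are hypothetical names chosen for the example, and a dict stands in for the directly mapped shadow region a real runtime would reserve.

```python
CACHE_LINE_SHIFT = 6  # assumption: 64-byte cache lines


class ShadowMemory:
    """Toy shadow memory: one metadata counter per application cache line."""

    def __init__(self):
        # A dict stands in for a flat, directly mapped shadow region.
        self.cells = {}

    def shadow_index(self, addr):
        # Every address within one cache line maps to the same shadow cell.
        return addr >> CACHE_LINE_SHIFT

    def on_access(self, addr):
        # Stand-in for the code a compiler pass would insert next to each
        # load and store (hypothetical hook name).
        idx = self.shadow_index(addr)
        self.cells[idx] = self.cells.get(idx, 0) + 1


shadow = ShadowMemory()
for addr in (0x1000, 0x1008, 0x103F, 0x1040):
    shadow.on_access(addr)
```

The first three addresses fall in the same 64-byte line and so share one shadow cell, while 0x1040 starts the next line. What the shadow cell stores (a counter here) is what would differ from tool to tool.
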
>>> ====================
>>> Example Tools
>>> ====================
>>>
>>> We have several initial tools that we plan to build. These are not
>>> necessarily novel ideas on their own: some of these have already been
>>> explored in academia. The idea is to create practical, low-overhead,
>>> robust, and publicly available versions of these tools.
>>>
>>> *Cache fragmentation*: this tool gathers data structure field hotness
>>> information, looking for data layout optimization opportunities by grouping
>>> hot fields together to avoid data cache fragmentation. Future enhancements
>>> may add field affinity information if it can be computed with low enough
>>> overhead.
>>>
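As a rough illustration of field hotness, the Python sketch below counts accesses per field from a trace of struct-relative offsets and then lists the fields hottest first. The struct layout, the trace, and the function names are all made up for the example; a real tool would have to derive field boundaries from the compiler's type information.

```python
from collections import Counter

# Hypothetical struct layout: field name -> (byte offset, size in bytes).
LAYOUT = {"id": (0, 8), "name": (8, 64), "hits": (72, 8), "notes": (80, 128)}


def field_at(offset):
    """Return the field that contains the given byte offset, or None."""
    for name, (start, size) in LAYOUT.items():
        if start <= offset < start + size:
            return name
    return None


def field_hotness(access_offsets):
    """Count accesses per field from a trace of struct-relative offsets."""
    return Counter(field_at(off) for off in access_offsets)


def hot_first_order(hotness):
    """List all fields, most frequently accessed first (ties keep layout order)."""
    return sorted(LAYOUT, key=lambda name: -hotness[name])


trace = [0, 72, 72, 0, 72, 8, 72]
hotness = field_hotness(trace)
order = hot_first_order(hotness)
```

In this trace `hits` and `id` are the hot fields but sit 72 bytes apart, on different 64-byte lines; the hot-first order suggests placing them next to each other.
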
>>> *Working set measurement*: this tool measures the data working set size
>>> of an application at each snapshot during execution. It can help to
>>> understand phased behavior as well as providing basic direction for further
>>> effort by the developer: e.g., knowing whether the working set is close to
>>> fitting in current L3 caches or is many times larger can help determine
>>> where to spend effort.
>>>
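One simple way to picture the working set measurement is to count the distinct cache lines touched between snapshots. The Python sketch below does exactly that over an address trace; the 64-byte line size, the fixed snapshot interval counted in accesses, and the function name are assumptions for illustration, not a description of how esan takes its snapshots.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines


def working_set_sizes(addresses, snapshot_every):
    """Return the bytes of distinct cache lines touched in each snapshot window."""
    sizes = []
    touched = set()
    for i, addr in enumerate(addresses, start=1):
        touched.add(addr // CACHE_LINE_BYTES)
        if i % snapshot_every == 0:
            sizes.append(len(touched) * CACHE_LINE_BYTES)
            touched = set()
    if touched:
        # Report the final, partial window as well.
        sizes.append(len(touched) * CACHE_LINE_BYTES)
    return sizes


trace = [0, 8, 64, 72, 4096, 4100, 4104, 4160, 0]
sizes = working_set_sizes(trace, snapshot_every=4)
```

Comparing the per-window sizes against a cache capacity gives the kind of coarse guidance mentioned above, and changes between windows hint at program phases.
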
>>> *Dead store detection*: this tool identifies dead stores
>>> (write-after-write patterns with no intervening read) as well as redundant
>>> stores (writes of the same value already in memory). See the DeadSpy
>>> paper from CGO 2012.
>>>
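The two store patterns can be sketched with a small per-address state machine. The Python model below is a simplification under stated assumptions: the trace format, the `analyze_stores` name, and the byte-exact address matching are hypothetical, and it is neither DeadSpy's algorithm nor esan's implementation.

```python
def analyze_stores(trace):
    """Classify stores in a trace of ("load", addr) / ("store", addr, value) events.

    Returns (dead, redundant): trace indices of stores overwritten before any
    read, and trace indices of stores that wrote the value already present.
    """
    memory = {}        # addr -> current value
    unread_store = {}  # addr -> index of the last store not yet read
    dead, redundant = [], []
    for i, event in enumerate(trace):
        if event[0] == "load":
            # A read makes the previous store to this address live.
            unread_store.pop(event[1], None)
            continue
        _, addr, value = event
        if addr in memory and memory[addr] == value:
            redundant.append(i)
        if addr in unread_store:
            # Write-after-write with no intervening read: earlier store is dead.
            dead.append(unread_store[addr])
        memory[addr] = value
        unread_store[addr] = i
    return dead, redundant


trace = [
    ("store", 0x10, 1),  # 0: overwritten by event 1 before any load
    ("store", 0x10, 2),  # 1
    ("load", 0x10),      # 2
    ("store", 0x10, 2),  # 3: same value as already in memory
    ("store", 0x20, 7),  # 4: never overwritten
]
dead, redundant = analyze_stores(trace)
```

Here `unread_store` plays the role of the shadow state: it records, per address, whether the last access was a write that nobody has read yet.
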
>>> *Single-reference*: this tool identifies data cache lines brought in but
>>> only read once. These could be candidates for non-temporal loads.
>>>
>>> ====================
>>> EfficiencySanitizer
>>> ====================
>>>
>>> We are proposing the name EfficiencySanitizer, or "esan" for short, to
>>> refer to this suite of dynamic instrumentation tools for improving program
>>> efficiency. As we have a number of different tools that share quite a bit
>>> of their implementation, we plan to consider them sub-tools under the
>>> EfficiencySanitizer umbrella, rather than adding a whole bunch of separate
>>> instrumentation and runtime library components.
>>>
>>> While these tools are not addressing correctness issues like other
>>> sanitizers, they will be sharing a lot of the existing sanitizer runtime
>>> library support code. Furthermore, users are already familiar with the
>>> sanitizer brand, and it seems better to extend that concept rather than add
>>> some new term.
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>