[LLVMdev] GSoC Proposal: Profiling Enhancements
Evan Cheng
evan.cheng at apple.com
Tue May 1 17:56:07 PDT 2012
Hi Alastair,
I am interested in seeing this through. Do you have a GSoC mentor yet?
Evan
On Apr 5, 2012, at 11:18 AM, Alastair Murray wrote:
> Hello Everyone,
>
> Before I get started I just want to sincerely apologise for not getting
> feedback on this earlier. I've had an extremely busy week as I was
> presenting a paper at the CGO conference. If anyone is able to provide
> feedback in such a short time-frame then it will be gratefully received.
> If not, then I just hope the work described sounds useful. I have
> already submitted this proposal due to the short time until the
> deadline, but I can make changes. (By the way, no need to worry about
> me being so busy during the coding period, see below.)
>
>
>
> Proposal
> --------
>
> === Overview ===
>
> LLVM already contains a profiling framework, but only a handful of
> transforms make use of the metadata. Further, it even contains a path
> profiling framework, but no transforms make use of it. This "Google
> Summer of Code" proposal lays out an achievable plan to enhance
> profiling in LLVM and to use profiling metadata in key transformations
> where it can have a strong positive effect.
>
>
>
> === Why this is Useful for LLVM ===
>
> Many profiling-data friendly transformations in LLVM do not currently
> use that data. Specific examples are loop-unrolling, loop-unswitching
> and inlining that can all find good performance improvements, though at
> the cost of increased code-size. Using profiling data with these three
> transformations would allow faster code in hot areas by applying these
> more aggressively and smaller code cold areas where minimising
> instruction cache cold-hits is the key performance concern.
>
> Additionally, the path profiling framework was added to LLVM in January
> 2011 but hasn't been touched since (except for wide-spread API updates).
> As no transforms use the the produced metadata the code never gets
> tested (though the mailing lists suggest a small number of people may
> still be using it for external purposes). This code is in serious
> danger of suffering from "bit-rot". This proposal suggests enhancing a
> transformation to use this information to improve code-quality, and as a
> side-effect this stale code will be testable. Superblock formation
> already has a basic implementation in LLVM (it is called
> Tail-Duplication) but it does not use profiling information. It can be
> improved by using either basic profiling information (for small gains)
> or path profiling information (for larger gains).
>
> The aim of this proposal is that it should result in code which really
> will be integrated into LLVM. Scanning a small subset of previous
> "Google Summer of Code" efforts (for many projects, not just LLVM) it
> appeared that few projects were actually able to use the contributed
> code. Often students would do a good job, but would not quite truly
> complete the work. Once summer was over no-one had time to polish the
> code, so it would be left unused. With this in mind I am proposing an
> achievable but useful project which is decomposable into related
> sub-goals. Time has been left in the proposal plan to produce code of
> high enough quality to be committed to the LLVM repository, and to have
> time to find sensible default parameters for heuristics that are
> changed. Also, risk is reduced as the project is decomposable into
> seven largely independent sub-goals, even if one aspect does not work,
> the other six should still be usable by the LLVM project.
>
>
>
> === About Me ===
>
> I am a PhD student at the University of Edinburgh, in the UK. I am very
> close to the end of my studies and have a few months of time available
> that line-up perfectly with "Google Summer of Code" (I won't describe
> why I have the time here, but feel free to ask me via private email or
> IM if you're curious). During my PhD I produced seven papers (five as
> first author) based on research built on the GCC and SUIF compilers. I
> have over four years of full-time experience working with compilers, and
> another two years during my undergraduate degree where compilers were
> part of my studies. My research has focussed on both code-generation
> and middle-end transformations (my publications:
> http://homepages.inf.ed.ac.uk/s0233454/pubs.html). In the past I have
> also undertaken internships which involved hand-coding assembly, so I've
> no shortage of experience working on optimisation.
>
> I have not worked with LLVM before, but I am genuinely very keen to do
> so. Looking at job advertisements and having just attended CGO 2012 I
> noticed at both that LLVM is now used more than GCC (required for more
> job postings, used in more papers). That is one reason why this
> proposal touches multiple transformations within LLVM, I'd like to learn
> as much as possible about LLVM while using my existing skills to produce
> something useful for the project. The opportunity to have an expert
> help mentor me through LLVM is an excellent opportunity. Hopefully my
> general compiler experience can offset my lack of LLVM experience as it
> always fun to get to use new technologies. Additionally I have worked
> on multiple projects written in C++ during my studies so the programming
> aspects of this proposal will not present any issues.
>
> Contact information:
> * CUT -- This is in the official submission *
>
>
>
> === Further Proposal Details ===
>
> All of the below used http://llvm.org/OpenProjects.html and
> http://nondot.org/sabre/LLVMNotes/ as a starting point.
>
> The plan is to become familiar with working within the LLVM
> infrastructure during the "Bonding Period". The best way to do that, of
> course, is to get stuck in, so I plan to add a few small benchmark items
> to "test-suite". One of the above links says that DSPStone and UTDSP
> should be added. As I have worked with both of these before I will add
> these to "test-suite". The process of doing so is probably the best way
> to learn how the LLVM testing infrastructure works. UTDSP, however, is
> non-free for commercial use, so I will only be able to add that as an
> external benchmark (like SPEC).
>
> Further to this, the path-profiling pass currently in LLVM does not seem
> to have a test-case in "test", so I will create one to aid me in
> learning how verification works in LLVM.
>
> Once the coding period begins I will first want to start working with
> the profiling infrastructure directly. I will modify the profiler so
> that run-time instrumentation is not inserted for loops with iterations
> counts that are known at compile time. This will be added to
> OptimalEdgeProfiling.cpp, and assuming it has a very-low compile-time
> overhead it will also be added to EdgeProfiling.cpp. Test-cases will be
> created for "test", and compile-time overhead and run-time benefits will
> be measured using "test-suite".
>
> After this three transforms will be modified to exploit profiling data:
> loop unrolling (LoopUnrollPass.cpp), unswitching (LoopUnswitch.cpp) and
> inlining (Inliner.cpp). This will be done by modifying the cost
> heuristics (e.g. in LoopUnroll::runOnLoop or Inliner::shouldInline).
> Initially this will be done by raising the threshold which limits the
> transformation on hot code, and lowering it on cold code (changing both
> so perhaps the overall produced code size will not change much). If
> required (or if there is time) more sophisticated heuristics will be
> evaluated, e.g. using the caller/callee site execution ratio in the
> inliner. Attempts will also be made to try and keep profiling data
> correct after transformations, so e.g. a hot loop with a heavily biased
> conditional branch inside it can be unswitched and have only one side
> unrolled. Test-cases will be added to "test" designed so that identical
> loops and functions exist but with different execution counts, so
> differing behaviour on the hot/cold samples should be observed.
> Performance will be evaluated using "test-suite" again, though with one
> addition. To ensure that the produced code is not over-specialised to
> the training profiling data an additional set of data will be used.
> MiDatasets (http://ctuning.org/wiki/index.php/CTools:CBench:Downloads)
> provides alternatives inputs for the MiBench suite, so that will be used
> to test this.
>
> The next large chunk of work will be to make use of the exisiting path
> profiling code (PathProfiling.cpp). Superblock Formation will be done
> via Tail Duplication (already implemented in TailDuplication.cpp).
> Basic profiling data should provide small gains for this, but the true
> expected gains come from using path-profiling to create the superblock
> such that there is a "trace" through it. For the hot case should result
> in good instruction locality, good branch-prediction behaviour and good
> scheduling for the "trace". Path-profiling may need to be fixed or
> enhanced for this, as it is currently unused it is difficult to know how
> well it will work. (Note: I'm stating that it is unused based on the
> fact that nothing includes PathProfileInfo.h except the path profiling
> itself — it is, however, possible that tools outside the core llvm use
> it though I couldn't find anything by means of an internet search). This
> will be tested and evaluated in the same way as the previously modified
> transformations.
>
> In all the above cases the modifications will be made with the aim that
> when profiling data is not available then compiler behaviour will remain
> unchanged from its current state. This will be verified by ensuring
> identical binaries are produced for "llvm-suite" (LLVM-head vs my local
> LLVM without profiling).
>
> The final changes to be made will be to "llvm-prof". It has been noted
> on the mailing list that it is not currently compatible with path-based
> profiling, so support for this will be added. Finally, overall
> performance evaluations and code-cleanups will be performed with the aim
> of having the code integrated into LLVM. This will hopefully be done
> with the help of the llvm-commits mailing list.
>
>
>
> === Deliverables ===
>
> * Additions to test-suite.
> * Reduced profile instrumentation run-time overhead.
> * Profile enhanced loop-unrolling.
> * Profile enhanced loop-unswitching.
> * Profile enhanced inlining.
> * Tested (and fixed if required) path profiling.
> * Profile and path profile enhanced tail-duplication
> (superblock formation).
> * llvm-prof fixed to work with path profiling.
>
>
>
> === Timeline ===
>
> During "Bonding Period":
> * Install/Setup LLVM (and Clang etc.) and run "test".
> * Install "test-suite" and run with profiling.
> * Re-read textbooks/papers concerning profiling, path-profiling and
> relevant transformations.
> * Read any relevant LLVM documentation.
> * Add DSPStone to test-suite. (I have worked with it before.)
> * Add UTDSP to test-suite as an "External" benchmark suite
> (like SPEC). (I have also worked with this before.)
> * Write test for path-profiling (there is currently not one
> in "test").
>
> Week 1: Eliminate profiling instrumentation for loops where
> the iteration count is known at compile time.
> Weeks 2-5: Have the heuristics for loop unrolling, unswitching
> and inlining use profiling information. Add MiDatasets
> to test-suite (this may not be suitable for the
> repository though).
> Weeks 6-9: Enhance Tail-Duplication to use first basic profiling
> and then path profiling. Test and fix path profiling.
> Week 10: Fix llvm-prof to work with path-profiling.
> Week 11: Performance analysis and code-cleanup.
> Week 12: Separate code into sets of independent patches for review
> on llvm-commits mailing list. Write final reports.
>
> Some aspects of weeks 12 may well be ready for review earlier than week
> 12, but I don't want to over-complicate the timeline.
>
> Thank you for taking the time to read this,
> Alatair Murray.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list