[LLVMdev] GSoC Proposal: Profiling Enhancements

Tue May 1 17:56:07 PDT 2012

Hi Alastair,

I am interested in seeing this through. Do you have a GSoC mentor yet?

Evan

On Apr 5, 2012, at 11:18 AM, Alastair Murray wrote:

> Hello Everyone,
> 
> Before I get started I just want to sincerely apologise for not getting 
> feedback on this earlier.  I've had an extremely busy week as I was 
> presenting a paper at the CGO conference.  If anyone is able to provide 
> feedback in such a short time-frame then it will be gratefully received. 
>  If not, then I just hope the work described sounds useful.  I have 
> already submitted this proposal due to the short time until the 
> deadline, but I can make changes.  (By the way, no need to worry about 
> me being so busy during the coding period, see below.)
> 
> 
> 
> Proposal
> --------
> 
> === Overview ===
> 
> LLVM already contains a profiling framework, but only a handful of 
> transforms make use of the metadata.  Further, it even contains a path 
> profiling framework, but no transforms make use of it.  This "Google 
> Summer of Code" proposal lays out an achievable plan to enhance 
> profiling in LLVM and to use profiling metadata in key transformations 
> where it can have a strong positive effect.
> 
> 
> 
> === Why this is Useful for LLVM ===
> 
> Many profiling-data friendly transformations in LLVM do not currently 
> use that data.  Specific examples are loop-unrolling, loop-unswitching 
> and inlining that can all find good performance improvements, though at 
> the cost of increased code-size.  Using profiling data with these three 
> transformations would allow faster code in hot areas by applying these 
> more aggressively and smaller code in cold areas where minimising 
> instruction-cache misses is the key performance concern.
> 
> Additionally, the path profiling framework was added to LLVM in January 
> 2011 but hasn't been touched since (except for wide-spread API updates). 
>  As no transforms use the produced metadata the code never gets 
> tested (though the mailing lists suggest a small number of people may 
> still be using it for external purposes).  This code is in serious 
> danger of suffering from "bit-rot".  This proposal suggests enhancing a 
> transformation to use this information to improve code-quality, and as a 
> side-effect this stale code will be testable. Superblock formation 
> already has a basic implementation in LLVM (it is called 
> Tail-Duplication) but it does not use profiling information.  It can be 
> improved by using either basic profiling information (for small gains) 
> or path profiling information (for larger gains).
> 
> The aim of this proposal is that it should result in code which really 
> will be integrated into LLVM.  Scanning a small subset of previous 
> "Google Summer of Code" efforts (for many projects, not just LLVM) it 
> appeared that few projects were actually able to use the contributed 
> code.  Often students would do a good job, but would not quite 
> complete the work.  Once summer was over no-one had time to polish the 
> code, so it would be left unused.  With this in mind I am proposing an 
> achievable but useful project which is decomposable into related 
> sub-goals.  Time has been left in the proposal plan to produce code of 
> high enough quality to be committed to the LLVM repository, and to have 
> time to find sensible default parameters for heuristics that are 
> changed.  Also, risk is reduced as the project is decomposable into 
> seven largely independent sub-goals; even if one aspect does not work, 
> the other six should still be usable by the LLVM project.
> 
> 
> 
> === About Me ===
> 
> I am a PhD student at the University of Edinburgh, in the UK.  I am very 
> close to the end of my studies and have a few months of time available 
> that line-up perfectly with "Google Summer of Code" (I won't describe 
> why I have the time here, but feel free to ask me via private email or 
> IM if you're curious).  During my PhD I produced seven papers (five as 
> first author) based on research built on the GCC and SUIF compilers.  I 
> have over four years of full-time experience working with compilers, and 
> another two years during my undergraduate degree where compilers were 
> part of my studies.  My research has focussed on both code-generation 
> and middle-end transformations (my publications: 
> http://homepages.inf.ed.ac.uk/s0233454/pubs.html).  In the past I have 
> also undertaken internships which involved hand-coding assembly, so I've 
> no shortage of experience working on optimisation.
> 
> I have not worked with LLVM before, but I am genuinely very keen to do 
> so.  Looking at job advertisements and having just attended CGO 2012 I 
> noticed at both that LLVM is now used more than GCC (required for more 
> job postings, used in more papers).  That is one reason why this 
> proposal touches multiple transformations within LLVM: I'd like to learn 
> as much as possible about LLVM while using my existing skills to produce 
> something useful for the project.  Having an expert mentor me through 
> LLVM would be an excellent opportunity.  Hopefully my general compiler 
> experience can offset my lack of LLVM experience, and it is always fun 
> to get to use new technologies.  Additionally I have worked 
> on multiple projects written in C++ during my studies so the programming 
> aspects of this proposal will not present any issues.
> 
> Contact information:
> 	* CUT -- This is in the official submission *
> 
> 
> 
> === Further Proposal Details ===
> 
> All of the below used http://llvm.org/OpenProjects.html and 
> http://nondot.org/sabre/LLVMNotes/ as a starting point.
> 
> The plan is to become familiar with working within the LLVM 
> infrastructure during the "Bonding Period".  The best way to do that, of 
> course, is to get stuck in, so I plan to add a few small benchmark items 
> to "test-suite".  One of the above links says that DSPStone and UTDSP 
> should be added.  As I have worked with both of these before I will add 
> these to "test-suite".  The process of doing so is probably the best way 
> to learn how the LLVM testing infrastructure works.  UTDSP, however, is 
> non-free for commercial use, so I will only be able to add that as an 
> external benchmark (like SPEC).
> 
> Further to this, the path-profiling pass currently in LLVM does not seem 
> to have a test-case in "test", so I will create one to aid me in 
> learning how verification works in LLVM.
> 
> Once the coding period begins I will first want to start working with 
> the profiling infrastructure directly.  I will modify the profiler so 
> that run-time instrumentation is not inserted for loops with iteration 
> counts that are known at compile time.  This will be added to 
> OptimalEdgeProfiling.cpp, and assuming it has a very-low compile-time 
> overhead it will also be added to EdgeProfiling.cpp.  Test-cases will be 
> created for "test", and compile-time overhead and run-time benefits will 
> be measured using "test-suite".
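A minimal sketch of the idea in the paragraph above, written as a toy model rather than against the real LLVM API. The names LoopSketch, loopsNeedingCounters and staticBodyCount are hypothetical, and "trip count" is assumed here to mean the number of times the loop body runs per entry into the loop. Loops whose trip count is a compile-time constant get no run-time counter; their body count is derived from the count of the block that enters the loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for what a profiler pass might know about a loop.
struct LoopSketch {
    int headerBlockId;             // id of the loop header block
    bool hasConstTripCount;        // true if the trip count is known statically
    std::uint64_t constTripCount;  // body executions per loop entry (if known)
};

// Returns the header ids of loops that still need a run-time counter,
// i.e. those whose trip count is NOT known at compile time.
std::vector<int> loopsNeedingCounters(const std::vector<LoopSketch>& loops) {
    std::vector<int> result;
    for (const LoopSketch& loop : loops) {
        if (!loop.hasConstTripCount)
            result.push_back(loop.headerBlockId);
    }
    return result;
}

// For a loop with a known trip count, the body execution count follows
// from how often the loop was entered, so no counter is needed inside it.
std::uint64_t staticBodyCount(const LoopSketch& loop, std::uint64_t entryCount) {
    assert(loop.hasConstTripCount);
    return entryCount * loop.constTripCount;
}
```

For example, a loop with a constant trip count of 8 that is entered 5 times has a body count of 40 without any instrumentation inside the loop.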
> 
> After this three transforms will be modified to exploit profiling data: 
> loop unrolling (LoopUnrollPass.cpp), unswitching (LoopUnswitch.cpp) and 
> inlining (Inliner.cpp).  This will be done by modifying the cost 
> heuristics (e.g. in LoopUnroll::runOnLoop or Inliner::shouldInline). 
> Initially this will be done by raising the threshold which limits the 
> transformation on hot code, and lowering it on cold code (changing both 
> so perhaps the overall produced code size will not change much).  If 
> required (or if there is time) more sophisticated heuristics will be 
> evaluated, e.g. using the caller/callee site execution ratio in the 
> inliner.  Attempts will also be made to try and keep profiling data 
> correct after transformations, so e.g. a hot loop with a heavily biased 
> conditional branch inside it can be unswitched and have only one side 
> unrolled.  Test-cases will be added to "test" designed so that identical 
> loops and functions exist but with different execution counts, so 
> differing behaviour on the hot/cold samples should be observed. 
> Performance will be evaluated using "test-suite" again, though with one 
> addition.  To ensure that the produced code is not over-specialised to 
> the training profiling data an additional set of data will be used. 
> MiDatasets (http://ctuning.org/wiki/index.php/CTools:CBench:Downloads) 
> provides alternative inputs for the MiBench suite, so that will be used 
> to test this.
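The initial heuristic described above (raise the threshold on hot code, lower it on cold code) could look roughly like the following sketch. It is a toy model, not the LLVM cost-model API: ThresholdPolicy, profileGuidedThreshold and all of the tuning values are hypothetical, and hot/cold is assumed to be decided by comparing an execution count against the largest count in the profile. Without profiling data the base threshold is returned unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tuning knobs; sensible values would have to be found
// experimentally.
struct ThresholdPolicy {
    unsigned baseThreshold;  // threshold used when no profile is available
    unsigned hotMultiplier;  // base is multiplied by this for hot code
    unsigned coldDivisor;    // base is divided by this for cold code
    double hotFraction;      // count >= hotFraction * maxCount  => hot
    double coldFraction;     // count <= coldFraction * maxCount => cold
};

// Picks the size threshold for one loop or call site from its execution
// count. Falls back to the base threshold when there is no profile.
unsigned profileGuidedThreshold(const ThresholdPolicy& policy, bool haveProfile,
                                std::uint64_t count, std::uint64_t maxCount) {
    assert(policy.coldDivisor != 0);
    if (!haveProfile || maxCount == 0)
        return policy.baseThreshold;
    const double relative =
        static_cast<double>(count) / static_cast<double>(maxCount);
    if (relative >= policy.hotFraction)
        return policy.baseThreshold * policy.hotMultiplier;
    if (relative <= policy.coldFraction)
        return policy.baseThreshold / policy.coldDivisor;
    return policy.baseThreshold;
}
```

Because hot thresholds grow while cold thresholds shrink, the overall code size may stay roughly constant, although that depends entirely on the chosen parameters.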
> 
> The next large chunk of work will be to make use of the existing path 
> profiling code (PathProfiling.cpp).  Superblock Formation will be done 
> via Tail Duplication (already implemented in TailDuplication.cpp). 
> Basic profiling data should provide small gains for this, but the true 
> expected gains come from using path-profiling to create the superblock 
> such that there is a "trace" through it.  For the hot case this should result 
> in good instruction locality, good branch-prediction behaviour and good 
> scheduling for the "trace".  Path-profiling may need to be fixed or 
> enhanced for this, as it is currently unused it is difficult to know how 
> well it will work. (Note: I'm stating that it is unused based on the 
> fact that nothing includes PathProfileInfo.h except the path profiling 
> itself — it is, however, possible that tools outside the core llvm use 
> it though I couldn't find anything by means of an internet search). This 
> will be tested and evaluated in the same way as the previously modified 
> transformations.
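To illustrate how path-profile data could drive superblock formation, here is a toy sketch that does not use the actual PathProfiling or TailDuplication code. PathCount, hottestPath and blocksToDuplicate are hypothetical names, basic blocks are plain integer ids, and paths are assumed to be acyclic. The hottest path becomes the trace; from the first block that has a side entrance onward, the rest of the trace is the tail that would be duplicated so the trace has a single entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One profiled path: the blocks it visits and how often it executed.
struct PathCount {
    std::vector<int> blocks;  // block ids in execution order
    std::uint64_t count;      // number of times this exact path ran
};

// Returns the index of the most frequently executed path.
std::size_t hottestPath(const std::vector<PathCount>& paths) {
    assert(!paths.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < paths.size(); ++i) {
        if (paths[i].count > paths[best].count)
            best = i;
    }
    return best;
}

// Returns the tail of the trace that must be duplicated: every block from
// the first side entrance (an incoming edge that does not come from the
// previous trace block) to the end of the trace.
std::vector<int> blocksToDuplicate(
        const std::vector<int>& trace,
        const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> tail;
    bool sideEntranceSeen = false;
    for (std::size_t i = 1; i < trace.size(); ++i) {
        for (const std::pair<int, int>& edge : edges) {
            if (edge.second == trace[i] && edge.first != trace[i - 1])
                sideEntranceSeen = true;
        }
        if (sideEntranceSeen)
            tail.push_back(trace[i]);
    }
    return tail;
}
```

On a diamond-shaped CFG (0->1, 0->2, 1->3, 2->3, 3->4) where the path 0,1,3,4 is hottest, blocks 3 and 4 form the tail: the edge 2->3 is a side entrance, so the cold path would be redirected to copies of 3 and 4.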
> 
> In all the above cases the modifications will be made with the aim that 
> when profiling data is not available then compiler behaviour will remain 
> unchanged from its current state.  This will be verified by ensuring 
> identical binaries are produced for "test-suite" (LLVM-head vs my local 
> LLVM without profiling).
> 
> The final changes to be made will be to "llvm-prof". It has been noted 
> on the mailing list that it is not currently compatible with path-based 
> profiling, so support for this will be added.  Finally, overall 
> performance evaluations and code-cleanups will be performed with the aim 
> of having the code integrated into LLVM.  This will hopefully be done 
> with the help of the llvm-commits mailing list.
> 
> 
> 
> === Deliverables ===
> 
> * Additions to test-suite.
> * Reduced profile instrumentation run-time overhead.
> * Profile enhanced loop-unrolling.
> * Profile enhanced loop-unswitching.
> * Profile enhanced inlining.
> * Tested (and fixed if required) path profiling.
> * Profile and path profile enhanced tail-duplication
>   (superblock formation).
> * llvm-prof fixed to work with path profiling.
> 
> 
> 
> === Timeline ===
> 
> During "Bonding Period":
> 	* Install/Setup LLVM (and Clang etc.) and run "test".
> 	* Install "test-suite" and run with profiling.
> 	* Re-read textbooks/papers concerning profiling, path-profiling and
>           relevant transformations.
> 	* Read any relevant LLVM documentation.
> 	* Add DSPStone to test-suite. (I have worked with it before.)
> 	* Add UTDSP to test-suite as an "External" benchmark suite
>           (like SPEC). (I have also worked with this before.)
> 	* Write test for path-profiling (there is currently not one
>           in "test").
> 
> Week 1:  Eliminate profiling instrumentation for loops where
>          the iteration count is known at compile time.
> Weeks 2-5:  Have the heuristics for loop unrolling, unswitching
>             and inlining use profiling information. Add MiDatasets
>             to test-suite (this may not be suitable for the
>             repository though).
> Weeks 6-9: Enhance Tail-Duplication to use first basic profiling
>            and then path profiling.  Test and fix path profiling.
> Week 10: Fix llvm-prof to work with path-profiling.
> Week 11: Performance analysis and code-cleanup.
> Week 12: Separate code into sets of independent patches for review
>          on llvm-commits mailing list. Write final reports.
> 
> Some aspects of week 12 may well be ready for review earlier than week 
> 12, but I don't want to over-complicate the timeline.
> 
> Thank you for taking the time to read this,
> Alastair Murray.