[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation

Sun Jun 19 10:26:48 PDT 2016

Hi Vivek,

I have one question (and I apologize if I missed this in your previous messages): Do you handle, or do you expect to handle, indirect function calls?  If so, how exactly are you going about doing that?  

For context, I’m interested because I’ve been working on a set of passes to do profile-based devirtualization and IPO, and I’m wondering if this pass could benefit from that.  Thanks,

--Vikram

// Vikram S. Adve
// Professor, Department of Computer Science
// University of Illinois at Urbana-Champaign
// vadve at illinois.edu
// http://llvm.org

> On Jun 19, 2016, at 7:46 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Date: Sun, 19 Jun 2016 16:59:27 +0530
> From: vivek pandya via llvm-dev <llvm-dev at lists.llvm.org>
> To: Quentin Colombet <qcolombet at apple.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>, Matthias Braun
> 	<matze at braunis.de>
> Subject: Re: [llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural
> 	Register	Allocation
> 
> Dear Community,
> 
> Please find a summary of the work done during this week below:
> 
> Implementation:
> ============
> 
> During this week we identified a bug in IPRA caused by not considering the
> RegMasks of function calls within a given machine function. The same bug on
> AArch64 was reported by Chad Rosier, and a more detailed description can
> be found at https://llvm.org/bugs/show_bug.cgi?id=28144 . To fix this bug,
> the RegMask calculation has been modified to consider the RegMasks of
> function calls in a machine function. The patch is here:
> http://reviews.llvm.org/D21395.
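
The fix described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the register names, `ALL_REGS`, and `clobbered_regs` are hypothetical and are not part of LLVM or of the patch. A function's clobber set is taken to be the registers it writes itself plus everything that the calls inside it do not preserve.

```python
# Toy model only: register names and helper names are made up for
# illustration; this is not LLVM's actual register usage collection code.
ALL_REGS = frozenset({"r0", "r1", "r2", "r3", "r4", "r5"})


def clobbered_regs(own_defs, call_preserved_masks):
    """Registers a caller must assume are clobbered by this function.

    own_defs: registers written directly by the function's instructions.
    call_preserved_masks: one set per call site, listing the registers
        that the call preserves (the meaning of a RegMask operand).
    """
    clobbered = set(own_defs)
    for preserved in call_preserved_masks:
        # Anything a callee does not preserve is clobbered transitively.
        clobbered |= ALL_REGS - preserved
    return frozenset(clobbered)


# The function writes r0 itself and makes one call preserving only r4, r5.
buggy = clobbered_regs({"r0"}, [])  # ignores the call site
fixed = clobbered_regs({"r0"}, [frozenset({"r4", "r5"})])
```

In this model `buggy` reports only `r0`, so a caller could wrongly keep a live value in `r1` across the call, while `fixed` also reports the registers clobbered through the nested call.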
> 
> AsmPrinter.cpp has been modified to print call-preserved registers in
> comments at call sites in the generated assembly file. This was suggested
> by Quentin Colombet to improve the readability of asm files while
> experimenting with RegMasks, calling conventions, etc. This simple patch
> can be found here: http://reviews.llvm.org/D21490.
> 
> We have also experimented with a simple improvement to IPRA: setting the
> callee-saved registers to none for local functions. We have observed some
> performance improvement from this.
> 
> Testing:
> ======
> 
> After the fix for bug 28144 there are no runtime failures in the test
> suite. Because of bug 28144 there had been about 60 runtime failures, and
> the total time taken to compile the test suite was 30% more than without
> IPRA. After the bug fix, the total compile time with IPRA improves on the
> time without IPRA by about 4 to 8 minutes.
> 
> 
> Study:
> =====
> 
> This week I studied the code responsible for adding spills and restores
> for callee-saved registers. I also studied how calling conventions are
> defined in target-specific .td files. I studied AsmPrinter.cpp, and
> specifically the emitComments() method, which is responsible for adding
> comments to LLVM-generated assembly files. I also studied some linkage
> types in LLVM IR, such as ‘internal’, which represents a function local to
> the module.
> 
> 
> Plan for next week:
> 
> 1) Submit patch related to local function optimization for review
> 
> 2) Find more possible improvements
> 
> 3) Get active patches committed
> 
> 4) Compile large software with IPRA enabled
> 
> 
> Sincerely,
> 
> Vivek
> 
> On Wed, Jun 15, 2016 at 8:45 AM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
> 
>> 
>> 
>> On Wed, Jun 15, 2016 at 8:40 AM, vivek pandya <vivekvpandya at gmail.com>
>> wrote:
>> 
>>> 
>>> 
>>> On Wed, Jun 15, 2016 at 6:16 AM, Quentin Colombet <qcolombet at apple.com>
>>> wrote:
>>> 
>>>> Hi Vivek,
>>>> 
>>>> How much of the slow down on runtime comes from the different layout of
>>>> the function in the asm file? (I.e., because of the dummy scc pass.)
>>>> 
>>> Hello Quentin,
>>> 
>>> Please disregard the previous results, as there was a major bug in the
>>> RegMask calculation: the RegMasks of callees in the MF body were not
>>> considered while calculating register usage information. That has been
>>> fixed now (as discussed with Matthias Braun and Mehdi Amini), and after
>>> this bug fix I have run the test-suite with and without IPRA. Yes, there
>>> is a runtime slowdown for some test cases, ranging from 1% to 64%;
>>> similarly, the compile time slowdown ranges from 1% to 48%. The runtime
>>> performance improvement ranges from 1% to 35%, and surprisingly there is
>>> also a compile time improvement ranging from 1% to 60%. I would request
>>> you to go through the complete results at
>>> https://docs.google.com/document/d/1cavn-POrZdhw-rrdPXV8mSvyppvOWs2rxmLgaOnd6KE/edit?usp=sharing
>>> 
>> In the above results the baseline is IPRA and current is without IPRA, so
>> data with a red background is actually an improvement and green is a
>> regression.
>> -Vivek
>> 
>>> Also, there are no extra failures due to IPRA now, so I have removed
>>> failures from the results above.
>>> 
>>> Sincerely,
>>> Vivek
>>> 
>>> 
>>>> Cheers,
>>>> Q
>>>> 
>>>> On 11 June 2016 at 21:49, vivek pandya via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>> 
>>>> Dear Community,
>>>> 
>>>> The patch for Interprocedural Register Allocation has now been
>>>> committed, thanks to Mehdi Amini. We would like you to play with it and
>>>> let us know your views and, more importantly, ideas to improve it.
>>>> 
>>>> The test-suite run has indicated some non-trivial issues that result in
>>>> runtime failures of the programs; we will be investigating them further.
>>>> Here are some stats:
>>>> 
>>>> The test-suite has been tested with IPRA enabled, and the overall results
>>>> are not very encouraging. On average there is a 30% increase in compile
>>>> time. Many programs also show an increase in execution time (20% on
>>>> average), which is a really serious concern for us. About 60 tests have
>>>> failed at run time, which indicates errors in compilation. However, 3
>>>> tests show an improvement in their runtime, 7% on average.
>>>> 
>>>> 
>>>> I think a good thing for me to learn this week was how to set up an
>>>> LLVM development environment properly; otherwise one can end up wasting
>>>> too much time building LLVM itself.
>>>> 
>>>> So here is a brief summary:
>>>> Implementation:
>>>> ============
>>>> 
>>>> The patch has been split into analysis and transformation passes. The
>>>> pass responsible for register usage propagation has been made target
>>>> independent. A print method and a command line option, -print-regusage,
>>>> have been added so that RegMask details can also be printed in Release
>>>> builds; this makes the lit test case testable in Release builds too.
>>>> Other minor changes were made to adhere to coding and naming conventions.
>>>> 
>>>> 
>>>> Testing:
>>>> ======
>>>> 
>>>> test-suite has been tested with IPRA enabled.
>>>> 
>>>> 
>>>> Study and other:
>>>> =============
>>>> 
>>>> Learned about LNT, the test-suite for LLVM, inline assembly in LLVM IR,
>>>> fastcc, local functions, and the MCStreamer class. In C++ I learned about
>>>> the emplace family of methods in the STL and perfect forwarding,
>>>> introduced in C++11.
>>>> 
>>>> 
>>>> Plan for next week:
>>>> 
>>>> 1) Investigate the functional correctness issue that leads to runtime
>>>> failures
>>>> 
>>>> 2) Profile the compilation process to verify the increase in time due to
>>>> IPRA
>>>> 
>>>> 3) Improve IPRA by instructing codegen not to save registers for local
>>>> functions.
>>>> 
>>>> 4) Make the pass emit asm comments indicating the registers clobbered by
>>>> a function call at the call site in the generated asm file.
>>>> 
>>>> 
>>>> Sincerely,
>>>> 
>>>> Vivek
>>>> 
>>>> On Sun, Jun 5, 2016 at 8:48 AM, vivek pandya <vivekvpandya at gmail.com>
>>>> wrote:
>>>> 
>>>>> Dear Community,
>>>>> 
>>>>> This week my patch was reviewed by my mentors, and I made changes based
>>>>> on that review. We also identified a problem with callee-saved registers
>>>>> being marked as clobbered registers, and we fixed that bug. I have
>>>>> described other minor changes in the following section.
>>>>> 
>>>>> The patch was expected to be committed by the end of this week, but due
>>>>> to an unexpected mistake I was not able to finish writing test cases.
>>>>> Sorry for that.
>>>>> I had built LLVM with IPRA enabled by default, and those build files
>>>>> were on my path! As a result, the next time I tried to build LLVM it was
>>>>> terribly slow (almost 1 hour for 10% of the build). I spent too much
>>>>> time fixing this by playing around with environment variables, cmake
>>>>> options, etc.
>>>>> But I think this is a serious concern: we need to verify this time
>>>>> complexity, otherwise building large software with IPRA enabled would
>>>>> be very time consuming.
>>>>> 
>>>>> The toughest part of this week was getting lit and FileCheck to work
>>>>> the way you expect them to, especially when an analysis pass prints
>>>>> info on stdio and there is also an output file generated by the llc or
>>>>> opt command.
>>>>> 
>>>>> So here is a brief summary:
>>>>> 
>>>>> Implementation:
>>>>> ============
>>>>> 
>>>>> RegUsageInfoCollector is now calling convention aware, so that the
>>>>> RegMask does not mark callee-saved registers as clobbered. Because of
>>>>> this, the register allocator can use callee-saved registers in the
>>>>> caller.
>>>>> PhysicalRegisterUsageInfo.cpp has been renamed to RegisterUsageInfo.cpp.
>>>>> The StringMap used in RegisterUsageInfo.cpp has been replaced by a
>>>>> DenseMap of GlobalVariable * to RegMask.
>>>>> DummyCGSCCPass has been moved from TargetPassConfig.cpp to
>>>>> CallGraphSCCPass.h.
>>>>> Minor corrections in comments and changes to adhere to the LLVM coding
>>>>> standards.
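
The calling-convention-aware behaviour described above can be illustrated with a minimal sketch. The function name `clobbers_seen_by_caller` and the register sets are assumptions chosen for illustration, not taken from the patch. Callee-saved registers are restored before the callee returns, so they are left out of the set that the caller must treat as clobbered.

```python
# Illustrative sketch only: the helper name and register sets are
# hypothetical and are not taken from any LLVM target description.
def clobbers_seen_by_caller(written_regs, callee_saved_regs):
    """Registers whose values may differ after the call returns.

    Callee-saved registers that the callee writes are spilled in its
    prologue and restored in its epilogue, so the caller never observes
    a change in them.
    """
    return frozenset(written_regs) - frozenset(callee_saved_regs)


written = {"rax", "rbx", "rcx"}   # registers the callee writes
callee_saved = {"rbx", "rbp"}     # preserved by the calling convention
visible = clobbers_seen_by_caller(written, callee_saved)
```

Here `rbx` is written by the callee but is not reported as clobbered, so in this model the caller's register allocator remains free to keep a live value in it across the call.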
>>>>> 
>>>>> Testing:
>>>>> =====
>>>>> 
>>>>> The above-mentioned changes have been tested with the SNU-Realtime
>>>>> benchmarks.
>>>>> 
>>>>> Studied the lit and FileCheck tools and wrote a simple test to verify
>>>>> the functionality of the code.
>>>>> 
>>>>> 
>>>>> Study and other:
>>>>> ============
>>>>> 
>>>>> Studied some examples of lit-compatible LLVM IR with RUN comments for
>>>>> test cases, the FileCheck tool syntax, and how to use it within the lit
>>>>> infrastructure.
>>>>> 
>>>>> I now also understand the X86 calling convention in more detail.
>>>>> 
>>>>> I also studied basic concepts of the LLVM IR language while reading .ll
>>>>> files written for lit.
>>>>> 
>>>>> I learned about rvalue references and move semantics, introduced in
>>>>> C++11.
>>>>> 
>>>>> 
>>>>> Plan for next week:
>>>>> 
>>>>> 1) Get the patch committed along with proper test cases.
>>>>> 
>>>>> 2) Analyse the time complexity of the approach.
>>>>> 
>>>>> 3) Move the target-specific pass into CodeGen, as it does not seem to
>>>>> need to be target specific.
>>>>> 
>>>>> 4) If possible, build a large application with IPRA enabled and analyze
>>>>> the impact.
>>>>> 
>>>>> 
>>>>> Sincerely,
>>>>> 
>>>>> Vivek
>>>>> 
>>>>> 
>>>>> On Sat, May 28, 2016 at 7:31 PM, vivek pandya <vivekvpandya at gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Dear community,
>>>>>> 
>>>>>> This is to brief you on the progress of Interprocedural Register
>>>>>> Allocation. Those who are interested in seeing the progress in terms
>>>>>> of code, please see http://reviews.llvm.org/D20769
>>>>>> This patch contains simple infrastructure to propagate the register
>>>>>> usage information of a callee to its caller in the call graph. The
>>>>>> code generation order is changed to follow bottom-up order on the call
>>>>>> graph, thanks to Mehdi Amini for the patch for that! I will write a
>>>>>> blog post on this very soon.
>>>>>> 
>>>>>> According to the proposed schedule, this week should have been spent
>>>>>> studying the related infrastructure in LLVM and finalizing an approach
>>>>>> for IPRA. Instead I was able to implement a working (though possibly
>>>>>> not fully correct) prototype, because I used the community bonding
>>>>>> period to discuss and learn related material from the mentors, and
>>>>>> because the patch for CodeGen reordering was provided by my mentor
>>>>>> Mehdi Amini.
>>>>>> 
>>>>>> So I summarize the work done during this week as follows:
>>>>>> Implementation:
>>>>>> ============
>>>>>> The following passes have been implemented during this week: an
>>>>>> immutable pass to store the computed RegMasks; a machine function pass
>>>>>> that iterates through each register, checks whether it is used, and
>>>>>> creates a RegMask based on that; and a target-specific machine
>>>>>> function pass that uses the RegMask created by the second pass and
>>>>>> propagates the information by updating the RegMask of call
>>>>>> instructions. To update the RegMask of an MI, a setRegMask() function
>>>>>> has been added to MachineOperand. A command line option -enable-ipra
>>>>>> and a debug type -debug-only="ipra" have been added to control the
>>>>>> optimization through llc.
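
The interplay of the three passes and the bottom-up code generation order can be sketched with a toy model. Everything here is an assumption made for illustration: `propagate_bottom_up`, the call graph, and the register names are hypothetical, the graph is assumed to be acyclic, and a function without a body (such as a library function) is simply given the conservative "clobbers everything" default.

```python
# Hypothetical toy call graph and register sets, for illustration only.
ALL_REGS = frozenset({"r0", "r1", "r2", "r3"})


def propagate_bottom_up(call_graph, own_defs):
    """Return {function: clobbered registers}, visiting callees first.

    call_graph: {function: list of callees}; assumed acyclic here.
    own_defs: {function: registers it writes directly}. A callee missing
        from own_defs (e.g. a library function) gets the conservative
        default of clobbering every register.
    """
    usage = {}  # plays the role of the immutable pass storing results

    def visit(fn):
        if fn in usage:
            return usage[fn]
        if fn not in own_defs:
            usage[fn] = ALL_REGS
            return usage[fn]
        clobbered = set(own_defs[fn])  # the "collector" step
        for callee in call_graph.get(fn, []):
            clobbered |= visit(callee)  # the "propagation" step
        usage[fn] = frozenset(clobbered)
        return usage[fn]

    for fn in call_graph:
        visit(fn)
    return usage


graph = {"main": ["helper", "printf"], "helper": ["leaf"], "leaf": []}
defs = {"main": {"r0"}, "helper": {"r1"}, "leaf": {"r2"}}
usage = propagate_bottom_up(graph, defs)
```

In this model a call to `helper` only needs to assume `r1` and `r2` are clobbered, whereas `main` inherits the conservative default through its call to the external `printf`.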
>>>>>> 
>>>>>> Testing:
>>>>>> =====
>>>>>> The above-mentioned implementation has been tested on the SNU-Real-Time
>>>>>> benchmark suite (http://www.cprover.org/goto-cc/examples/snu.html) and
>>>>>> on some simple programs that use library functions (register
>>>>>> allocation for a library function is not done by LLVM, so this
>>>>>> optimization simply skips them).
>>>>>> 
>>>>>> Study and Other:
>>>>>> =============
>>>>>> I have learned the following things in LLVM: how it stores register
>>>>>> clobbering information; how that is used by the register allocators
>>>>>> through LivePhysRegs, LiveRegMatrix and other related passes; how to
>>>>>> schedule a pass using TargetPassConfig and TargetMachine; what
>>>>>> callee-saved registers are; and what an Immutable Pass is. Apart from
>>>>>> that I have also learned how to use Phabricator to send review
>>>>>> requests. I have also read some related literature.
>>>>>> 
>>>>>> The tough task this week was scheduling the passes in the proper
>>>>>> order so that the dependencies of related passes are satisfied.
>>>>>> 
>>>>>> Plan for next week:
>>>>>> 1) Perform more testing and debug any known issues
>>>>>> 2) Fine-tune the implementation so as to eliminate any unnecessary work
>>>>>> 3) During testing I have observed from the stats that IPRA does not
>>>>>> always improve the work of the intraprocedural register allocators,
>>>>>> and that the amount of benefit (in terms of spilled live ranges) is
>>>>>> not deterministic. So I would like to find the reasons for this
>>>>>> behavior.
>>>>>> 4) Start implementing the target-specific pass for other targets if
>>>>>> the review passes properly with no major bugs.
>>>>>> 
>>>>>> Please provide any feedback or suggestions, including on the format of
>>>>>> this email.
>>>>>> 
>>>>>> I would also like to thank my mentors Mehdi Amini, Hal Finkel, Quentin
>>>>>> Colombet, Matthias Braun and other community members for providing
>>>>>> quick help every time I asked (I have got replies even after 8 PM
>>>>>> PDT!).
>>>>>> 
>>>>>> Sincerely,
>>>>>> Vivek
>>>>>> 
>>>>> 
>>>>> 
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> 
>>>> 
>>> 
>> 