[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation

Tue Jun 14 20:15:46 PDT 2016

On Wed, Jun 15, 2016 at 8:40 AM, vivek pandya <vivekvpandya at gmail.com>
wrote:

>
>
> On Wed, Jun 15, 2016 at 6:16 AM, Quentin Colombet <qcolombet at apple.com>
> wrote:
>
>> Hi Vivek,
>>
>> How much of the runtime slowdown comes from the different layout of
>> the function in the asm file? (I.e., because of the dummy SCC pass.)
>>
> Hello Quentin,
>
> Please disregard the previous results: there was a major bug in the
> RegMask calculation. The RegMasks of callees in the MF body were not
> considered while calculating register usage information. That has been
> fixed now (as discussed with Matthias Braun and Mehdi Amini), and after
> this bug fix I have run the test-suite with and without IPRA. Yes, there is
> a runtime slowdown for some test cases, ranging from 1% to 64%; similarly,
> the compile time slowdown ranges from 1% to 48%. The runtime performance
> improvement ranges from 1% to 35%, and surprisingly there is also a compile
> time improvement ranging from 1% to 60%. I would request you to go through
> the complete results at
> https://docs.google.com/document/d/1cavn-POrZdhw-rrdPXV8mSvyppvOWs2rxmLgaOnd6KE/edit?usp=sharing
>
In the above results, the baseline is IPRA and current is without IPRA. So
data with a red background is actually an improvement and green is a
regression.
-Vivek

> Also, there are no extra failures due to IPRA now, so I have removed
> failures from the results above.
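
The RegMask fix described earlier in this message can be illustrated with a
minimal sketch. It is not the code from the patch: the 16-register file, the
type name PreservedMask and the function combineWithCallees are all
hypothetical. It assumes a mask in which a set bit means "preserved across the
call", so a register survives a call to a function only if both the function's
own body and every call inside that body preserve it.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical register file size, chosen only for illustration.
constexpr std::size_t kRegCount = 16;

// One bit per physical register; a set bit means the register is preserved.
using PreservedMask = std::bitset<kRegCount>;

// Own: registers preserved when looking only at the function's own
// instructions. CalleeMasks: masks attached to the calls in its body.
PreservedMask combineWithCallees(const PreservedMask &Own,
                                 const std::vector<PreservedMask> &CalleeMasks) {
  PreservedMask Result = Own;
  for (const PreservedMask &M : CalleeMasks)
    Result &= M; // anything a callee clobbers is clobbered by the caller too
  // Sanity check: combining can only remove registers from the preserved set.
  assert((Result | Own) == Own);
  return Result;
}
```

Ignoring the callee masks would leave Result equal to Own, which is the kind of
over-optimistic mask the bug description refers to.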
>
> Sincerely,
> Vivek
>
>
>> Cheers,
>> Q
>>
>> On June 11, 2016, at 21:49, vivek pandya via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Dear Community,
>>
>> The patch for Interprocedural Register Allocation has been committed now,
>> thanks to Mehdi Amini for that. We would like you to play with it and let
>> us know your views and, more importantly, ideas to improve it.
>>
>> The test-suite run has indicated a non-trivial issue that results in
>> run time failures of the programs; we will be investigating it further.
>> Here are some stats:
>>
>> The test-suite has been tested with IPRA enabled, and the overall results
>> are not very encouraging. On average there is a 30% increase in compile
>> time. Many programs also show an increase in execution time (20% on
>> average), which is a really serious concern for us. About 60 tests have
>> failed at run time, which indicates an error in compilation. However, 3
>> tests have improved in runtime, by 7% on average.
>>
>>
>> This week, I think a good thing for me to learn was to set up the LLVM
>> development environment properly; otherwise one can end up wasting too
>> much time building LLVM itself.
>>
>> So here is a brief summary:
>> Implementation:
>> ============
>>
>> The patch has been split into analysis and transformation passes. The
>> pass responsible for register usage propagation has been made target
>> independent. A print method and a command line option, -print-regusage,
>> have been added so that RegMask details can be printed in Release builds
>> also; this enables the lit test cases to be run in Release builds too.
>> Other minor changes were made to adhere to coding and naming conventions.
>>
>>
>> Testing:
>>
>> ======
>>
>> test-suite has been tested with IPRA enabled.
>>
>>
>> Study and other:
>>
>> =============
>>
>> Learned about LNT, the test-suite for LLVM, inline assembly in LLVM IR,
>> fastcc, local functions, and the MCStream class. In C++ I learned about
>> the emplace family of methods in the STL and perfect forwarding introduced
>> in C++11.
>>
>>
>> Plan for next week:
>>
>> 1) Investigate the functional correctness issue that leads to run
>> time failures
>>
>> 2) Profile the compilation process to verify the increase in time due to
>> IPRA
>>
>> 3) Improve IPRA by instructing codegen not to save registers for local
>> functions.
>>
>> 4) Make the pass emit asm comments to indicate the registers clobbered by
>> a function call at the call site in the generated asm file.
>>
>>
>> Sincerely,
>>
>> Vivek
>>
>> On Sun, Jun 5, 2016 at 8:48 AM, vivek pandya <vivekvpandya at gmail.com>
>> wrote:
>>
>>> Dear Community,
>>>
>>> This week I got my patch reviewed by my mentors and made changes based
>>> on that. We also identified a problem with callee-saved registers
>>> being marked as clobbered registers, so we fixed that bug. I have
>>> described other minor changes in the following section.
>>>
>>> The patch was expected to be committed by the end of this week, but due
>>> to an unexpected mistake I was not able to finish writing test cases.
>>> Sorry for that.
>>> I had built LLVM with IPRA enabled by default, and those build files were
>>> on my path! Because of that, the next time I tried to build LLVM it was
>>> terribly slow (almost 1 hour for 10% of the build). I spent too much time
>>> fixing this by playing around with environment variables, cmake options,
>>> etc.
>>> But I think this is a serious concern; we need to verify this time
>>> complexity, otherwise building large software with IPRA enabled would be
>>> very time consuming.
>>>
>>> The toughest part of this week was getting lit and FileCheck to work as
>>> expected, especially when an analysis pass prints info on stdio and
>>> there is also an output file generated by the llc or opt command.
>>>
>>> So here is a brief summary:
>>>
>>> Implementation:
>>> ============
>>>
>>> RegUsageInfoCollector is now calling convention aware, so that the
>>> RegMask does not mark callee-saved registers as clobbered. Due to this,
>>> the register allocator can use callee-saved registers for the caller.
>>> PhysicalRegisterUsageInfo.cpp renamed to RegisterUsageInfo.cpp.
>>> StringMap used in RegisterUsageInfo.cpp is replaced by DenseMap of
>>> GlobalVariable * to RegMask.
>>> DummyCGSCCPass moved from TargetPassConfig.cpp to CallGraphSCCPass.h.
>>> Minor corrections in comments, changes to adhere to the LLVM coding
>>> standards.
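
The calling-convention-aware behaviour described above might be sketched as
follows. This is an illustration under stated assumptions, not the
RegUsageInfoCollector code: the 16-register file, the RegSet type and the
preservedRegs function are hypothetical. The idea is that a register used in
the body still counts as preserved when the calling convention requires the
callee to save and restore it.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical register file size, chosen only for illustration.
constexpr std::size_t kNumRegs = 16;

using RegSet = std::bitset<kNumRegs>;

// Returns the set of registers a caller may assume are preserved.
// UsedInBody: register numbers written by the function body.
// CalleeSaved: registers the calling convention makes the callee restore.
RegSet preservedRegs(const std::vector<unsigned> &UsedInBody,
                     const RegSet &CalleeSaved) {
  RegSet Clobbered;
  for (unsigned Reg : UsedInBody) {
    assert(Reg < kNumRegs && "register number out of range");
    // A callee-saved register is restored before return, so using it in
    // the body does not clobber it from the caller's point of view.
    if (!CalleeSaved.test(Reg))
      Clobbered.set(Reg);
  }
  return ~Clobbered;
}
```

With register 3 callee-saved and the body using registers 1 and 3, only
register 1 ends up marked as clobbered.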
>>>
>>> Testing:
>>> =====
>>>
>>> The above mentioned changes have been tested with the SNU-Realtime
>>> benchmarks.
>>>
>>> Studied the lit and FileCheck tools and wrote a simple test to verify
>>> the functionality of the code.
>>>
>>>
>>> Study and other:
>>>
>>> ============
>>>
>>> Studied some examples of lit-compatible LLVM IR with RUN comments for
>>> test cases, the FileCheck tool syntax, and how to use it within the lit
>>> infrastructure.
>>>
>>> I also understand the X86 calling convention in more detail.
>>>
>>> I also studied basic concepts of the LLVM IR language while reading .ll
>>> files written for lit.
>>>
>>> I learned about rvalue references and move semantics introduced in C++11.
>>>
>>>
>>> Plan for next week:
>>>
>>> 1) Get the patch committed along with proper test cases.
>>>
>>> 2) Analyse the time complexity of the approach.
>>>
>>> 3) Move the target specific pass to CodeGen, as it seems it is not
>>> required to be target specific.
>>>
>>> 4) If possible, build a large application with IPRA enabled and analyze
>>> the impact.
>>>
>>>
>>> Sincerely,
>>>
>>> Vivek
>>>
>>>
>>> On Sat, May 28, 2016 at 7:31 PM, vivek pandya <vivekvpandya at gmail.com>
>>> wrote:
>>>
>>>> Dear community,
>>>>
>>>> This is to brief you on the progress of Interprocedural Register
>>>> Allocation. For those who are interested in seeing the progress in
>>>> terms of code, please consider http://reviews.llvm.org/D20769
>>>> This patch contains simple infrastructure to propagate register usage
>>>> information from callee to caller in the call graph. The code
>>>> generation order is changed to follow bottom-up order on the call
>>>> graph. Thanks to Mehdi Amini for the patch for the same! I will write
>>>> a blog post on this very soon.
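
The bottom-up code generation order mentioned above can be pictured as a
post-order walk of the call graph, so that every callee is processed before
its callers. The sketch below is a simplification, not the committed patch: it
assumes a graph without recursion (the real ordering works on SCCs of the call
graph), and the CallGraph representation and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Callees[F] lists the indices of the functions called by function F.
using CallGraph = std::vector<std::vector<std::size_t>>;

// Depth-first visit that appends F only after everything it calls.
static void visit(const CallGraph &Callees, std::size_t F,
                  std::vector<bool> &Seen, std::vector<std::size_t> &Order) {
  if (Seen[F])
    return;
  Seen[F] = true;
  for (std::size_t Callee : Callees[F]) {
    assert(Callee < Callees.size() && "call edge to unknown function");
    visit(Callees, Callee, Seen, Order);
  }
  Order.push_back(F);
}

// Returns function indices with callees placed before their callers.
std::vector<std::size_t> bottomUpOrder(const CallGraph &Callees) {
  std::vector<bool> Seen(Callees.size(), false);
  std::vector<std::size_t> Order;
  for (std::size_t F = 0; F < Callees.size(); ++F)
    visit(Callees, F, Seen, Order);
  return Order;
}
```

Processing functions in this order means the register usage of a callee is
already known when its caller is compiled.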
>>>>
>>>> According to the proposed schedule, this week should have been spent
>>>> studying related infrastructure in LLVM and finalizing an approach for
>>>> IPRA. Instead, I was able to implement a working (possibly not fully
>>>> correct) prototype, because I used the community bonding period to
>>>> discuss and learn related material from the mentors, and because the
>>>> patch for CodeGen reordering was provided by my mentor Mehdi Amini.
>>>>
>>>> So I summarize the work done during this week as follows:
>>>> Implementation:
>>>> ============
>>>> The following passes have been implemented during this week: an
>>>> immutable pass to store the computed RegMask; a machine function pass
>>>> that iterates through each register, checks whether it is used, and
>>>> based on that creates a RegMask; and a target specific machine function
>>>> pass that uses the RegMask created by the second pass and propagates
>>>> the information by updating the RegMask of call instructions. To update
>>>> the RegMask of an MI, a setRegMask() function has been added to
>>>> MachineOperand. A command line option -enable-ipra and a debug type
>>>> -debug-only="ipra" have been added to control the optimization through
>>>> llc.
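
As a rough picture of the second pass described above, a register mask can be
built by starting with every register marked preserved and clearing the bit of
each register the function uses. The sketch is standalone and does not use the
LLVM API: the 64-register file and the name buildRegMask are hypothetical. The
bit layout (32-bit words, a set bit meaning "preserved") and the check in
clobbersPhysReg are modeled loosely on LLVM's register mask operands.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the number of physical registers on a target.
constexpr unsigned NumRegs = 64;

// One bit per physical register, packed into 32-bit words.
// A set bit means the register is preserved across the call.
using RegMask = std::vector<uint32_t>;

// Build the mask of a function from the registers its body uses.
RegMask buildRegMask(const std::vector<unsigned> &UsedRegs) {
  RegMask Mask((NumRegs + 31) / 32, ~0u); // start with everything preserved
  for (unsigned Reg : UsedRegs) {
    assert(Reg < NumRegs && "register number out of range");
    Mask[Reg / 32] &= ~(1u << (Reg % 32)); // cleared bit: clobbered
  }
  return Mask;
}

// True if a call carrying this mask may overwrite Reg.
bool clobbersPhysReg(const RegMask &Mask, unsigned Reg) {
  assert(Reg < NumRegs && "register number out of range");
  return !(Mask[Reg / 32] & (1u << (Reg % 32)));
}
```

A caller that sees such a mask on a call instruction can keep values live in
any register for which clobbersPhysReg returns false.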
>>>>
>>>> Testing:
>>>> =====
>>>> The above mentioned implementation has been tested on the SNU-Real-Time
>>>> benchmark suite (http://www.cprover.org/goto-cc/examples/snu.html) and
>>>> some simple programs that use library functions (for a library
>>>> function, register allocation is not done by LLVM, so this optimization
>>>> will simply skip them).
>>>>
>>>> Study and Other:
>>>> =============
>>>> I have learned the following things in LLVM: how it stores register
>>>> clobbering information; how that is used by register allocators through
>>>> LivePhysRegs, LiveRegMatrix and other related passes; how to schedule a
>>>> pass using TargetPassConfig and TargetMachine; what callee-saved
>>>> registers are; and what an Immutable Pass is. Apart from that, I have
>>>> also learned how to use Phabricator to send a review request. I have
>>>> also read some related literature.
>>>>
>>>> During this week the tough task was to schedule the passes in the
>>>> proper order so that the dependencies of related passes are satisfied.
>>>>
>>>> Plan for next week:
>>>> 1) Perform more testing and debug any known issues
>>>> 2) Fine tune the implementation so as to eliminate any unnecessary work
>>>> 3) During testing I have observed from the stats that IPRA does not
>>>> always improve the work of intraprocedural register allocators, and it
>>>> is also observed that the amount of benefit (in terms of spilled live
>>>> ranges) is not deterministic. So I would like to find the reasons for
>>>> this behavior.
>>>> 4) Start implementing the target specific pass for other targets if the
>>>> review passes properly with no major bugs.
>>>>
>>>> Please provide any feedback/suggestions, including on the format of
>>>> this email.
>>>>
>>>> I would also like to thank my mentors Mehdi Amini, Hal Finkel, Quentin
>>>> Colombet, Matthias Braun and other community members for providing
>>>> quick help every time I asked (I have got replies even after 8 PM
>>>> PDT!).
>>>>
>>>> Sincerely,
>>>> Vivek
>>>>
>>>
>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>