[llvm-dev] [GSoC 2016] [Weekly Status] Interprocedural Register Allocation

vivek pandya via llvm-dev llvm-dev at lists.llvm.org
Sun Jun 19 12:24:04 PDT 2016


Dear Professor,

Thanks for bringing this to my notice. I tried out a simple test case with
indirect function calls:

int foo() {
  return 12;
}

int bar(int a) {
  return foo() + a;
}

int (*fp)() = 0;
int (*fp1)(int) = 0;

int main() {
  fp = foo;
  fp();
  fp1 = bar;
  fp1(15);
  return 0;
}

Currently IPRA skips the optimization for indirect calls. But I think this
can be handled, and I will inform you if I update the implementation to
cover it. IPRA currently uses Function * to hold register usage information
across the passes, so my hunch is that if a Function * can be derived from
the call instruction of an indirect call, then it should be able to handle
indirect function calls to procedures defined in the current module.
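
The hunch above can be sketched in plain C as a toy model, not as LLVM's
actual API. Everything here is a hypothetical stand-in: FuncInfo plays the
role of the per-function register usage record, CallSite.callee is NULL when
no callee can be derived from the call instruction, and the mask is assumed
to use one bit per register, where a set bit means the register is preserved
across the call.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per register; a set bit means "preserved across the call". */
typedef uint32_t RegMask;

/* Hypothetical stand-in for the register usage record kept per function. */
typedef struct {
    const char *name;
    int has_usage_info;   /* nonzero once the function body has been compiled */
    RegMask preserved;    /* precise mask computed from the function body */
} FuncInfo;

/* Hypothetical call site: callee is NULL when the target cannot be resolved. */
typedef struct {
    const FuncInfo *callee;
} CallSite;

/* Use the precise mask when the callee is known and already analysed,
   otherwise fall back to the conservative calling-convention default. */
RegMask mask_for_call(const CallSite *cs, RegMask cc_default) {
    if (cs->callee != NULL && cs->callee->has_usage_info)
        return cs->callee->preserved;
    return cc_default;
}
```

In this model an indirect call only benefits if its target can be resolved to
a single function defined in the module; every other case keeps the default
mask, which matches the current behaviour of skipping the optimization.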

Sincerely,
Vivek

On Sun, Jun 19, 2016 at 4:59 PM, vivek pandya <vivekvpandya at gmail.com>
wrote:

> Dear Community,
>
> Please find a summary of the work done this week below:
>
> Implementation:
> ============
>
> This week we identified a bug in IPRA caused by not considering the
> RegMasks of function calls in a given machine function. The same bug was
> reported on AArch64 by Chad Rosier; a more detailed description can be
> found at https://llvm.org/bugs/show_bug.cgi?id=28144 . To fix it, the
> RegMask calculation has been modified to consider the RegMasks of function
> calls in a machine function. The patch is here: http://reviews.llvm.org/D21395.
>
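As a rough illustration of the fix described above (a toy model in C, not the
code from D21395): the mask a function exposes to its callers has to account
both for the registers its own body writes and for whatever the calls inside
that body may clobber. The function name and the one-bit-per-register
uint32_t layout are assumptions made for this sketch; a set bit means the
register is preserved.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t RegMask;  /* set bit = register preserved across the call */

/* written:    bit set for each register the function body itself modifies.
   call_masks: preserved-mask of every call instruction inside the body. */
RegMask compute_preserved(uint32_t written, const RegMask *call_masks,
                          size_t n_calls) {
    RegMask preserved = ~written;
    for (size_t i = 0; i < n_calls; ++i)
        preserved &= call_masks[i];  /* what a callee clobbers, we clobber too */
    return preserved;
}
```

Dropping the loop reproduces the bug in this model: a register clobbered only
by a callee would be reported to the caller as preserved.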
> AsmPrinter.cpp has been modified to print call-preserved registers in
> comments at each call site in the generated assembly file. Quentin Colombet
> suggested this to improve the readability of asm files while experimenting
> with RegMasks, calling conventions, etc. This simple patch can be found
> here: http://reviews.llvm.org/D21490.
>
> We have also experimented with a simple improvement to IPRA, setting the
> callee-saved registers to none for local functions, and we have found some
> performance improvement.
>
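A minimal sketch of what "callee-saved registers set to none" could mean for
a local function, assuming a one-bit-per-register set. The name regs_to_save
and its parameters are hypothetical and are not LLVM APIs. In this model a
function with external linkage saves every callee-saved register it uses,
while a local function saves nothing.

```c
#include <stdint.h>

typedef uint32_t RegSet;  /* one bit per register */

/* Registers the prologue/epilogue has to spill and restore.
   used:            registers modified by the function body
   cc_callee_saved: callee-saved set of the calling convention
   is_local:        nonzero for a function local to the module */
RegSet regs_to_save(RegSet used, RegSet cc_callee_saved, int is_local) {
    RegSet callee_saved = is_local ? 0u : cc_callee_saved;
    return used & callee_saved;
}
```

The trade-off in this model is that callers of a local function must then
treat every register it uses as clobbered, which is exactly the information
that the propagated RegMask is meant to carry.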
> Testing:
> ======
>
> After the fix for bug 28144 there are no runtime failures in the test
> suite. Before the fix, bug 28144 caused about 60 runtime failures, and the
> total test-suite compilation time was 30% higher than without IPRA. After
> the fix, total compile time with IPRA is about 4 to 8 minutes better than
> without IPRA.
>
>
> Study:
> =====
>
> This week I studied the code responsible for adding spills and restores for
> callee-saved registers. I also studied how calling conventions are defined
> in target-specific .td files. I studied AsmPrinter.cpp, specifically the
> emitComments() method, which is responsible for adding comments to
> LLVM-generated assembly files. I also studied some linkage types in LLVM IR,
> such as ‘internal’, which represents a function local to the module.
>
>
> Plan for next week:
>
> 1) Submit patch related to local function optimization for review
>
> 2) Find more possible improvements
>
> 3) Get active patches committed
>
> 4) Compile large software with IPRA enabled
>
>
> Sincerely,
>
> Vivek
>
> On Wed, Jun 15, 2016 at 8:45 AM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
>
>>
>>
>> On Wed, Jun 15, 2016 at 8:40 AM, vivek pandya <vivekvpandya at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Jun 15, 2016 at 6:16 AM, Quentin Colombet <qcolombet at apple.com>
>>> wrote:
>>>
>>>> Hi Vivek,
>>>>
>>>> How much of the slow down on runtime comes from the different layout of
>>>> the function in the asm file? (I.e., because of the dummy scc pass.)
>>>>
>>> Hello Quentin,
>>>
>>> Please disregard the previous results: there was a major bug in the
>>> RegMask calculation, which did not consider the RegMasks of callees in the
>>> MF body while calculating register usage information. That has been fixed
>>> now (as discussed with Matthias Braun and Mehdi Amini), and after this
>>> bugfix I have run the test-suite with and without IPRA. Yes, there is a
>>> runtime slowdown for some test cases, ranging from 1% to 64%; similarly,
>>> the compile-time slowdown ranges from 1% to 48%. The runtime performance
>>> improvement ranges from 1% to 35%, and surprisingly there is also a
>>> compile-time improvement ranging from 1% to 60%. I would request you to go
>>> through the complete results at
>>> https://docs.google.com/document/d/1cavn-POrZdhw-rrdPXV8mSvyppvOWs2rxmLgaOnd6KE/edit?usp=sharing
>>>
>>> In the above results the baseline is IPRA and current is without IPRA. So
>> data with a red background is actually an improvement and green is a
>> regression.
>> -Vivek
>>
>>> Also, there are no extra failures due to IPRA now, so I have removed the
>>> failures from the results above.
>>>
>>> Sincerely,
>>> Vivek
>>>
>>>
>>>> Cheers,
>>>> Q
>>>>
>>>> On June 11, 2016, at 21:49, vivek pandya via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Dear Community,
>>>>
>>>> The patch for Interprocedural Register Allocation has now been
>>>> committed, thanks to Mehdi Amini. We would like you to play with it and
>>>> let us know your views and, more importantly, ideas to improve it.
>>>>
>>>> The test-suite run has indicated a non-trivial issue that results in
>>>> runtime failures of the programs; we will be investigating it further.
>>>> Here are some stats:
>>>>
>>>> The test-suite has been tested with IPRA enabled and the overall results
>>>> are not very encouraging. On average there is a 30% increase in compile
>>>> time. Many programs also show an increase in execution time (20% on
>>>> average), which is a really serious concern for us. About 60 tests have
>>>> failed at run time, which indicates errors in compilation. However, 3
>>>> tests show an improvement in their runtime, 7% on average.
>>>>
>>>>
>>>> This week I think the good thing for me to learn was how to set up the
>>>> LLVM development environment properly; otherwise one can end up wasting
>>>> too much time building LLVM itself.
>>>>
>>>> So here is a brief summary:
>>>> Implementation:
>>>> ============
>>>>
>>>> The patch has been split into analysis and transformation passes. The
>>>> pass responsible for register usage propagation has been made target
>>>> independent. A print method and a command line option -print-regusage have
>>>> been added so that RegMask details can also be printed in Release builds;
>>>> this makes the lit test cases testable in Release builds too. Other minor
>>>> changes were made to adhere to coding and naming conventions.
>>>>
>>>>
>>>> Testing:
>>>> ======
>>>>
>>>> test-suite has been tested with IPRA enabled.
>>>>
>>>>
>>>> Study and other:
>>>> =============
>>>>
>>>> Learned about LNT, the test-suite for LLVM, inline assembly in LLVM IR,
>>>> fastcc, local functions and the MCStreamer class. In C++ I learned about
>>>> the emplace family of methods in the STL and perfect forwarding,
>>>> introduced in C++11.
>>>>
>>>>
>>>> Plan for next week:
>>>>
>>>> 1) Investigate the functional correctness issue that leads to run time
>>>> failures
>>>>
>>>> 2) Profile the compilation process to verify the increase in time due to
>>>> IPRA
>>>>
>>>> 3) Improve IPRA by instructing codegen not to save registers for local
>>>> functions.
>>>>
>>>> 4) Make the pass emit asm comments indicating the registers clobbered by
>>>> a function call at the call site in the generated asm file.
>>>>
>>>>
>>>> Sincerely,
>>>>
>>>> Vivek
>>>>
>>>> On Sun, Jun 5, 2016 at 8:48 AM, vivek pandya <vivekvpandya at gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Community,
>>>>>
>>>>> This week my patch was reviewed by my mentors and I made changes based on
>>>>> that. We also identified a problem with callee-saved registers being
>>>>> marked as clobbered registers, and we fixed that bug. I have described
>>>>> the other minor changes in the following section.
>>>>>
>>>>> The patch was expected to be committed by the end of this week, but due
>>>>> to an unexpected mistake I was not able to finish writing the test cases.
>>>>> Sorry for that.
>>>>> I had built LLVM with IPRA enabled by default, and those build files were
>>>>> on my path! Because of that, the next time I tried to build LLVM it was
>>>>> terribly slow (almost 1 hour for 10% of the build). I spent too much time
>>>>> fixing this by playing around with environment variables, cmake options,
>>>>> etc. But I think this is a serious concern: we need to verify this time
>>>>> complexity, otherwise building large software with IPRA enabled would be
>>>>> very time consuming.
>>>>>
>>>>> The toughest part of this week was getting lit and FileCheck to work as
>>>>> expected, especially when an analysis pass prints info to stdio and
>>>>> there is also an output file generated by the llc or opt command.
>>>>>
>>>>> So here is a brief summary:
>>>>>
>>>>> Implementation:
>>>>> ============
>>>>>
>>>>> RegUsageInfoCollector is now calling-convention aware, so the RegMask
>>>>> does not mark callee-saved registers as clobbered. Because of this, the
>>>>> register allocator can use callee-saved registers in the caller.
>>>>> PhysicalRegisterUsageInfo.cpp has been renamed to RegisterUsageInfo.cpp.
>>>>> The StringMap used in RegisterUsageInfo.cpp has been replaced by a
>>>>> DenseMap of GlobalVariable * to RegMask.
>>>>> DummyCGSCCPass has been moved from TargetPassConfig.cpp to
>>>>> CallGraphSCCPass.h.
>>>>> Minor corrections in comments and changes to adhere to the LLVM coding
>>>>> standards.
>>>>>
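The calling-convention awareness described above can be pictured with a small
C sketch. It is a toy model under stated assumptions: one bit per register, a
set bit in the result means preserved, and preserved_mask is a hypothetical
name rather than anything in RegUsageInfoCollector. A register the function
modifies is reported as clobbered only if the calling convention does not
already make the callee save and restore it.

```c
#include <stdint.h>

typedef uint32_t RegSet;  /* one bit per register */

/* modified:     registers written somewhere in the function body
   callee_saved: registers the calling convention makes the callee restore
   Returns the preserved-mask seen by callers (set bit = preserved). */
uint32_t preserved_mask(RegSet modified, RegSet callee_saved) {
    RegSet clobbered = modified & ~callee_saved;  /* restored regs do not count */
    return ~clobbered;
}
```

With modified = 0x0F and callee_saved = 0x0C, only the low two registers end
up marked as clobbered, so the caller's allocator remains free to keep values
in the two callee-saved ones across the call.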
>>>>> Testing:
>>>>> =====
>>>>>
>>>>> The above-mentioned changes have been tested with the SNU-Realtime
>>>>> benchmarks.
>>>>>
>>>>> Studied the lit and FileCheck tools and wrote a simple test to verify
>>>>> the functionality of the code.
>>>>>
>>>>>
>>>>> Study and other:
>>>>> ============
>>>>>
>>>>> Studied some examples of lit-compatible LLVM IR with comments to RUN
>>>>> test cases, the FileCheck tool syntax and how to use it within the lit
>>>>> infrastructure.
>>>>>
>>>>> I also understand the X86 calling convention in more detail.
>>>>>
>>>>> I also studied basic concepts in llvm IR language while reading .ll
>>>>> files written for lit.
>>>>>
>>>>> I learned about rvalue references and move semantics introduced in
>>>>> C++11.
>>>>>
>>>>>
>>>>> Plan for next week:
>>>>>
>>>>> 1) Get the patch committed along with proper test cases.
>>>>>
>>>>> 2) Analyse time complexity of the approach.
>>>>>
>>>>> 3) Move the target-specific pass to CodeGen, as it does not seem to need
>>>>> to be target specific.
>>>>>
>>>>> 4) If possible build a large application with IPRA enabled and analyze
>>>>> the impact.
>>>>>
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> Vivek
>>>>>
>>>>>
>>>>> On Sat, May 28, 2016 at 7:31 PM, vivek pandya <vivekvpandya at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear community,
>>>>>>
>>>>>> This is to brief you on the progress of Interprocedural Register
>>>>>> Allocation. Those who are interested in seeing the progress in terms of
>>>>>> code, please see http://reviews.llvm.org/D20769
>>>>>> This patch contains simple infrastructure to propagate register usage
>>>>>> information from callee to caller in the call graph. The code generation
>>>>>> order is changed to follow a bottom-up order on the call graph; thanks to
>>>>>> Mehdi Amini for the patch for that! I will write a blog post on this very
>>>>>> soon.
>>>>>>
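To make the bottom-up order on the call graph concrete, here is a small C
sketch that visits callees before callers with a depth-first post-order walk.
It is only an illustration: the adjacency-matrix CallGraph, MAX_FUNCS and the
function names are assumptions of this example, and it does not model how
LLVM groups mutually recursive functions into SCCs (inside a cycle the order
produced here is arbitrary).

```c
#include <stddef.h>

#define MAX_FUNCS 8

typedef struct {
    size_t n;                         /* number of functions in the module */
    int calls[MAX_FUNCS][MAX_FUNCS];  /* calls[a][b] != 0 means a calls b */
} CallGraph;

static void visit(const CallGraph *g, size_t f, int *seen,
                  size_t *order, size_t *count) {
    if (seen[f])
        return;
    seen[f] = 1;
    for (size_t callee = 0; callee < g->n; ++callee)
        if (g->calls[f][callee])
            visit(g, callee, seen, order, count);
    order[(*count)++] = f;  /* appended only after all of its callees */
}

/* Fills order[] with a callee-before-caller ordering; returns its length. */
size_t bottom_up_order(const CallGraph *g, size_t *order) {
    int seen[MAX_FUNCS] = {0};
    size_t count = 0;
    for (size_t f = 0; f < g->n; ++f)
        visit(g, f, seen, order, &count);
    return count;
}
```

Generating code in this order means that, when a caller is compiled, the
register usage of every (non-recursive) callee is already known and can be
attached to the call instruction.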
>>>>>> According to the proposed schedule, this week was meant for studying the
>>>>>> related infrastructure in LLVM and finalizing an approach for IPRA.
>>>>>> Instead, I was able to implement a working (though possibly not fully
>>>>>> correct) prototype, because I used the community bonding period to
>>>>>> discuss and learn the related material from the mentors, and because the
>>>>>> patch for CodeGen reordering was provided by my mentor Mehdi Amini.
>>>>>>
>>>>>> So I summarize the work done this week as follows:
>>>>>> Implementation:
>>>>>> ============
>>>>>> The following passes have been implemented this week: an immutable pass
>>>>>> to store the computed RegMasks; a machine function pass that iterates
>>>>>> through each register, checks whether it is used, and creates a RegMask
>>>>>> based on that; and a target-specific machine function pass that uses the
>>>>>> RegMask created by the second pass and propagates the information by
>>>>>> updating the RegMask of call instructions. To update the RegMask of an MI,
>>>>>> a setRegMask() function has been added to MachineOperand. A command line
>>>>>> option -enable-ipra and a debug type -debug-only="ipra" have been added
>>>>>> to control the optimization through llc.
>>>>>>
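The second pass described above (iterate over every register and build a mask
from whether it is used) can be sketched in C with the mask packed as an
array of 32-bit words, one bit per register, where a set bit means preserved.
The helper names and the used[] array are hypothetical and are not the
collector's actual interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold one bit per register. */
size_t regmask_words(size_t num_regs) {
    return (num_regs + 31) / 32;
}

/* Build a mask in which every unused register is marked as preserved.
   used[r] != 0 means register r is modified somewhere in the function. */
void build_regmask(const unsigned char *used, size_t num_regs,
                   uint32_t *mask) {
    size_t words = regmask_words(num_regs);
    for (size_t w = 0; w < words; ++w)
        mask[w] = 0;
    for (size_t r = 0; r < num_regs; ++r)
        if (!used[r])
            mask[r / 32] |= (uint32_t)1 << (r % 32);
}

/* Nonzero if register r is clobbered according to the mask. */
int is_clobbered(const uint32_t *mask, size_t r) {
    return (mask[r / 32] & ((uint32_t)1 << (r % 32))) == 0;
}
```

The third pass would then hand such a mask to each call of the function, so
that the caller's register allocator can keep values in registers the callee
never touches.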
>>>>>> Testing:
>>>>>> =====
>>>>>> The above-mentioned implementation has been tested on the SNU-Real-Time
>>>>>> benchmark suite (http://www.cprover.org/goto-cc/examples/snu.html)
>>>>>> and some simple programs that use library functions (for a library
>>>>>> function, register allocation is not done by LLVM, so this optimization
>>>>>> will simply skip them).
>>>>>>
>>>>>> Study and Other:
>>>>>> =============
>>>>>> I have learned the following things in LLVM: how it stores register
>>>>>> clobbering information; how that is used by register allocators through
>>>>>> LivePhysRegs, LiveRegMatrix and other related passes; how to schedule a
>>>>>> pass using TargetPassConfig and TargetMachine; what callee-saved
>>>>>> registers are; and what an immutable pass is. Apart from that, I have
>>>>>> also learned how to use Phabricator to send review requests. I have also
>>>>>> read some related literature.
>>>>>>
>>>>>> This week the tough task was to schedule the passes in the proper
>>>>>> order so that the dependencies of related passes are satisfied.
>>>>>>
>>>>>> Plan for next week:
>>>>>> 1) Perform more testing and debug any known issues
>>>>>> 2) Fine-tune the implementation so as to eliminate any unnecessary
>>>>>> work
>>>>>> 3) During testing I have observed from the stats that IPRA does not
>>>>>> always improve the work of the intraprocedural register allocators, and
>>>>>> it is also observed that the amount of benefit (in terms of spilled live
>>>>>> ranges) is not deterministic. So I would like to find the reasons for
>>>>>> this behavior.
>>>>>> 4) Start implementing the target-specific pass for other targets if the
>>>>>> review passes properly with no major bugs.
>>>>>>
>>>>>> Please provide any feedback/suggestions, including on the format of this
>>>>>> email.
>>>>>>
>>>>>> I would also like to thank my mentors Mehdi Amini, Hal Finkel, Quentin
>>>>>> Colombet, Matthias Braun and other community members for providing
>>>>>> quick help every time I asked (I have got replies even after 8 PM
>>>>>> PDT!).
>>>>>>
>>>>>> Sincerely,
>>>>>> Vivek
>>>>>>
>>>>>
>>>>>