[llvm-dev] IPRA, interprocedural register allocation, question

vivek pandya via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 8 21:26:08 PDT 2016


On Sat, Jul 9, 2016 at 8:15 AM, Lawrence, Peter <c_plawre at qca.qualcomm.com>
wrote:

> Vivek,
>
>            IIUC it seems that we need two pieces of information to do IPRA,
>
> 1. what registers the callee clobbers
>
> 2. what the callee does to the call-graph
>
Yes I think this is enough, but in your case we don't require #2

>
>
> And it is #2 that we are missing when we define an external function,
>
> Even when we declare it with a preserves or a regmask attribute,
>
>
>
This is because I think that once the effect of the attribute is available at
the IR/MI level, we can just parse it, populate the register usage information
vector for the declared function, and then propagate the reg mask to each call
site encountered.
But I am not sure whether it will be easy to get a new attribute working, or
whether we may need to hack clang for that too.
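
To make the idea concrete, here is a rough sketch in plain C (not LLVM code).
Everything in it is an assumption for illustration: the 16-bit mask with one
bit per x86-64 general-purpose register (bit set = preserved across the call),
the table declared_masks that would be filled from parsed attributes, the
helper call_site_mask, and the function name io_write. A callee without an
entry falls back to the callee-saved GPRs of the x86-64 System V ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register numbering: one bit per register, set = preserved. */
enum { RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI,
       R8, R9, R10, R11, R12, R13, R14, R15 };
#define BIT(r) ((uint16_t)(1u << (r)))

/* Callee-saved general-purpose registers of the x86-64 System V ABI. */
static const uint16_t DEFAULT_CC_MASK =
    BIT(RBX) | BIT(RBP) | BIT(R12) | BIT(R13) | BIT(R14) | BIT(R15);

struct declared_fn {
    const char *name;
    uint16_t preserved;
};

/* Entries as they might look after parsing an attribute (made-up data):
   io_write clobbers only its two argument registers. */
static const struct declared_fn declared_masks[] = {
    { "io_write", (uint16_t)~(BIT(RDI) | BIT(RSI)) },
};

/* Mask to attach to a call site: the declared mask if the callee has one,
   otherwise the default calling-convention mask. */
uint16_t call_site_mask(const char *callee) {
    size_t n = sizeof declared_masks / sizeof declared_masks[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(declared_masks[i].name, callee) == 0)
            return declared_masks[i].preserved;
    return DEFAULT_CC_MASK;
}
```

With this table, a call to io_write would let the caller keep a value in R11
across the call, while a call to an unknown function would not.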

I would also like to have thoughts from my mentors (Mehdi Amini and Hal
Finkel) about this.

> So what I / we need is another attribute that says this is a leaf function,
>
> At least in my case all I’m really concerned with are leaf functions
>
>
>
I am starting with a simple function declaration which has a custom
attribute.

-Vivek

>
>
> Thoughts ?
>
>
>
>
>
> --Peter Lawrence.
>
>
>
>
>
>
>
> *From:* vivek pandya [mailto:vivekvpandya at gmail.com]
> *Sent:* Friday, July 08, 2016 10:24 AM
> *To:* Lawrence, Peter <c_plawre at qca.qualcomm.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>; llvm-dev-request at lists.llvm.org
> *Subject:* Re: Re:[llvm-dev] IPRA, interprocedural register allocation,
> question
>
>
>
>
>
>
>
> On Fri, Jul 8, 2016 at 1:42 PM, vivek pandya <vivekvpandya at gmail.com>
> wrote:
>
>
>
>
>
> On Fri, Jul 8, 2016 at 9:47 AM, Lawrence, Peter <c_plawre at qca.qualcomm.com>
> wrote:
>
> Vivek,
>
>              I am looking into these function attributes in the clang docs
>
>                 Preserve_most
>
>                 Preserve_all
>
> They are not available in the 3.6.2 that I am currently using, but I hope
> they exist in 3.8
>
>
>
> These should provide enough info to solve my problem,
>
> at the MC level calls to functions with these attributes
>
> will be code-gen’ed through different “calling conventions”,
>
> and CALL instructions to them should have different register USE and DEF
> info,
>
>
>
> Yes, I believe that preserve_most or preserve_all should help you even
> without IPRA. But note that IPRA can help even further. For example, on X86
> the preserve_most CC does not preserve R11 (this can be verified from
> X86CallingConv.td and X86RegisterInfo.cpp). However, IPRA calculates the
> regmask based on actual register usage, so if a procedure with the
> preserve_most CC does not use R11, and no call site inside its body
> clobbers it, then IPRA will mark R11 as preserved. The RegMask that IPRA
> produces is also a superset of the RegMask due to the calling convention.
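
A small sketch of that computation in plain C, not the actual IPRA pass. The
names ipra_preserved_mask and PRESERVE_MOST_MASK, and the 16-bit mask with one
bit per x86-64 GPR (bit set = preserved), are assumptions for illustration.
The preserve_most mask is simplified to "every GPR except R11", as described
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register numbering: one bit per register, set = preserved. */
enum { RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI,
       R8, R9, R10, R11, R12, R13, R14, R15 };
#define BIT(r) ((uint16_t)(1u << (r)))

/* Simplified assumption: preserve_most keeps every GPR except R11. */
static const uint16_t PRESERVE_MOST_MASK = (uint16_t)~BIT(R11);

/* Preserved set for one function: whatever the calling convention
   guarantees, plus every register that neither the body nor any callee
   clobbers. The result always contains cc_preserved, i.e. it is a
   superset of the calling-convention mask. */
uint16_t ipra_preserved_mask(uint16_t cc_preserved,
                             uint16_t used_by_body,
                             const uint16_t *callee_preserved,
                             size_t num_callees) {
    uint16_t clobbered = used_by_body;
    for (size_t i = 0; i < num_callees; i++)
        clobbered |= (uint16_t)~callee_preserved[i];
    return (uint16_t)(cc_preserved | (uint16_t)~clobbered);
}
```

For a preserve_most leaf function that touches only RAX and RCX, the result
has the R11 bit set even though the calling convention alone would not
guarantee it.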
>
>
>
> I believe that __attribute__ ((registermask = ....))  can provide
> more flexibility compared to the preserve_all or preserve_most CC in some
> cases. So I believe that we should try it out.
>
>
>
> -Vivek
>
>
>
> This CALL instruction register USE and DEF info should already be useful
> to the intra-procedural register allocator (allowing values live across
> these calls to be in what are otherwise caller-save registers),
> at least that’s how I read the MC dumps, every call instruction seems to
> have every caller-save register flagged as “imp-def”, IE implicitly-defined
> by the instruction,
> and hopefully what is considered a caller-save register at a call-site is
> defined by the callee.
>
> And this should be the information that IPRA takes advantage of in its
> bottom-up analysis.
>
>
>
> Yes that is expected help from IPRA.
>
>
>
> Which leads me to this question, when compiling an entire whole program at
> one time,
>
> so there is no linking and no LTO, will there ever be IPRA that works
> within LLC for this scenario,
>
> and is this an objective of your project, or are you focusing only on LTO ?
>
> The current IPRA infrastructure works at compile time, so its scope of
> optimization is restricted to a compilation unit. IPRA can only
> construct correct register usage information if the procedure's code is
> generated by the same compiler instance, which means we can't optimize
> library calls or procedures defined in another module. This is because we
> can't keep register usage information across two different compiler
> instances.
>
>
>
> Now if we consider LTO, it eliminates the above limitation by making one
> large IR module from smaller modules before generating code, and thus we can
> have register usage information (at least) for procedures which were
> previously defined in other modules, because with LTO everything is in one
> module. This also clarifies that IPRA does not do anything at link time.
>
>
>
> Now coming to LLC, it can use IPRA and optimize for functions defined in
> the current module. So yes, while compiling a whole program (a single huge
> .bc file) IPRA can be used with LLC. Also note that if the software is
> written as separate files per module (which is very common) and you still
> want to maximize the benefits of IPRA, then you can use the llvm-link tool to
> combine several .bc files into one huge .bc file and use that with LLC
> to get maximum benefit.
>
>
>
> I know this is not the typical “linux” scenario (dynamic linking of not
> only standard libraries,
>
> but also sometimes even application libraries, and lots of static linking
> because of program
>
> size), but it is a typical “embedded” scenario, which is where I am
> currently.
>
>
>
> I don't understand this use case, but we can improve IPRA further. For
> example, if you have several libraries which are already compiled and
> code-generated, but you are able to provide register usage information for
> the functions of those libraries, then we can think about an approach where
> we store register usage information in a file (which will obviously
> increase compile time) and use that information across different compiler
> instances, so that we can provide register usage information without having
> the actual code while compiling.
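
A rough sketch of what such a file could look like, in plain C. Nothing like
this exists in IPRA today as far as this thread goes; the "name hex-mask" line
format and the helpers save_reg_usage and load_reg_usage are made up for
illustration, with the mask being a hypothetical 16-bit "preserved registers"
bitset.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical on-disk format: one "function-name hex-mask" pair per line.
   Returns 0 on success, -1 on a write error. */
int save_reg_usage(FILE *out, const char *fn, uint16_t preserved) {
    return fprintf(out, "%s %04x\n", fn, (unsigned)preserved) > 0 ? 0 : -1;
}

/* Scans the file from the start for fn. Returns 0 and stores the mask if
   an entry is found, -1 otherwise. Names are assumed to be shorter than
   128 characters and to contain no whitespace. */
int load_reg_usage(FILE *in, const char *fn, uint16_t *preserved) {
    char name[128];
    unsigned mask;
    rewind(in);
    while (fscanf(in, "%127s %x", name, &mask) == 2) {
        if (strcmp(name, fn) == 0) {
            *preserved = (uint16_t)mask;
            return 0;
        }
    }
    return -1;
}
```

One compiler instance would write an entry per function it generates code for,
and a later instance would look up the callee before falling back to the
calling-convention mask.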
>
>
>
> Other thoughts or comments ?
>
>
>
> I am looking for ideas that can improve the current IPRA. So if you think
> of anything relevant, please let me know; we can discuss and implement
> feasible ideas.
>
>
>
> Thanks,
>
> Vivek
>
>
>
> --Peter Lawrence.
>
>
>
>
>
> *From:* vivek pandya [mailto:vivekvpandya at gmail.com]
> *Sent:* Wednesday, July 06, 2016 2:09 PM
> *To:* llvm-dev <llvm-dev at lists.llvm.org>; llvm-dev-request at lists.llvm.org;
> Lawrence, Peter <c_plawre at qca.qualcomm.com>
> *Subject:* Re:[llvm-dev] IPRA, interprocedural register allocation,
> question
>
>
>
> Hello Peter,
>
>
>
> Thanks for pointing out this interesting case.
>
> Vivek,
>           I have an application where many of the leaf functions are
> hand-coded assembly language,  because they use special IO instructions
> that only the assembler knows about.  These functions typically don't
> use any registers besides the incoming argument registers, IE they don't
> need to use any additional callee-save nor caller-save registers.
>
> If the inline asm template specifies its clobber list properly, then IPRA
> is able to use that information and propagates the correct register mask
> (which also means that skipping the clobber list while IPRA is enabled may
> break the executable).
>
> For example, in the following code:
>
> int gcd( int a, int b ) {
>     int result ;
>     /* Compute Greatest Common Divisor using Euclid's Algorithm */
>     __asm__ __volatile__ ( "movl %1, %%r15d;"
>                           "movl %2, %%ecx;"
>                           "CONTD: cmpl $0, %%ecx;"
>                           "je DONE;"
>                           "xorl %%r13d, %%r13d;"
>                           "idivl %%ecx;"
>                           "movl %%ecx, %%r15d;"
>                           "movl %%r13d, %%ecx;"
>                           "jmp CONTD;"
>                           "DONE: movl %%r15d, %0;"
>                           : "=g" (result)
>                           : "g" (a), "g" (b)
>                           : "ecx" ,"r13", "r15"
>     );
>
>     return result ;
> }
>
> IPRA calculates and propagates the correct regmask, in which it marks CH, CL,
> ECX .. as clobbered. R13 and R15 are not marked as clobbered because they are
> callee-saved and the LLVM code generator also inserts spill/restore code for
> them.
>
>
>
> Is there any way in your IPRA interprocedural register allocation project
> that
> The user can supply this information for external functions ?
>
> By "external" do you mean a function defined in a module other than the one
> where it is used?  In that case, as IPRA can operate on only one module at a
> time, register usage propagation is not possible. But there is a workaround
> for this problem. You can use IPRA with link time optimization enabled:
> because of the way LLVM LTO works, it creates one big IR module out of the
> source files and then optimizes and codegens it, so in that case IPRA can
> have actual register usage info (if the function is compiled in the current
> module).
>
>
>
> In case you want to experiment with IPRA, please apply the patch at
> http://reviews.llvm.org/D21395 before you begin.
>
>
>
> -Vivek
>
>
>
> Perhaps using some form of __attribute__ ?
> Maybe __attribute__ ((registermask = ....))  ?
>
>
> --Peter Lawrence.
>
>
>
>
>