[llvm-dev] A thought to improve IPRA

Fri Jul 8 11:41:47 PDT 2016

On Fri, Jul 8, 2016 at 11:46 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Jul 8, 2016, at 11:12 AM, vivek pandya <vivekvpandya at gmail.com> wrote:
>
> Hello LLVM Developers,
>
> I have a thought on improving IPRA, and I would like to summarize the
> discussion on IRC about it so that we can develop the idea further if it
> really helps.
>
> The idea is to have more callee-saved registers in infrequently called
> leaf procedures, and thereby provide more registers to procedures in the
> upper region of the call graph. As pointed out by Quentin, this
> optimization may help in the context of "true" IPRA, but in our case we may
> not require it. I think, however, that it can improve performance in the
> current IPRA. I explain both arguments (Quentin's and mine) with the
> following example.
>
> Consider the call sequence A->B->C, where C is a rarely called leaf
> procedure, A is called frequently, and B may call C depending on some
> condition. While propagating actual register usage information from C up
> to A, we end up with most of the registers clobbered. In this case, as per
> Quentin's point, we do not hurt performance because we fall back to the
> CC, but I think we can improve performance as follows:
> Suppose we mark every register as preserved by C (i.e. accept more
> spills/reloads at procedure entry and exit in C) and this helps at A.
> Suppose A requires more distinct registers than the CC can provide, and
> will spill variables to memory if they are not provided. If we can provide
> more registers at A by having more spills at C, then we save spills at A.
> This can be beneficial because A is called frequently while C is called
> rarely, so the total number of spills/restores over the program's
> execution is reduced.
>
> However, the effect of this optimization will again be limited by the
> scope of the current IPRA (i.e. one Module only), because we can't really
> propagate the details about additional callee-saved registers to a caller
> defined in another module. Still, it may be helpful.
>
> Any thoughts on this?
>
>
>
> I think it is interesting, have you considered:
>
> - the code size impact? (C will have a lot of spills)
>
Yes, this needs to be addressed with some heuristics based on the call
frequency of C and the number of clobbers it has. Also, can we say that a
function which does not have any call instruction in its body will have
fewer clobbers?
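
To make the trade-off behind such a heuristic concrete, here is a rough
sketch in Python. It is only a toy cost model, not anything that exists in
LLVM: the function names, the call frequencies, and the spill counts are
all made-up assumptions for illustration.

```python
def dynamic_spill_cost(freq, spills):
    """Spill/reload pairs executed in total: the sum over all functions
    of (call frequency) * (spill/reload pairs per call)."""
    return sum(freq[f] * spills[f] for f in freq)


def worth_preserving(freq, before, after):
    """True if the 'after' spill assignment executes fewer spills overall."""
    return dynamic_spill_cost(freq, after) < dynamic_spill_cost(freq, before)


# Hypothetical profile: A is hot, C is a cold leaf.
freq = {"A": 1000, "C": 10}

# Baseline: C clobbers freely, so A spills 4 values on every call.
baseline = dynamic_spill_cost(freq, {"A": 4, "C": 0})

# Proposal: C saves/restores 6 registers, and A no longer needs to spill.
proposed = dynamic_spill_cost(freq, {"A": 0, "C": 6})
```

With these invented numbers the baseline executes 4000 spill/reload pairs
and the proposal 60. If C were the hot function, the comparison would go
the other way, which is why the call frequency has to be part of the
heuristic. Code size is not modelled here at all.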

> - what if C is cold but all (most) of its call sites are located in
> different modules?
>
Can we use Uses to get the number of call sites in the current module and
decide whether to optimize based on that? Again, some heuristics.
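
As a sketch of what I mean, in Python over a toy representation of a
module (a dict from function name to the list of functions it calls). This
is not LLVM API; the representation, the function names, and the threshold
`min_local_sites` are assumptions for illustration only.

```python
def local_call_sites(module, callee):
    """Count the call sites of `callee` inside one module.

    `module` maps each function name to the list of callees that appear
    in its body, one list entry per call site.
    """
    return sum(callees.count(callee) for callees in module.values())


def should_preserve_all(module, callee, min_local_sites=2):
    """Hypothetical heuristic: only make `callee` preserve extra registers
    when enough callers in this module can actually benefit from it."""
    return local_call_sites(module, callee) >= min_local_sites


# Toy module: A calls B once, B calls C at two different sites.
module = {"A": ["B"], "B": ["C", "C"], "C": []}
```

Here C has two local call sites and would qualify, while B has only one
and would not. Call sites in other modules are invisible to this count, so
callers there would still assume the normal CC.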

> - an alternative approach where we would break the CGSCC ordering to
> codegen B and A before C, so we would be able to spill minimally when
> performing the codegen for C?
>
Do you mean marking all registers as preserved for C during codegen for B,
and then, when we come to C (top-down), avoiding some spills if C can use
registers which are not actually used by B?
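
If that reading is right, the effect could be sketched like this in
Python. It is a toy model under my own assumptions (register names, a flat
register list, and a known set of registers live in B across the call),
not how the register allocator actually works.

```python
def registers_to_save_in_callee(live_across_call, callee_needs, all_regs):
    """Pick `callee_needs` registers for the callee, preferring registers
    the caller does not keep live across the call. Return the picked
    registers that the callee still has to save and restore."""
    free = [r for r in all_regs if r not in live_across_call]
    busy = [r for r in all_regs if r in live_across_call]
    chosen = (free + busy)[:callee_needs]
    return [r for r in chosen if r in live_across_call]


# Hypothetical target with four registers; B keeps r0 and r1 live
# across its call to C.
all_regs = ["r0", "r1", "r2", "r3"]
live_in_b = {"r0", "r1"}
```

If C needs two registers it can take r2 and r3 and save nothing; if it
needs three it has to save and restore only r0.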

Also, this can be applied to a function which is called less frequently
but is not a leaf function. It may help.

-Vivek

>
>
> —
> Mehdi
>
>