[LLVMdev] Separating loop nests based on profile information?

Mon Jan 19 21:19:43 PST 2015

On Mon, Jan 19, 2015 at 5:55 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
> On 01/12/2015 08:28 PM, Xinliang David Li wrote:
>
>
>
> On Thu, Jan 8, 2015 at 12:33 PM, Philip Reames <listmail at philipreames.com>
> wrote:
>
>>
>> Let's start with a toy example:
>> while(c) {
>>   x = this->x;
>>   y = this->y;
>>   if (x == y) {
>>     rare();
>>   }
>> }
>>
>>
>  With profile info, speculative PRE can be performed for memory access
> that are not downsafe:
>
> Could you define the term "downsafe"?  I think I know what you mean, but
> just to be clear.
>

see the SSAPRE paper:
http://www.cse.unsw.edu.au/~cs4133/Papers/ssapre.toplas97.pdf

>
>  temp_c = c;
> if (temp_c)
> {
>     x = this->x;
>     y = this->y;
> }
> while (temp_c) {
>    if (x == y)
>      {
>           rare();
>           x = this->x;
>           y = this->y;
>       }
>    temp_c = c;
> }
>
>  If you can prove this->x etc are safe control speculative (e.g, seen a
> dominating memory access with the same base and offset), it can be
>
>  x = this->x;
> y = this->y;
> while (c) {
>    if (x == y) {
>       rare();
>       x = this->x;
>       y = this->y;
>     }
>  }
>
> Yep.  In LLVM, this basically requires that this->FIELD be known
> deferenceable.  I filled one simple bug for this here:
> http://llvm.org/bugs/show_bug.cgi?id=22266
>

What you proposed in the bug is basically profitability heuristics for
speculative PRE.

>
>
> I've also looked a more general rewrite of our PRE algorithm when a
> pointer is known dereferenceable, but I haven't figured out to judge
> profitability accurately (yet).  The general approach was to just take the
> cut provided by MDA, apply "obvious" improvements (i.e. local merge
> points), the insert loads in all the unavailable blocks if it looked
> profitable.
>

See also paper "*Register promotion by sparse partial redundancy
elimination of loads and stores"  for a simple way of evaluating
speculation savings.*

>
> The alternate approach did have the appeal of being "easy" in cases where
> our current approach (work backwards from load) is "hard".
>
> One challenge is in making sure the two algorithms together generate a
> stable final form.  :)
>
> That last part is where I stalled.  I'm trying to see what else I can do
> with simpler things before returning to that approach.
>
> Nick Lewicky also pointed out that PHITranslation is problematic in the
> current code.  Oddly, I've never seen that in practice.  We clearly have
> different workloads, but I haven't figured out which are the important
> different properties yet.
>
>
>
>
>
>
>> If we'd split this into a loop nest structure, we'd still have the chain,
>> but a) that's easier to control with a custom pass order since LICM and
>> LoopUnswitch are far cheaper than GVN/PRE and b) having LICM do a bit of
>> trivial loop unswitch for the terminator of the header appears quite
>> tractable.
>>
>
>
>  if (c)
> {
>    x = this->x;
>    if (!x) return;
>    y = this->y;
>    if (!y) return;
> }
> while (c) {
>   if (x == y) {
>       rare();
>       if (!x) return;
>       if (!y) return;
>    }
> }
>
>  The 'branch PRE' (hoisting, sinking, assertion prop and branch
> elimination) part is a little tricky to do though.
>
> I think "a little tricky" might be an understatement here.
>

agreed.

>
> At least to start with, I'd probably try to handle simple cases.  Even
> just doing trivial loop unswitch in the loop with LoadPRE would unlock a
> lot.  (Right now, I'm just running LICM and Unswitch in a loop.  It's not
> that expensive, gets a lot of cases, but doesn't get everything.)
>

Have fun with it :)

David

>
>
> Philip
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150119/7de2ba89/attachment.html>