[llvm-dev] [RFC] Value Range Based Optimization Opportunity in LLVM

Wed Sep 6 08:43:01 PDT 2017

Hi Daniel,

Thank you very much for you valuable feedback.  We are looking into the 
range analysis you mentioned. Once we have a better understand about it, 
we can discuss more in the follow-up emails.

Regards, 

Tony Jiang, M.Sc.
LLVM PPC Backend Development
IBM Toronto Lab, C2/712/8200/MKM
8200 Warden Ave, Markham, L6G 1C7
Email: jtony at ca.ibm.com
Phone: 905-413-3676

From:   Daniel Berlin <dberlin at dberlin.org>
To:     Hal Finkel <hfinkel at anl.gov>
Cc:     Tony Jiang <jtony at ca.ibm.com>, llvm-dev <llvm-dev at lists.llvm.org>
Date:   08/31/2017 10:08 PM
Subject:        Re: [llvm-dev] [RFC] Value Range Based Optimization 
Opportunity in LLVM

NewGVN does some of it.
It will discover if backpropagation through a phi enables an operation to 
be put in terms of existing expressions or constants.
It does not track ranges though, only explicit values.

In your ORIG block, if LENGTH can be expressed as a merge of 
 already-computed expressions or constants in the program, it will 
discover it.

You can stare at examples of this (and compute quite complicated ones if 
you wish) at test/Transforms/NewGVN/completeness.ll

Both your examples already have a phi in it in llvm ir, so that is just a 
matter of the proper range propagation.
It is really:

int test()
{
 int Ret0 = 0;
  if (!Ptr)
     Ret1= calli(a)
  RetPhi = PHI(Ret0, Ret1)
  return RetPhi;
}

It should already be possible to determine the value of Ret0 to be 0 based 
on ranges without duplicating anything, and the extra return doesn't give 
you anything. 

If you want to do it based on ranges:  a combination of a value range 
computation using the existing sparse propagation engine, and a transform 
at phi nodes like we use for NewGVN, should be able to handle what you 
want here.    The existing lazy value info may not get an optimal answer 
for various reasons (it is trying to propagate in the wrong direction, so 
it loses some context)

You could also use the same basic infrastructure that predicateinfo uses, 
and rename info at points the ranges could have changed to get additional 
precision.
 (unlike predicateinfo, which renames when there are branches). 

There are quite a number of papers on making SSA forms that are effective 
at range propagation. PredicateInfo is building pruned e-ssa, which is the 
basis of a number of range analysis (see 
http://homepages.dcc.ufmg.br/~fernando/publications/papers/CGO13_raphael.pdf 
and the slides at 
http://laure.gonnord.org/pro/research/ER03_2015/RangeAnalysis.pdf).

The only case you should ever need to duplicate code is when the 
expression is something not already computed in the program.  That's not 
the case for your first example, it can be done without code duplication.  
For your second example, it would require insertion, but you could know 
ahead of time which computations you actually need to insert.

On Thu, Aug 31, 2017 at 6:22 PM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

On 08/31/2017 03:54 PM, Tony Jiang via llvm-dev wrote:
Hi All,

We have recently found some optimization opportunities created by 
replicating code into branches in order to enable optimization. In 
general, the optimization opportunity we are pursuing is like the 
following.

Given pseudo-code:

// block A
if (some condition)
  // block B
// block C

If it can be efficiently proven that some portion of block C can be 
simplified had control flow not gone through the if statement, it might be 
useful to convert this CFG into a diamond and hoist that portion of block 
C into both block B and the new block.

Consider the following example:

int test(int *Ptr, int a, int b, int c, int d) { 
  int Ret = 0;
  if (__builtin_expect(!Ptr, 0)) {
    Ret = calli(a);
    // return Ret / (a|1) / (b|1) / (c|1) / (d|1); // Copy return to here
  }
  return Ret / (a|1) / (b|1) / (c|1) / (d|1); // This can be simplified to 
return 0
} 

In this case, replicating the return statement in the branch allows the 
optimizer to prove the value of Ret at the end of the function is 0 and 
eliminate the arithmetic calculations.

A second example: 

unsigned long funcReturningArbitraryi64(unsigned char *p);
#define LENGTH(uv)  ( (uv) < 0x80             ? 1 :  \
                      (uv) < 0x800            ? 2 :  \
                      (uv) < 0x10000          ? 3 :  \
                      (uv) < 0x200000         ? 4 :  \
                      (uv) < 0x4000000        ? 5 :  \
                      (uv) < 0x80000000       ? 6 : 7 )

int func(unsigned char *p, bool flag)
{
  unsigned long c = *p;
  int len;
  // ...
#ifdef _ORIG
  if(flag) {
    // ...
    c = funcReturningArbitraryi64(p);
  }
len = LENGTH(c);
#else
  if(flag) {
    // ...
    c = funcReturningArbitraryi64(p);
    len = LENGTH(c);
  } else {
    len = LENGTH(c);
  }
#endif

  // ...

  return len;
}

In this case, we see that creating an else block and replicating the 
return statement in both the if and else branch (like the code snippet 
guarded by the #else) enables the macro UNISKIP in the else branch to be 
optimized.

Most of the examples we have come up with so far are centered around value 
ranges along the conditional branches. When the range of values a symbol 
can have along different branches is provably different, opportunities for 
optimization may arise. However, this likely isn't the only category of 
optimizations that could benefit from this.

Is there an existing transformation in LLVM that should be doing this 
already that is missing this opportunity? If not, we would like to pursue 
adding this. Of course, this optimization would be subject to a cost model 
as it may result in code growth. For example, it may not be advantageous 
to do this if the simplified branch is cold. If anyone has any 
comments/suggestions we are very much interested in hearing them.

We have two transformations that track ranges along CFG edges using 
LazyValueInfo: JumpThreading and CorrelatedValuePropagation. I don't think 
that either does what you're proposing.

 -Hal

Regards, 

Tony Jiang, M.Sc.
LLVM PPC Backend Development
IBM Toronto Lab, C2/712/8200/MKM
8200 Warden Ave, Markham, L6G 1C7
Email: jtony at ca.ibm.com
Phone: 905-413-3676

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170906/039e4cfe/attachment.html>