[llvm-dev] [SCEV] getMulExpr could be extremely slow when creating SCEVs for a long chain of add/mul instructions

Kevin Choi via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 3 12:16:24 PDT 2016


Correct me if I'm wrong, is the code trying to expand geometric series into
polynomial expression? (that doesn't sound like a very good thing to do, it
would spend so much time computing coefficients)

Wouldn't this be better to transform as below:
mult *= n++   // 30 times

; into
mult = mult (n)...(n+29)
       = mult 1...(n+29)/((1)...(n-1))
       = mult * geo_sum(n+29) / geo_sum(n-1)  ; is this more expensive than
leaving as geo series?

Regards,
Kevin

On 3 August 2016 at 11:46, Huang, Li1 via llvm-dev <llvm-dev at lists.llvm.org>
wrote:

> Hi,
>
>
>
> I’m working on a slow-compile problem caused by SCEV (PR28830), and I need
> your suggestions on how to fix it. The loop below causes
> ScalarEvolution::getMulExpr to hang.
>
>
>
> int get(unsigned n)
>
> {
>
>     unsigned i, j, mult = 1;
>
>     for (i = 0; i < 1; i++) {
>
>         for (j = 0; j < 30; j++) {
>
>             mult *= n++;
>
>         }
>
>     }
>
>     return mult;
>
> }
>
>
>
> the inner loop is completed unrolled (by 30) to a long chain of “add” and
> “mult” instructions, then indvars follows loop-unroll and triggers SCEV
> creations for this long instruction chain. When it comes to getMulExpr, it
> tries to fold multiplications of AddRecs of the same induction variable
> recursively, this could be very slow if the expression tree is deep.
>
>
>
> Code responsible for this problem starts here (in ScalarEvolution.cpp):
>
>
>
>    // Okay, if there weren't any loop invariants to be folded, check to
> see if
>
>     // there are multiple AddRec's with the same loop induction variable
> being
>
>     // multiplied together.  If so, we can fold them.
>
>
>
>     // {A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>
>
>     // = {x=1 in [ sum y=x..2x [ sum z=max(y-x, y-n)..min(x,n) [
>
>     //       choose(x, 2x)*choose(2x-y, x-z)*A_{y-z}*B_z
>
>     //   ]]],+,...up to x=2n}.
>
>     // Note that the arguments to choose() are always integers with values
>
>     // known at compile time, never SCEV objects.
>
>     //
>
>     // The implementation avoids pointless extra computations when the two
>
>     // addrec's are of different length (mathematically, it's equivalent to
>
>     // an infinite stream of zeros on the right).
>
>     bool OpsModified = false;
>
>     for (unsigned OtherIdx = Idx+1;
>
>          OtherIdx != Ops.size() && isa<SCEVAddRecExpr>(Ops[OtherIdx]);
>
> ...
>
> ...
>
>
>
> One way I thought about to fix this is to have a threshold for the number
> of terms in each AddRec when folding them. For example, we only fold
> AddRec1 * AddRec2 when  (#terms in AddRec1 + #terms in AddRec2) <
> threshold. I tried setting the threshold to 8, and it stops early for this
> case. Another way is to limit the depth of this folding, but this might
> require change of API.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160803/37c56333/attachment.html>


More information about the llvm-dev mailing list