[llvm-dev] [SCEV] getMulExpr could be extremely slow when creating SCEVs for a long chain of add/mul instructions
Kevin Choi via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 3 12:16:24 PDT 2016
Correct me if I'm wrong, is the code trying to expand geometric series into
polynomial expression? (that doesn't sound like a very good thing to do, it
would spend so much time computing coefficients)
Wouldn't this be better to transform as below:
mult *= n++ // 30 times
; into
mult = mult (n)...(n+29)
= mult 1...(n+29)/((1)...(n-1))
= mult * geo_sum(n+29) / geo_sum(n-1) ; is this more expensive than
leaving as geo series?
Regards,
Kevin
On 3 August 2016 at 11:46, Huang, Li1 via llvm-dev <llvm-dev at lists.llvm.org>
wrote:
> Hi,
>
>
>
> I’m working on a slow-compile problem caused by SCEV (PR28830), and I need
> your suggestions on how to fix it. The loop below causes
> ScalarEvolution::getMulExpr to hang.
>
>
>
> int get(unsigned n)
>
> {
>
> unsigned i, j, mult = 1;
>
> for (i = 0; i < 1; i++) {
>
> for (j = 0; j < 30; j++) {
>
> mult *= n++;
>
> }
>
> }
>
> return mult;
>
> }
>
>
>
> the inner loop is completed unrolled (by 30) to a long chain of “add” and
> “mult” instructions, then indvars follows loop-unroll and triggers SCEV
> creations for this long instruction chain. When it comes to getMulExpr, it
> tries to fold multiplications of AddRecs of the same induction variable
> recursively, this could be very slow if the expression tree is deep.
>
>
>
> Code responsible for this problem starts here (in ScalarEvolution.cpp):
>
>
>
> // Okay, if there weren't any loop invariants to be folded, check to
> see if
>
> // there are multiple AddRec's with the same loop induction variable
> being
>
> // multiplied together. If so, we can fold them.
>
>
>
> // {A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>
>
> // = {x=1 in [ sum y=x..2x [ sum z=max(y-x, y-n)..min(x,n) [
>
> // choose(x, 2x)*choose(2x-y, x-z)*A_{y-z}*B_z
>
> // ]]],+,...up to x=2n}.
>
> // Note that the arguments to choose() are always integers with values
>
> // known at compile time, never SCEV objects.
>
> //
>
> // The implementation avoids pointless extra computations when the two
>
> // addrec's are of different length (mathematically, it's equivalent to
>
> // an infinite stream of zeros on the right).
>
> bool OpsModified = false;
>
> for (unsigned OtherIdx = Idx+1;
>
> OtherIdx != Ops.size() && isa<SCEVAddRecExpr>(Ops[OtherIdx]);
>
> ...
>
> ...
>
>
>
> One way I thought about to fix this is to have a threshold for the number
> of terms in each AddRec when folding them. For example, we only fold
> AddRec1 * AddRec2 when (#terms in AddRec1 + #terms in AddRec2) <
> threshold. I tried setting the threshold to 8, and it stops early for this
> case. Another way is to limit the depth of this folding, but this might
> require change of API.
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160803/37c56333/attachment.html>
More information about the llvm-dev
mailing list