[LLVMdev] [polly] scev codegen (first step to remove the dependence on ivcanon pass)

Sat Dec 1 15:42:25 PST 2012

On Fri, Nov 30, 2012, at 08:46 PM, Sebastian Pop wrote:
> Hi Tobi,
> 
> I would like to remove the SCEVRewriter code and replace it with a call
> to 
> SCEVAddRec::apply (see attached a patch that adds just this function). 
> More
> precisely I want to add another function called apply_map that applies a
> map
> (loop -> expr) on a given scev.  This is the apply function on a
> multi-variate
> polynomial.

Hi Sebastian,

thanks for working on removing our dependence on iv canonicalization.

Let me first comment on the approach:

> So here is an overview of how I would like the scev code generator to
> work on an
> example: supposing that we have a Stmt_1 that gets code generated by
> either
> CLooG or ISL-codegen like this:
> 
>   Stmt_1(c1, c1+4, c1+c2);
> 
> we will construct a map that maps the old iteration domain with 3
> dimensions
> (there are 3 arguments in Stmt_1 representing the original loop nest in
> which
> Stmt_1 was located, let's call the original loop nest loop_1, loop_2,
> loop_3) to
> the new expressions generated by cloog that are function of the new
> iteration
> domain with 2 dimensions (c1 and c2 are the new induction variables of
> the code
> generated by cloog).  So the content of that map is:
> 
> loop_1 -> c1
> loop_2 -> c1+7
> loop_3 -> c1+c2
> 
> Given an access function from the original program: 
>   Scev_1 = {{{0, +, 4}_1, +, 5}_2, +, 6}_3
> 
> we will apply the map on it, and we will get a symbolic expression
> function of
> the new induction variables c1, and c2:
> 
> Scev_2 = apply (loop_1 -> c1)    on Scev_1 = {{4*c1, +, 5}_2, +, 6}_3
> Scev_3 = apply (loop_2 -> c1+7)  on Scev_2 = {4*c1 + 5*(c1+7), +, 6}_3
> Scev_4 = apply (loop_3 -> c1+c2) on Scev_3 = 4*c1 + 5*(c1+7) + 6*(c1+c2)
> 
> We will then code generate Scev_4 and we will use this new expression for
> the
> array access in the new loop nest.

Yes. This is very similar to what I had in mind and what I already
started to implement. There are some slight differences:

You create a map from the old_loop to a symbolic expression. What type
would this symbolic expression have? Would
it be a SCEVExpr? At the moment, we calculate at the beginning of each
polly statement (a basic block) a value for each virtual induction
variable of the original loops. For your example we get something like
this.

new_bb:
     new_iv_1 = add i32 c1, 0
     new_iv_2 = add i32 c1, 7
     new_iv_3 = add i32 c1, c2

     ....

I want to highlight here that the values new_iv_1, new_iv_2, new_iv_3
correspond to the number of loop iterations of the original loop. As we
currently require canonical induction variables, they are equivalent to
the value of the old canonical induction variable. However, generating
those values does not imply that we need to have canonical induction
variables in the original code. Even without canonical induction
variables, calculating such values is useful, as this simplifies the map
you describe above. Instead of going from Loop* to some symbolic
expression, you can just pass the corresponding Value* or ScevUnknown*.
In your example this would be

Scev_2 = apply (loop_1 -> new_iv_1)  on Scev_1 = {{4*new_iv_1, +, 5}_2,
+, 6}_3
Scev_3 = apply (loop_2 -> new_iv_2)  on Scev_2 = {4*new_iv_1 +
5*new_iv_2, +, 6}_3
Scev_4 = apply (loop_3 -> new_iv_3)  on Scev_3 = 4*new_iv_1 + 5*new_iv_2
+ 6*new_iv_3

Passing a symbolic expression, as you propose, could allow further
simplifictions, however it also requires s to
translate an isl_ast_expr to some kind of ScevExpr. This is non-trivial
as we would need to teach SCEV about the different
operands isl codegen could produce (floord, ceild, min, max, %, ....).
I would suggest to leave this optimization for later and to first remove
the dependence on the ivdeps pass. We can than evaluate, if this
optimization should be done in SCEV or if we rather rely on instcombine
or something similar to optimize the generated instructions further.

> Remark that in all this process we have never referred to the original
> "canonical induction variable".  SCEV actually provides such a canonical
> form
> for the induction variables without having to transform the code.

Yes, this remains true with the modification I am proposing.

Now to the removal of the SCEVRewriter. I am not opposed to it, but
wonder what the benefit of removing it would be? Do you think moving
this transformation directly into SCEV is conceptually nicer or is there
some additional benefit. I have mainly two points we should think about
before removing it:

1. What happens to parameters

Besides rewriting the loop ivs, the SCEVRewriter also rewrites
parameters. It takes a map [<SCEV* -> Value*>] and replaces SCEV
expressions which we have recognized during code generation with a
parameter value. This is necessary,
in case we generate OpenMP / OpenCL code and need to pass parameters to
other functions or modules.

2. Speed

The SCEVRewriter only passes once over the SCEVExpr. In your approach,
we would pass three times over the SCEV, no?
It is probably too early to optimize here, but we should keep that in
mind.

In terms of code to be written:

To remove the dependency of the canonical iv from the code generation,
it should be sufficient to pass a map <Loop*, Value*> to the
BlockGenerator and to use it in SCEVRewriter::getNewIV(). For CLooG, you
should be able to initialize it
from the u->substitutions list. For isl, the information should be
readily available in IslNodeBuilder::createUser().

Cheers
Tobi