[PATCH] D13529: [Polly] Allow alloca instructions in the SCoP

Thu Oct 8 03:41:54 PDT 2015

jdoerfert added a comment.

Some comments from me. If anybody has an example (code or situation) that might be a problem, please feel free to share it.

------------------------------------------

In http://reviews.llvm.org/D13529#262273, @Meinersbur wrote:

> alloca is most definitely not like any other instruction. When placed in a loop, the stack will grow in every iteration an eventually lead to a stack overflow. Such code is more than evil. How did you even manage to make Clang output such code? It should also have added stacksave/stackrestore intrinsics.

1. Yes alloca has a side effect but none that needs to be modeled, hence my commit message. We will __not__ change/increase the domain of an alloca, thus if it was executed N times before and allocated stack space each time it will do so after Polly too. "Evil code" stays evil, good code stays good.
2. Clan generates allocas in regions (conditionals/loops) sometimes, this output was placed by hand but in the llvm-test suite + some benchmark suites I know we have alloca's in loops.
3. It would probably add livetime intrinsics not stacksave. The former we handle by ignoring them.

> The question is, whether it matters for Polly. How does it handle the situation that %B refers to different memory in every iteration? Are there infinitely many arrays in the scop, since there are infinitely many base addresses? Are base addresses that are variant within the scop even valid?

This is **the** interesting question, thanks for asking. With this commit we will not model that %B = alloca will ever change, hence we will model less locations. However:

1. When you execute %B = alloca in a loop you loose your handle to %B each iteration and can only access the old stack space via a pointer that escaped in a phi node. The base address for those will be the phi. Nevertheless, this would again model less locations than there actually are.
2. With this patch we model the locations only once (maybe a small number of times if the alloca escapes in a phi, see 1)) it remains therefor to argue that this is safe. I argue it can be. (There might be something missing here but so far this patch only deleted code ;))

> Since the semantics of allocas in loops are dangerous and there is probably no valid use for them without stackrestore, I strongly recommend against this.

1. 3 out of 4 things we do have tricky semantics. I would not call allocas dangerous.
2. As mentioned before, LLVM-Test suite as well as benchmarks (e.g., alignment in BOTS) have allocas in loops. It happens and for the later we could do some nice optimizations.

Btw, stacksave/restore is only used/needed for variable sized arrays, I guess you mean the lifetime markers.

In http://reviews.llvm.org/D13529#262273, @deadalnix wrote:

> Pretty much, there is a lot of problem that arise when you put alloca in random places.

True, but we don't put anything in random places. Dependences are computed for the allocated locations as well as the alloca itself and they are obeyed during scheduling/code generation. Additionally, the domain (number of executions) of the alloca is not changed.

In http://reviews.llvm.org/D13529#262273, @grosser wrote:

> I am also surprised 'alloc' is not different to any other instruction.

At least not if you don't want to benefit from the semantics. A sound over-approximation should be to just model them as this patch does.
Later we might want to consolidate allocas or model the different memory locations (and the lifetime markers) to get rid of dependences that do not exist.

> We can possibly support it in some way, but this requires some additional thoughts similar to what Michael suggested. Will the varying base pointer cause troubles, .../. Can you dead-code-eliminate allocs? Can you reorder allocs?

I don't think anything that is enabled by default will cause trouples. dead-iteration-elimination might, I will come back to you on this. Reordering is fine as the semantics of IR does not gurantee a stack layout.

> Do you actually have a use case that produces such kind of code and that makes it worth to support this case? As deadalnix pointed out, allocas outside of the entry block have a lot of surprising behaviors (and are e.g. not even considered by mem2reg for good reasons).!

My use case for now are the LLVM-Test suite regions with allocas but later benchmarks like alignment in BOTS. I intend to model lifetime markers too at some
point to enable transformations that are prevented now by spurious dependences.
The reason mem2reg doesn't support them is unknown to me, if you have any insights please share them so we can discuss if it makes problems for us.

http://reviews.llvm.org/D13529