[PATCH] D13529: [Polly] Allow alloca instructions in the SCoP

Thu Oct 8 09:18:35 PDT 2015

Meinersbur added a comment.

In http://reviews.llvm.org/D13529#262640, @jdoerfert wrote:

> Some comments from me. If anybody has an example (code or situation) that might be a problem, please feel free to share it.

At the moment I can think of three situations where this breaks:

1. Invariant Load hoisting. Because the index is constant (but not the BaseAddrs), invariant load hoisting might move the load to before the scop. This yields invalid code because the alloca is not dominating it.
2. My De-LICM approach will reuse elements in arrays when known to be overwritten. If the base address of an array changes, the element will point to some new location, effectively forgetting what was written in that element.
3. Such alloca's size might depend on ivs/values computed within the scop. Delinirization might interpret it as index. If not, the size expression suddenly depends on instructions within the scop. It is not even an affine term anymore, although the index is.

Polly was written with the assumption that base addresses are defined and fixed during the scop's execution. IMHO it is not understood enough to remove the check with a single test case. It changes the meaning of a base address from an array in the sense of the polyhedral model to maybe "pointer to memory that currently does not alias with anything else".

> 1. Yes alloca has a side effect but none that needs to be modeled, hence my commit message.

The commit message is still wrong. "Not necessary to be modeled" is not "no different to other instructions"

> We will __not__ change/increase the domain of an alloca, thus if it was executed N times before and allocated stack space each time it will do so after Polly too. "Evil code" stays evil, good code stays good.

IMHO there is no need to optimize code that should not have been written in the first place.

> 2. Clan generates allocas in regions (conditionals/loops) sometimes, this output was placed by hand but in the llvm-test suite + some benchmark suites I know we have alloca's in loops.

What is "sometimes"? To my best knowledge Clang will always put allocas in the entry block if the the size is known in advance. The expections are C99 VLAs which are put between stacksave and stackrestore - and manual calls to the alloca builtins.

> 3. It would probably add livetime intrinsics not stacksave. The former we handle by ignoring them.

Nope. But maybe you have an actual example when clang generates an an alloca within a loop without stacksave to prove me wrong. Lifetime intrinsics don't matter here.

> > The question is, whether it matters for Polly. How does it handle the situation that %B refers to different memory in every iteration? Are there infinitely many arrays in the scop, since there are infinitely many base addresses? Are base addresses that are variant within the scop even valid?

> 

> 

> This is **the** interesting question, thanks for asking. With this commit we will not model that %B = alloca will ever change, hence we will model less locations. However:

> 

> 1. When you execute %B = alloca in a loop you loose your handle to %B each iteration and can only access the old stack space via a pointer that escaped in a phi node. The base address for those will be the phi. Nevertheless, this would again model less locations than there actually are.

> 2. With this patch we model the locations only once (maybe a small number of times if the alloca escapes in a phi, see 1)) it remains therefor to argue that this is safe. I argue it can be. (There might be something missing here but so far this patch only deleted code ;))

Will there be correct RAW/WAW/WAR dependencies if it escapes through a phi? Explain how.

> > Since the semantics of allocas in loops are dangerous and there is probably no valid use for them without stackrestore, I strongly recommend against this.

> 

> 

> 1. 3 out of 4 things we do have tricky semantics. I would not call allocas dangerous.

allocas in loops without stackrestore.

Your test code is "designed" to crash after at most 26215 iterations. (assuming 1 MB stack size). It might work because the actual number of iterations is constant.

> 2. As mentioned before, LLVM-Test suite as well as benchmarks (e.g., alignment in BOTS) have allocas in loops. It happens and for the later we could do some nice optimizations.

Can you show what IR generates from what C source?

> Btw, stacksave/restore is only used/needed for variable sized arrays, I guess you mean the lifetime markers.

Yes, VLAs, and I really mean stacksave/stackrestore. I even tried out bofore posting my first comment.

If not a VLA, clang puts it into the entry BB.

http://reviews.llvm.org/D13529