[PATCH] D109870: [AMDGPU] Enable the pass "amdgpu-replace-lds-use-with-pointer"

Wed Sep 22 08:30:46 PDT 2021

arsenm added a comment.

In D109870#3014083 <https://reviews.llvm.org/D109870#3014083>, @hsmhsm wrote:

> At the moment, it is not *very clear* if LLVM IR with static alloca after call
> is legal or not.

Calls have absolutely nothing to do with allocas. Allocas may legally appear anywhere; this isn't in question. Placing allocas in the entry block is strongly recommended because it enables optimizations, and clustering them at the top of the entry block makes for nicer IR. Neither is a requirement, and an optimization pass that doesn't strictly need to touch every alloca doesn't need to find every alloca.

> In this pass, since we need to split the entry block before any call, any static
> alloca after the call will be converted into dynamic alloca which we want to avoid.
>
> So, for now, we skip running this pass, if there a kernel which has static
> alloca after call, and we will revisit this code later once we have clear clarity
> on the placement of static alloca.

This pass has no reason to inspect calls. What you care about is splitting the entry block to insert some initialization code. In the process of doing this, you want to avoid moving allocas out of the entry block. This is trivial to do if all the allocas are clustered at the top of the block. If you ignore allocas beyond the top cluster, there is nothing wrong with the IR. If you happen to sink some out of the entry block, it's still correct, just suboptimal.
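
To make the suggestion concrete, here is a rough sketch of the split-point logic. It is a toy model, not LLVM code: `Op`, `findSplitIndex` and `countSunkAllocas` are hypothetical names invented for illustration, and the entry block is modeled as a plain vector of opcodes rather than a real `BasicBlock`. The idea is simply to skip the leading run of allocas and split after it, without looking at calls at all.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for an instruction opcode. This is NOT the LLVM API; it only
// models the distinction that matters here: alloca versus everything else.
enum class Op { Alloca, Call, Store, Other };

// Return the index of the first instruction that is not part of the leading
// alloca cluster. Splitting the entry block at this index keeps every alloca
// in that cluster inside the entry block.
std::size_t findSplitIndex(const std::vector<Op> &entryBlock) {
  std::size_t i = 0;
  while (i < entryBlock.size() && entryBlock[i] == Op::Alloca)
    ++i;
  return i;
}

// Count the allocas that sit after the split point. These would leave the
// entry block when it is split: still correct IR, just suboptimal.
std::size_t countSunkAllocas(const std::vector<Op> &entryBlock) {
  std::size_t sunk = 0;
  for (std::size_t i = findSplitIndex(entryBlock); i < entryBlock.size(); ++i)
    if (entryBlock[i] == Op::Alloca)
      ++sunk;
  return sunk;
}
```

For a block modeled as alloca, alloca, call, alloca, store, the split index is 2 and one alloca ends up after the split. Nothing in the scan depends on where the call is.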

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109870/new/

https://reviews.llvm.org/D109870