[PATCH] D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass

Thu Apr 6 09:56:26 PDT 2017

arsenm added a comment.

In https://reviews.llvm.org/D31710#719896, @dstuttard wrote:

> As you say, the issue actually is that non-canonical IR was being presented to the pass - the front-end has since been updated to fix this (although possibly not in the way it should).

The frontend emitting it seems fine if convenient, but instcombine will decompose it for you. If you pass pipeline somehow isn't running instcombine, you have a problem.

> The question is, what should this pass do when valid, but non-canonical IR is passed in. My inclination is to strip out the code to transform to vector in this case and instead just more gracefully handle instead (the tryPromoteAlloca function will just return false - as with other cases it can't handle).
> 
> Not sure what you mean with your last comment
> 
>> Similarly we give up on array allocas in the form where they are allocating N items of the element type.
> 
> If you mean that the tryPromoteAlloca gives up for N<2 or N>4 as an example of what to do - then yes, that's what I'm proposing for this case as well.

No, I mean the I.isArrayAllocation() early exit. An alloca with a constant number of elements is turned into an alloca of 1 element of array type by instcombine.

FYI Changpeng has also been looking at this pass to handle more cases.

In https://reviews.llvm.org/D31710#719896, @dstuttard wrote:

> As you say, the issue actually is that non-canonical IR was being presented to the pass - the front-end has since been updated to fix this (although possibly not in the way it should).
>
> The question is, what should this pass do when valid, but non-canonical IR is passed in. My inclination is to strip out the code to transform to vector in this case and instead just more gracefully handle instead (the tryPromoteAlloca function will just return false - as with other cases it can't handle).
>
> Not sure what you mean with your last comment

Non-canonical IR are constructs that the basic canonicalization passes like instcombine produce. This is supposed to make later optimizations lives easier since they don't have to worry about every possible way of representing something. It is sufficient to just not crash or miscompile on the non-canonical inputs.

https://reviews.llvm.org/D31710