[PATCH] D54606: [AMDGPU] Convert insert_vector_elt into set of selects

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 16 10:43:42 PST 2018


rampitec added a comment.

In https://reviews.llvm.org/D54606#1301261, @arsenm wrote:

> Another change I've wanted to look at is changing what AMDGPUPromoteAlloca tries to produce. The dynamic vector indexing is going to be worse if the waterfall loop is going to be required, but it's currently preferred if both are possible.


Essentially it deals with the alloca to begin with. Scratch is always worse than movrel (even with the waterfall loop), and movrel is worse than a small set of selects. So it is quite natural that we get either a set of selects or a movrel from an alloca. That holds unless we end up spilling later, but it is too early to estimate register pressure at this point.
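
To illustrate what "a set of selects" means here, below is a minimal sketch in plain C++. It is not the code from this patch: `insertViaSelects` is a hypothetical name, and `std::array` only stands in for a small vector register. A dynamic-index insert becomes one compare plus one select per element, so no indexed (movrel or scratch) access is needed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): models insert_vector_elt with a
// dynamic index as N independent compare+select pairs.
template <typename T, std::size_t N>
std::array<T, N> insertViaSelects(std::array<T, N> Vec, T Val,
                                  std::uint32_t Idx) {
  std::array<T, N> Out{};
  for (std::size_t I = 0; I < N; ++I)   // trivially unrollable for small N
    Out[I] = (Idx == I) ? Val : Vec[I]; // select(Idx == I, Val, Vec[I])
  return Out;
}
```

The cost grows linearly with the number of elements, which is why this is only attractive for a small vector. Note that in this sketch an out-of-range index simply leaves the vector unchanged; that is a choice made for the illustration only.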

Anyway, any ideas on what it could produce that would be better?


https://reviews.llvm.org/D54606




