[PATCH] D54606: [AMDGPU] Convert insert_vector_elt into set of selects

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 16 09:37:24 PST 2018


rampitec added a comment.

In https://reviews.llvm.org/D54606#1300988, @nhaehnle wrote:

> Mostly looks good to me.
>
> However, why does code with undef vectors look so bad? For example, in `float4_inselt`, the fact that the initial vector is undef should allow us to just store a splat of 1.0.


Yes, I noticed that too. That needs to be a separate optimization. As far as I understand, "insert_vector_element undef, %var, %idx" should not even reach this point. It should be replaced by build_vector (n x %var) regardless of the thresholds and heuristics I am using, e.g. earlier (higher up in the same function, I think).
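
To illustrate why the splat is legal, here is a minimal standalone sketch. It is not LLVM code and does not use any LLVM API: `Lane`, `Vec`, `insertAsSelects`, `splat` and `refines` are hypothetical names invented for this example, and an undef lane is modeled as an empty `std::optional`. It expands a dynamic-index insert into one select per lane, as the patch title describes, and checks that a splat only fills in lanes that were undef.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Hypothetical model: a lane holds a concrete float, or is undef (nullopt).
using Lane = std::optional<float>;
template <std::size_t N> using Vec = std::array<Lane, N>;

// insert_vector_elt with a dynamic index, expanded into one select per lane:
//   result[i] = (idx == i) ? val : vec[i]
template <std::size_t N>
Vec<N> insertAsSelects(const Vec<N> &vec, float val, std::size_t idx) {
  Vec<N> result;
  for (std::size_t i = 0; i < N; ++i)
    result[i] = (idx == i) ? Lane(val) : vec[i];
  return result;
}

// build_vector (n x val): every lane holds the same value.
template <std::size_t N>
Vec<N> splat(float val) {
  Vec<N> result;
  result.fill(val);
  return result;
}

// True when `refined` may replace `original`: every defined lane of
// `original` keeps its value, while undef lanes may take any value.
template <std::size_t N>
bool refines(const Vec<N> &original, const Vec<N> &refined) {
  for (std::size_t i = 0; i < N; ++i)
    if (original[i].has_value() && original[i] != refined[i])
      return false;
  return true;
}
```

With an all-undef input, `refines(insertAsSelects(v, 1.0f, idx), splat<4>(1.0f))` holds for every `idx`, because the only defined lane already holds the inserted value. With a fully defined input it generally does not hold, which is why the fold applies only to the undef case.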


https://reviews.llvm.org/D54606




