[PATCH] D35435: [AMDGPU] Produce flat|global_dwordx3 instructions
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 21 12:57:21 PDT 2017
rampitec added a comment.
In https://reviews.llvm.org/D35435#817495, @vpykhtin wrote:
> The implementation of this approach looks good to me. The only question is which way to go to implement v3 vector.
Making v3 a generally legal, simple type is probably the right thing to do. It would solve not only the problem with loads but also compute on the resulting vector. Currently such compute is done on a 4-component vector that legalization creates by promoting v3 to v4.
That seems to be a long way off, however, because a lot of places are designed to work only with power-of-2 vectors, halving and doubling them. I would do this style of vec3 load in the short term and target a legal v3 in the long term.
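To illustrate the power-of-2 assumption mentioned above, here is a minimal standalone sketch. It is not LLVM code: the helper names (nextPowerOf2, promotedElements, paddingLanes, splitsEvenly) are hypothetical and only model the arithmetic of widening a 3-element vector to 4 and of halving a vector.

```cpp
// Standalone model of power-of-2 vector legalization arithmetic.
// All names here are hypothetical; none of them are LLVM APIs.

// Smallest power of two that is >= n (n >= 1).
constexpr unsigned nextPowerOf2(unsigned n) {
  unsigned p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Element count after promoting a vector to a power-of-2 width,
// e.g. v3 becomes v4 while v4 stays v4.
constexpr unsigned promotedElements(unsigned n) {
  return nextPowerOf2(n);
}

// Lanes that carry no useful data after promotion.
constexpr unsigned paddingLanes(unsigned n) {
  return promotedElements(n) - n;
}

// True if the vector can be split into two equal halves,
// which is what halving-based code assumes.
constexpr bool splitsEvenly(unsigned n) {
  return n % 2 == 0;
}
```

With these definitions, promotedElements(3) is 4 and paddingLanes(3) is 1, so compute on a promoted v3 carries one unused lane, and splitsEvenly(3) is false, which is the kind of case that halving-based code does not expect.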
https://reviews.llvm.org/D35435