[PATCH] D30810: Preserve vec3 type.

JinGu Kang via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 22 10:40:05 PDT 2017


jaykang10 added a comment.

> Yes. This would make sense. I am guessing that in vec3->vec4, we will have 3 loads and 4 stores and in vec4->vec3 we will have 4 loads and 3 stores?

It depends on the implementation. If you scalarize all vector operations at the LLVM IR level before entering LLVM's codegen, vec3->vec4 would generate 3 loads and 4 stores, and vec4->vec3 would generate 4 loads and 3 stores. I guess your implementation falls into this case. The AMDGPU target keeps the vector form at the LLVM IR level and handles it during SelectionDAG legalization in LLVM's codegen. In that case, vec3->vec4 generates 4 loads and 4 stores, because the type legalizer widens the vec3 load to a vec4 load since the alignment is 16. vec4->vec3 generates 4 loads and 3 stores after type legalization. The output can differ depending on the LLVM backend implementation.
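
To make the counts concrete, here is a rough model of the two cases, counted in scalar elements as in the numbers quoted here. It is only a sketch under my own assumptions: the function name `count_memory_ops` and the flag `widen_vec3_load` are hypothetical and do not correspond to anything in clang or LLVM, and a real backend may produce something different.

```python
def count_memory_ops(src_elems, dst_elems, widen_vec3_load=False):
    """Return (loads, stores), in scalar elements, for converting a
    src_elems-wide vector to a dst_elems-wide vector through memory.

    widen_vec3_load is a hypothetical switch modelling a type legalizer
    that widens a 16-byte-aligned vec3 load to a vec4 load.
    """
    loads = src_elems
    if widen_vec3_load and src_elems == 3:
        loads = 4  # the vec3 load is widened to a vec4 load
    stores = dst_elems  # stores follow the destination width in this model
    return loads, stores


# Case 1: everything is scalarized at the IR level before codegen.
scalarized = {
    "vec3->vec4": count_memory_ops(3, 4),
    "vec4->vec3": count_memory_ops(4, 3),
}

# Case 2: the vector form is kept and the legalizer widens vec3 loads.
widened = {
    "vec3->vec4": count_memory_ops(3, 4, widen_vec3_load=True),
    "vec4->vec3": count_memory_ops(4, 3, widen_vec3_load=True),
}

for name, table in (("scalarized", scalarized), ("widened", widened)):
    for conversion, (loads, stores) in table.items():
        print(f"{name:10s} {conversion}: {loads} loads, {stores} stores")
```

The only difference between the two cases in this model is the vec3 source: the widened load reads a fourth element that the scalarized form never touches.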

> Although, I am guessing the load/store optimizations would come from your current change? I don't see anything related to it in the VisitAsTypeExpr implementation. But I think it might be a good idea to add this to the test to make sure the IR output is as we expect.

We should emit this IR under an option. I will update the patch accordingly.


https://reviews.llvm.org/D30810




