[PATCH] D30810: Preserve vec3 type.

Fri Mar 24 10:33:08 PDT 2017

jaykang10 added a comment.

> I believe the argument lacks numbers (or at least you have them but didn't mention). I didn't hear about performance results, or validation that this was actually tested for correctness. Small test cases prove a point, but can't be considered general.
> 
> OTOH, it seems like this is exactly why you want the flag, to hide any potential issues and play with it. I'm not opposed to adding the flag if there's a commitment to actually get the result and change the default to whatever seems to be better, does that seems reasonable?

To be honest, I did not start this discussion to get better performance for GPU. But the goal has moved while we discussing it. I think I have showed you the gain from AMD GPU target on previous comments. On current version of clang/llvm, it generates one less load/store to preserve vec3 type instead of vec4 type on GPU target. But it could cause more load/store on another targets which have vector register because they usually expect aligned vector load/store. It means that if we want to change default  to vec3 on clang, we need to change llvm's codegen's default legalization or backend targets need to be modified in order to get aligned vector load/store with vec3 type. I think It is too big change at this moment. In the future, we could move this discussion to llvm-dev. Up to that time,  as you mentioned, I would like to hide the potential issues with vec3 type and play with it.

https://reviews.llvm.org/D30810