[PATCH] D58902: [AMDGPU] Support for v3i32/v3f32
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 6 11:09:01 PST 2019
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInstrInfo.cpp:577
+ // TODO: Copy vec3/vec5 with s_mov_b64s then final s_mov_b32.
+ if (!(RI.getRegSizeInBits(*RC) % 64)) {
Opcode = AMDGPU::S_MOV_B64;
----------------
tpr wrote:
> arsenm wrote:
> > tpr wrote:
> > > rampitec wrote:
> > > > Does that mean for a v3 we will have 3 s_mov_b32 instead of s_mov_b64 + s_mov_b32? That is suboptimal.
> > > Yes.
> > >
> > > I can't find any lit test or graphics shader that attempts to copy a vec3 of sgprs here. Any idea how to provoke it so I can try it?
> > A v3 phi, and probably at -O0
> I managed to provoke it, but fixing it looks non-trivial, since we would need to allow a vec3 to be split into a sub0_sub1 and a sub2. It also looks pretty rare, so I won't fix it here. Is that OK?
Yes
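
For reference, the trade-off discussed in this thread can be sketched as a small standalone program. This is only an illustration under stated assumptions: `planCurrent` and `planWithTail` are hypothetical helpers, not LLVM functions, and the sketch ignores register alignment and the sub-register indices (sub0_sub1, sub2) that make the real fix non-trivial.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of LLVM: lists the scalar moves used to copy
// an SGPR tuple of the given width, following the check in the quoted patch.
std::vector<std::string> planCurrent(unsigned SizeInBits) {
  // 64-bit moves are used only when the whole tuple is a multiple of 64 bits.
  bool Use64 = (SizeInBits % 64) == 0;
  unsigned Step = Use64 ? 64 : 32;
  std::vector<std::string> Moves;
  for (unsigned Off = 0; Off < SizeInBits; Off += Step)
    Moves.push_back(Use64 ? "s_mov_b64" : "s_mov_b32");
  return Moves;
}

// Hypothetical helper sketching the TODO: copy as many 64-bit pairs as
// possible, then finish any 32-bit remainder with a single s_mov_b32.
std::vector<std::string> planWithTail(unsigned SizeInBits) {
  std::vector<std::string> Moves;
  unsigned Off = 0;
  for (; Off + 64 <= SizeInBits; Off += 64)
    Moves.push_back("s_mov_b64");
  for (; Off + 32 <= SizeInBits; Off += 32)
    Moves.push_back("s_mov_b32");
  return Moves;
}
```

For a 96-bit v3 tuple, `planCurrent(96)` yields three `s_mov_b32` entries, while `planWithTail(96)` yields one `s_mov_b64` followed by one `s_mov_b32`.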
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58902/new/
https://reviews.llvm.org/D58902