[PATCH] D35524: [AMDGPU] Add support for Whole Wavefront Mode

Wed Jul 26 17:11:04 PDT 2017

cwabbott added inline comments.

================
Comment at: test/CodeGen/AMDGPU/wqm.ll:127-185
+; Check that we don't leave WWM on for computations that don't require WWM,
+; since that will lead clobbering things that aren't supposed to be clobbered
+; in cases like this.
+;
+;CHECK-LABEL: {{^}}test_wwm2:
+;CHECK: s_mov_b64 [[ORIG:s\[[0-9]+:[0-9]+\]]], exec
+;CHECK: s_mov_b64 exec, -1
----------------
nhaehnle wrote:
> Maybe I have jetlag blindness, but those two tests seem to be the same.
> 
> Also, there should be a test where the wwm computation does use some value from the predecessor block (e.g. use %hi as the offset in the load).
They're not, since the second one uses lacks the iadd after the llvm.amdgcn.wwm intrinsic. Instead, the intrinsic result is passed directly to the phi. Without the early clobber on the WWM intrinsic, we would coalesce the resulting copy with the other phi sources, and the WWM write would overwrite the inactive channels that should be 0. There's a similar thing going on with the above, but this is easier to verify in a less fragile way -- we just need to make sure that a v_mov_b32 gets emitted after the WWM computation. I'm currently jetlagged too, but I think that's why I added this test.

Good point about the additional test though.

https://reviews.llvm.org/D35524