[PATCH] D35524: [AMDGPU] Add support for Whole Wavefront Mode

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 2 02:23:30 PDT 2017


nhaehnle accepted this revision.
nhaehnle added a comment.
This revision is now accepted and ready to land.

One remaining style nitpick. LGTM with that fixed.



================
Comment at: lib/Target/AMDGPU/SIFixWWMLiveness.cpp:183
+bool SIFixWWMLiveness::runOnMachineFunction(MachineFunction &MF) {
+  bool modified = false;
+
----------------
nhaehnle wrote:
> Variable names are upper case in LLVM style.
Gentle reminder :)


================
Comment at: test/CodeGen/AMDGPU/wqm.ll:127-185
+; Check that we don't leave WWM on for computations that don't require WWM,
+; since that will lead clobbering things that aren't supposed to be clobbered
+; in cases like this.
+;
+;CHECK-LABEL: {{^}}test_wwm2:
+;CHECK: s_mov_b64 [[ORIG:s\[[0-9]+:[0-9]+\]]], exec
+;CHECK: s_mov_b64 exec, -1
----------------
cwabbott wrote:
> nhaehnle wrote:
> > Maybe I have jetlag blindness, but those two tests seem to be the same.
> > 
> > Also, there should be a test where the wwm computation does use some value from the predecessor block (e.g. use %hi as the offset in the load).
> They're not, since the second one uses lacks the iadd after the llvm.amdgcn.wwm intrinsic. Instead, the intrinsic result is passed directly to the phi. Without the early clobber on the WWM intrinsic, we would coalesce the resulting copy with the other phi sources, and the WWM write would overwrite the inactive channels that should be 0. There's a similar thing going on with the above, but this is easier to verify in a less fragile way -- we just need to make sure that a v_mov_b32 gets emitted after the WWM computation. I'm currently jetlagged too, but I think that's why I added this test.
> 
> Good point about the additional test though.
Wow, that's subtle stuff. Thanks for the explanation!


https://reviews.llvm.org/D35524





More information about the llvm-commits mailing list