[PATCH] D57737: [AMDGPU] Fix DPP sequence in atomic optimizer.

Neil Henning via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 7 05:02:26 PST 2019


sheredom marked an inline comment as done.
sheredom added inline comments.


================
Comment at: lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:316-317
+    LaneOffset = B.CreateIntrinsic(Intrinsic::amdgcn_wwm, Ty, NewV);
+    NewV = B.CreateIntrinsic(Intrinsic::amdgcn_wwm, Ty,
+                             B.CreateBinOp(Op, NewV, SetInactive));
 
----------------
nhaehnle wrote:
> So I hadn't noticed this before, but I think the wwm intrinsic should be applied *after* the readlane below.
> 
> With wwm before readlane, there's a theoretical possibility that register allocation splits the live range of the value and inserts a V_MOV in between, which ends up executed with exec bit 63 disabled, leading to incorrect results from the readlane.
Oh yeah - so I was assuming because readlane ignores the exec mask it should be fine, but I can see why that might be an issue. I'll change the code.
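
To make the hazard concrete, here is a toy C++ model of it. This is only a sketch, not LLVM or hardware code: the names VReg, vmov, readlane and readLane63AfterSplit are made up for illustration, and the model assumes a 64-lane wave where a V_MOV-style copy writes only the lanes enabled in exec while a readlane-style read ignores exec.

```cpp
#include <array>
#include <cstdint>

// Toy model of a 64-lane wavefront. All names here are hypothetical.
constexpr int kLanes = 64;
using VReg = std::array<uint32_t, kLanes>;

// V_MOV-like copy: only lanes whose exec bit is set are written.
void vmov(VReg &dst, const VReg &src, uint64_t exec) {
  for (int lane = 0; lane < kLanes; ++lane)
    if ((exec >> lane) & 1)
      dst[lane] = src[lane];
}

// readlane-like read: returns the requested lane regardless of exec.
uint32_t readlane(const VReg &src, int lane) { return src[lane]; }

// Models a live-range split between the wwm value and the readlane:
// the value is copied under `exec`, then lane 63 is read from the copy.
uint32_t readLane63AfterSplit(uint64_t exec) {
  VReg reduced{};
  reduced[63] = 42; // result computed in whole wave mode
  VReg copy{};      // destination of the split copy, initially all zeros
  vmov(copy, reduced, exec);
  return readlane(copy, 63);
}
```

With every exec bit set, the copy carries lane 63 across and the read sees the value written in whole wave mode. With bit 63 cleared, the copy skips lane 63 and the read returns whatever the destination register held before, even though readlane itself ignores exec.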


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57737/new/

https://reviews.llvm.org/D57737




