[PATCH] D64328: [AMDGPU] Optimize atomic max/min

Wed Jul 10 01:24:24 PDT 2019

foad marked 4 inline comments as done.
foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:328
+  case AtomicRMWInst::UMax:
+    IdentityVal = APInt::getMinValue(TyBitWidth);
+    break;
----------------
arsenm wrote:
> Shouldn't this just be zero for add/sub?
Yes, getMinValue means min unsigned value which is zero.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:330-338
+  case AtomicRMWInst::UMin:
+    IdentityVal = APInt::getMaxValue(TyBitWidth);
+    break;
+  case AtomicRMWInst::Max:
+    IdentityVal = APInt::getSignedMinValue(TyBitWidth);
+    break;
+  case AtomicRMWInst::Min:
----------------
arsenm wrote:
> The min and max values seem backwards? min(MinValue, X) -> MinValue, not X
I got one of these cases wrong in the first patch but they should all be correct now. The identity for min() is MaxValue and vice versa.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp:388
       CallInst *const ReadLaneLo = B.CreateIntrinsic(
           Intrinsic::amdgcn_readlane, {}, {ExtractLo, B.getInt32(63)});
       CallInst *const ReadLaneHi = B.CreateIntrinsic(
----------------
arsenm wrote:
> I don't understand why these are reading lane 63. Why not lane 0? Why not readfirstlane? This won't work with wave64, but I guess that's an existing problem
Here's how the exclusive scan works:
ExclScan lane 0 is 0
ExclScan lane 1 is V lane 0
ExclScan lane 2 is V lane 0+1
ExclScan lane 3 is V lane 0+1+2
...
ExclScan lane 63 is V lane 0+...+62
Here we're reading lane 63 of NewV = ExclScan+V, i.e. the sum of all lanes in V.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64328/new/

https://reviews.llvm.org/D64328