[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

Tue Jun 25 04:17:37 PDT 2024

================
@@ -228,10 +228,11 @@ void AMDGPUAtomicOptimizerImpl::visitAtomicRMWInst(AtomicRMWInst &I) {
 
   // If the value operand is divergent, each lane is contributing a different
   // value to the atomic calculation. We can only optimize divergent values if
-  // we have DPP available on our subtarget, and the atomic operation is 32
-  // bits.
+  // we have DPP available on our subtarget, and the atomic operation is either
+  // 32 or 64 bits.
   if (ValDivergent &&
-      (!ST->hasDPP() || DL->getTypeSizeInBits(I.getType()) != 32)) {
+      (!ST->hasDPP() || (DL->getTypeSizeInBits(I.getType()) != 32 &&
+      DL->getTypeSizeInBits(I.getType()) != 64))) {
----------------
arsenm wrote:

You should move the type logic into a separate predicate function. Also, you should base the check on the actual type and not just the bit width. A bit-width-only check will break in the future when atomics support more vector operations.
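
To illustrate the concern, here is a minimal self-contained sketch rather than the actual patch. `MockType`, `Kind`, `isSupportedByWidth`, and `isSupportedByType` are hypothetical names that stand in for the real LLVM type queries, and the set of accepted types is an assumption made for the example. It shows how a 64-bit vector slips through a width-only check but is rejected by a predicate that looks at the type itself.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR type; this is NOT the LLVM Type class.
enum class Kind { Integer, Float, Double, FixedVector };

struct MockType {
  Kind kind;
  unsigned bits; // total size of the type in bits
};

// Width-only check, mirroring the shape of the condition in the diff.
constexpr bool isSupportedByWidth(MockType ty) {
  return ty.bits == 32 || ty.bits == 64;
}

// Type-based predicate: every accepted kind is listed explicitly, so a
// new kind (for example a vector) is rejected until someone opts it in.
// The accepted set here is an assumption for illustration only.
constexpr bool isSupportedByType(MockType ty) {
  switch (ty.kind) {
  case Kind::Integer:
    return ty.bits == 32 || ty.bits == 64;
  case Kind::Float:
  case Kind::Double:
    return true;
  default:
    return false;
  }
}

// Small demonstration of where the two checks disagree.
inline void checkExamples() {
  constexpr MockType i64{Kind::Integer, 64};
  constexpr MockType v2i32{Kind::FixedVector, 64}; // models <2 x i32>

  assert(isSupportedByWidth(i64) && isSupportedByType(i64));
  assert(isSupportedByWidth(v2i32));  // width-only check lets it through
  assert(!isSupportedByType(v2i32));  // type-based predicate rejects it
}
```

With a predicate like this, the call site would reduce to a single named check on the instruction's type, and supporting a new type later means adding one case instead of revisiting the width comparison.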

https://github.com/llvm/llvm-project/pull/96473