[PATCH] D104997: [WIP][AMDGPU] Deduce attributes with the Attributor

Mon Jul 5 12:11:03 PDT 2021

jdoerfert added a comment.

Can you add a test:

  kernel() { // uniform-work-group-size = true
    weak();
  }

  weak() {  // weak linkage
    internal();
  }

  int G;
  internal() {  // internal linkage
    G = 0;
  }

here internal should have uniform-work-group-size = true.

Also, add:

  kernel() {  // uniform-work-group-size = true
    internal1();
  }
  internal1() {  // internal linkage, same below.
    internal2();
  }
  int G;
  internal2() {
    if (G == 0) {
      internal3();
      internal2();
    }
  }
  internal3() {
    G = 1;
  }

everything should have uniform work group size = true

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:170-175
+      if (!F->hasExactDefinition()) {
+        LLVM_DEBUG(dbgs() << "[AMDWorkGroupSize] Giving up: " << F->getName() << "\n");
+        NewUniformWorkGroupSize = false;
+        return true;
+      }
+
----------------
As I said before, this is not needed.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:195
+    bool AllCallSitesKnown = true;
+    A.checkForAllCallSites(CheckCallSite, *this, false, AllCallSitesKnown);
+
----------------
you need to check the return value here, I think.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:233
+private:
+  Optional<bool> UniformWorkGroupSize = llvm::None;
+};
----------------
Maybe you just use the boolean state instead? Having the additional boolean here doesn't seem to bring any benefit. You can probably just expose `clampStateAndIndicateChange` in the Attributor.h (`AA` namespace) and then use it in the `CheckCallSite` callback. There should not be anything to do but ask for the callee AA and clamp. This should remove 20 or so lines here.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:356
+    return Change;
+  }
+
----------------
I don't think the address space stuff is good. We should use a dedicated AA to track global uses or reuse AAMemoryLocations to ask what globals are accessed. The latter doesn't capture non-access uses though, so maybe the former is needed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104997/new/

https://reviews.llvm.org/D104997