[llvm] [KernelInfo] Implement new LLVM IR pass for GPU code analysis (PR #102944)

Joel E. Denny via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 11 16:33:52 PDT 2024


================
@@ -1390,3 +1390,19 @@ unsigned GCNTTIImpl::getPrefetchDistance() const {
 bool GCNTTIImpl::shouldPrefetchAddressSpace(unsigned AS) const {
   return AMDGPU::isFlatGlobalAddrSpace(AS);
 }
+
+void GCNTTIImpl::collectLaunchBounds(
+    const Function &F,
+    SmallVectorImpl<std::pair<StringRef, int64_t>> &LB) const {
+  auto AmdgpuMaxNumWorkgroups = ST->getMaxNumWorkGroups(F);
+  LB.push_back({"AmdgpuMaxNumWorkgroupsX", AmdgpuMaxNumWorkgroups[0]});
+  LB.push_back({"AmdgpuMaxNumWorkgroupsY", AmdgpuMaxNumWorkgroups[1]});
+  LB.push_back({"AmdgpuMaxNumWorkgroupsZ", AmdgpuMaxNumWorkgroups[2]});
----------------
jdenny-ornl wrote:

Originally, the goal was consistency among property names in kernel-info remarks. That doesn't seem too important, though, especially now that this part of the code lives outside kernel-info. Done.

https://github.com/llvm/llvm-project/pull/102944

