[llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

Jun Wang via llvm-commits llvm-commits at lists.llvm.org
Sun Jun 16 17:18:40 PDT 2024


================
@@ -677,6 +693,33 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
     return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
                                               UsedAssumedInformation);
   }
+
+  // Returns true if FlatScratchInit is needed, i.e., no-flat-scratch-init is
+  // not to be set.
+  bool needFlatScratchInit(Attributor &A) {
+    // This is called on each callee; false means callee shouldn't have
+    // no-flat-scratch-init.
+    auto CheckForNoFlatScratchInit = [&](Instruction &I) {
+      const auto &CB = cast<CallBase>(I);
+      const Value *CalleeOp = CB.getCalledOperand();
+      const Function *Callee = dyn_cast<Function>(CalleeOp);
----------------
jwanggit86 wrote:

There's a subtle difference between using `getCalledFunction()` and using `getCalledOperand()`. Consider the following example:
```
define i32 @use_dispatch_ptr_ret_type() #1 {
  %dispatch.ptr = call ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
  store volatile ptr addrspace(4) %dispatch.ptr, ptr addrspace(1) undef
  ret i32 0
}

define float @func_indirect_use_dispatch_ptr_constexpr_cast_func() #1 {
  %f = call float @use_dispatch_ptr_ret_type()
  %fadd = fadd float %f, 1.0
  ret float %fadd
}
```
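Note that the callee's return type is `i32`, but the call site treats it as returning `float`. Because the function type at the call site does not match the callee's function type, `getCalledFunction()` returns nullptr, which would eventually lead to an incorrect analysis result. `getCalledOperand()` still returns the callee itself, so the `dyn_cast<Function>` succeeds.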
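
To make the difference concrete, here is a minimal standalone sketch of how I understand the two accessors to behave. It is a toy model, not the real LLVM classes: `ToyFunction`, `ToyCall`, and `onlyOperandFindsCallee` are hypothetical names introduced only for illustration, and function types are modeled as plain strings.
```cpp
#include <cassert>
#include <string>

// Toy stand-in for llvm::Function (hypothetical, for illustration only).
struct ToyFunction {
  std::string Name;
  std::string FnType; // the function's own type, e.g. "i32 ()"
};

// Toy stand-in for llvm::CallBase (hypothetical, for illustration only).
struct ToyCall {
  const ToyFunction *CalledOperand; // what the call instruction points at
  std::string FnType;               // the type used at the call site, e.g. "float ()"

  // Models getCalledOperand(): always yields the operand as-is.
  const ToyFunction *getCalledOperand() const { return CalledOperand; }

  // Models getCalledFunction(): yields the callee only when the call-site
  // type agrees with the callee's own type, otherwise nullptr.
  const ToyFunction *getCalledFunction() const {
    if (CalledOperand && CalledOperand->FnType == FnType)
      return CalledOperand;
    return nullptr;
  }
};

// True when the callee is only reachable through the operand, i.e. the case
// in the IR example above.
bool onlyOperandFindsCallee(const ToyCall &Call) {
  return Call.getCalledOperand() != nullptr &&
         Call.getCalledFunction() == nullptr;
}
```
With a callee of type `i32 ()` and a call site of type `float ()`, the modeled `getCalledFunction()` gives nullptr while the modeled `getCalledOperand()` still gives the callee, which is why the patch goes through the operand.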

https://github.com/llvm/llvm-project/pull/94647

