[llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)
Jun Wang via llvm-commits
llvm-commits at lists.llvm.org
Sun Jun 16 17:18:40 PDT 2024
================
@@ -677,6 +693,33 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e., no-flat-scratch-init is
+ // not to be set.
+ bool needFlatScratchInit(Attributor &A) {
+ // This is called on each callee; false means callee shouldn't have
+ // no-flat-scratch-init.
+ auto CheckForNoFlatScratchInit = [&](Instruction &I) {
+ const auto &CB = cast<CallBase>(I);
+ const Value *CalleeOp = CB.getCalledOperand();
+ const Function *Callee = dyn_cast<Function>(CalleeOp);
----------------
jwanggit86 wrote:
There's a subtle difference between using `getCalledFunction()` and using `getCalledOperand()`. In the following example,
```
define i32 @use_dispatch_ptr_ret_type() #1 {
%dispatch.ptr = call ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
store volatile ptr addrspace(4) %dispatch.ptr, ptr addrspace(1) undef
ret i32 0
}
define float @func_indirect_use_dispatch_ptr_constexpr_cast_func() #1 {
%f = call float @use_dispatch_ptr_ret_type()
%fadd = fadd float %f, 1.0
ret float %fadd
}
```
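Note that the callee's return type is i32, but the caller calls it as if it returned float. Because the function type at the call site does not match the callee's, `getCalledFunction()` would return nullptr, and the callee would be missed, eventually leading to an incorrect analysis result. `getCalledOperand()` followed by `dyn_cast<Function>` still finds the callee.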
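To make the difference concrete, here is a minimal standalone sketch. It does not use the LLVM headers: `Value`, `Function`, `FunctionType`, and `CallBase` below are toy stand-ins, and `resolveViaOperand` is a hypothetical helper name. The one assumption carried over from LLVM is that `getCalledFunction()` only returns the callee when its function type matches the call site's type; here that check is reduced to comparing the return type as a string.

```cpp
#include <string>
#include <utility>

// Toy stand-ins for the LLVM IR classes; NOT the real LLVM API.
struct FunctionType {
  std::string ReturnType; // e.g. "i32" or "float"
};

struct Value {
  virtual ~Value() = default; // polymorphic so dynamic_cast works
};

struct Function : Value {
  std::string Name;
  FunctionType Type;
  Function(std::string N, FunctionType T)
      : Name(std::move(N)), Type(std::move(T)) {}
};

struct CallBase {
  const Value *CalledOperand; // what the call instruction points at
  FunctionType CallSiteType;  // the type the caller believes it is calling

  const Value *getCalledOperand() const { return CalledOperand; }

  // Returns the callee only if it is a Function AND its type matches
  // the call site's type; otherwise nullptr.
  const Function *getCalledFunction() const {
    const auto *F = dynamic_cast<const Function *>(CalledOperand);
    if (F && F->Type.ReturnType == CallSiteType.ReturnType)
      return F;
    return nullptr;
  }
};

// Hypothetical helper: the approach taken in the patch, i.e. look at the
// called operand and cast it, ignoring any type mismatch at the call site.
const Function *resolveViaOperand(const CallBase &CB) {
  return dynamic_cast<const Function *>(CB.getCalledOperand());
}
```

With a callee declared as returning `i32` and a call site typed as returning `float`, `getCalledFunction()` in this model yields nullptr while `resolveViaOperand` still returns the callee, which is the case the IR example above exercises.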
https://github.com/llvm/llvm-project/pull/94647