[PATCH] D99347: [AMDGPU] Set implicit arg attributes for indirect calls
Madhur Amilkanthwar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 25 10:17:42 PDT 2021
madhur13490 added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAnnotateKernelFeatures.cpp:275
+
+ // If this function hasAddressTaken() = true
+ // then add all attributes corresponding to the implicit args.
----------------
JonChesterfield wrote:
> Is the reasoning here that an indirect call site to an unknown callee will have set up all the implicit state, so for consistency a function which is ever called indirectly must always have all the state set up?
>
> I would have hoped the granularity is such that functions can be annotated with only what they need, so that indirect call sites pass (and require) everything but direct call sites don't need to. With consistent register location for implicit values, I would hope that will work out of the box.
> Is the reasoning here that an indirect call site to an unknown callee will have set up all the implicit state, so for consistency a function which is ever called indirectly must always have all the state set up?
We are not setting up implicit attributes per call site. This pass works at function granularity and it does not take into account the call site. Implicit attributes are not a property of call site but the functions themselves. If function's hasAddressTaken() is true, we set all attributes corresponding to the implicits.
> I would have hoped the granularity is such that functions can be annotated with only what they need, so that indirect call sites pass (and require) everything but direct call sites don't need to. With consistent register location for implicit values, I would hope that will work out of the box.
That won't work out. You can't ask call site to pass all implicits when you setup only a subset of them. The arguments setup must be at superset of what is passed. We use attributes on the function (not call site) to setup implicits and thus pass this pass sets them up before we begin codegen so that codegen works out-of-the-box. So your comment - " that functions can be annotated with only what they need" is invalid.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAnnotateKernelFeatures.cpp:298
+ if (!CB->isInlineAsm()) {
+ HasIndirectCall = true;
HaveCall = true;
----------------
JonChesterfield wrote:
> As discussed, presence of inline asm does not need to be taken to imply presence of an indirect call within said asm. If someone wants to make an indirect branch from asm, it's on them to get it right. If someone wants to use asm without indirect branches, they won't want the extra overhead from conservatively setting up all the state.
The code is setting the flag when there is *noinlineasm()*. If you set the flag even when there is inlineAsm() then all exisiting apps which are using inlieAsm() would be requesting more implicits than current ones. This is what we discussed and aligned internally.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAnnotateKernelFeatures.cpp:382
+ // essential for correctness).
+ if (CallingConvSupportsAllImplicits && HasIndirectCall) {
+ for (StringRef AttrName : ImplicitAttrNames) {
----------------
JonChesterfield wrote:
> This doesn't make sense to me.
>
> An indirect call site implies all implicit args are needed, so the caller of that function will acquire those attributes. Then a direct call to said function will have those attributes, and so on.
>
> hasAddressTaken on the kernel itself makes no difference here - is the comment out of date?
I don't understand this comment.
> This doesn't make sense to me.
>
> An indirect call site implies all implicit args are needed, so the caller of that function will acquire those attributes. Then a direct call to said function will have those attributes, and so on.
which function?
>
> hasAddressTaken on the kernel itself makes no difference here - is the comment out of date?
hasAddressTaken() on kernels makes difference because hasAddressTaken() for kernels is naturally false. How would set implicits then? This snippet is explicitly for functions which makes an indirect call and not for the address taken functions (that is handled at the beginning).
The comment is not out of date. Look at the test cases - simple-indirect-call.ll and direct-indirect-call.ll.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99347/new/
https://reviews.llvm.org/D99347
More information about the llvm-commits
mailing list