[PATCH] D122734: [CUDA][HIP] Fix mangling number for local struct
Yaxun Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 22 20:26:30 PDT 2022
yaxunl marked 7 inline comments as done.
yaxunl added inline comments.
================
Comment at: clang/include/clang/AST/ASTContext.h:682
+ /// Current name mangling is for device name in host compilation.
+ bool MangleDeviceNameInHostCompilation = false;
+ } CUDANameMangleCtx;
----------------
rnk wrote:
> It doesn't feel right to store mutable state here in ASTContext. You are effectively using this as a global variable to thread a boolean through to `ASTContext::getManglingNumber`, which is called from the mangler. The boolean indicates of host or device mangling numbers are desired for this particular mangling. Is that more or less correct?
>
> This makes me think that it was perhaps a mistake to store mangling numbers in the AST Context in the first place. If we stored them in the mangling context instead, I think we'd get the right behavior out of the box. Is that correct? It it feasible to make that change?
> It doesn't feel right to store mutable state here in ASTContext. You are effectively using this as a global variable to thread a boolean through to `ASTContext::getManglingNumber`, which is called from the mangler. The boolean indicates of host or device mangling numbers are desired for this particular mangling. Is that more or less correct?
>
> This makes me think that it was perhaps a mistake to store mangling numbers in the AST Context in the first place. If we stored them in the mangling context instead, I think we'd get the right behavior out of the box. Is that correct? It it feasible to make that change?
It is possible but I feel it is too invasive. Mangling numbers need to be calculated during construction of AST since it is context dependent. Mangle contexts are created in ABI of codegen. To store mangling numbers in mangling contexts, we need to move mangling contexts from ABI to ASTContext. We also need to add device mangling context in ASTContext. Then for each use of get/set mangling number API, we need to add get/set device mangling number API. That would need lots of changes.
I would see the state added to ASTContext by my approach from the perspective of single source language. The program contains both device code and host code. Ideally the compiler should be able to compile the device code with device target and ABI and compile the host code with host target ABI and switch the target freely during codegen. When we try to get the device side mangled name in host compilation, what we trying to do is switch to device target momoentorily during the mangling. This needs a state in ASTContext indicating that we are doing mangling for device instead of host, therefore getManglingNumber should return device mangling number instead of host mangling number.
================
Comment at: clang/include/clang/AST/ASTContext.h:684
+ } CUDANameMangleCtx;
+ struct CUDANameMangleContextRAII {
+ ASTContext &Ctx;
----------------
rnk wrote:
> By the way, this looks like `SaveAndRestore<bool>`. Maybe you can use that in the future.
Yes. SaveAndRestore serves the purpose. will use it.
================
Comment at: clang/lib/AST/ASTContext.cpp:11760
+
+ auto Cutoff = [](unsigned V) { return V > 1 ? V : 1; };
+ if (CUDANameMangleCtx.MangleDeviceNameInHostCompilation)
----------------
tra wrote:
> rnk wrote:
> > rnk wrote:
> > > tra wrote:
> > > > I think we should change the `MangleNumbers` instead to use uin64_t or, perhaps `unsigned[2]` or even `struct{ unsigned host; unsigned device;}`.
> > > > Reducing the max number of anything to just 64K will be prone to failing sooner or later.
> > > IMO it would be simpler to set up a ternary instead of a lambda and two returns:
> > > Res = Cond ? (Res >> 16) : (Res & 0xFFFF);
> > > return Res > 1 ? Res : 1;
> > I suspect we could live with a 64K mangling number limit, but that is low enough that we need to put in a check for overflow somewhere.
> If we'd produce a reasonable diagnostic for that, it might work.
>
>
will do
================
Comment at: clang/lib/AST/ASTContext.cpp:11760-11763
+ auto Cutoff = [](unsigned V) { return V > 1 ? V : 1; };
+ if (CUDANameMangleCtx.MangleDeviceNameInHostCompilation)
+ return Cutoff(Res >> 16);
+ return Cutoff(Res & 0xffff);
----------------
yaxunl wrote:
> tra wrote:
> > rnk wrote:
> > > rnk wrote:
> > > > tra wrote:
> > > > > I think we should change the `MangleNumbers` instead to use uin64_t or, perhaps `unsigned[2]` or even `struct{ unsigned host; unsigned device;}`.
> > > > > Reducing the max number of anything to just 64K will be prone to failing sooner or later.
> > > > IMO it would be simpler to set up a ternary instead of a lambda and two returns:
> > > > Res = Cond ? (Res >> 16) : (Res & 0xFFFF);
> > > > return Res > 1 ? Res : 1;
> > > I suspect we could live with a 64K mangling number limit, but that is low enough that we need to put in a check for overflow somewhere.
> > If we'd produce a reasonable diagnostic for that, it might work.
> >
> >
> will do
mangling number is member of some AST nodes. Increasing its size may increase overall AST memory usage and PCH size. I will add diagnostic for mangling number exceeding limit. If it happens in real applications I will revisit this.
================
Comment at: clang/test/CodeGenCUDA/struct-mangling-number.cu:25
+// CHECK: define amdgpu_kernel void @[[KERN:_Z6kernelIZN4TestIiE3runEvE2OpEvv]](
+// CHECK: @{{.*}} = {{.*}}c"[[KERN]]\00"
+
----------------
tra wrote:
> Using HOST/DEV prefixes would work better, IMO to tell what's expected to happen where.
will do
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D122734/new/
https://reviews.llvm.org/D122734
More information about the cfe-commits
mailing list