[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang
Joseph Huber via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Aug 30 08:06:20 PDT 2022
jhuber6 added inline comments.
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:9436
+ CGM.getModule(), Type, true,
+ llvm::GlobalValue::LinkageTypes::LinkOnceODRLinkage,
+ llvm::ConstantInt::get(Type, Value), Name, nullptr,
----------------
yaxunl wrote:
> jhuber6 wrote:
> > yaxunl wrote:
> > > yaxunl wrote:
> > > > jhuber6 wrote:
> > > > > yaxunl wrote:
> > > > > > This does not support per-TU control variables. Probably should use internal linkage.
> > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > >
> > > > clang uses -mlink-builtin-bitcode to link these device libraries for HIP and OpenCL. Then the linkage of these variables becomes internal linkage. That's why it works for per-TU control.
> > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > >
> > > > clang uses -mlink-builtin-bitcode to link these device libraries for HIP and OpenCL. Then the linkage of these variables becomes internal linkage. That's why it works for per-TU control.
> > >
> > > You may let HIP and OpenCL use internal linkage and C/C++/OpenMP use linkonce_odr since only HIP and OpenCL toolchain use -mlink-builtin-bitcode to link these device libraries
> > I see, `linkonce_odr` implies that these should all have the same value which isn't necessarily true after linking. I'll change it to use private linkage.
> >
> > OpenMP right now links everything late which means that we don't allow these to be defined differently per-TU. This may be incorrect given this new method as each TU will have different things set. I can change OpenMP to use the `mlink` method after this patch which may be more strictly correct.
> > I see, `linkonce_odr` implies that these should all have the same value which isn't necessarily true after linking. I'll change it to use private linkage.
> >
> > OpenMP right now links everything late which means that we don't allow these to be defined differently per-TU. This may be incorrect given this new method as each TU will have different things set. I can change OpenMP to use the `mlink` method after this patch which may be more strictly correct.
>
> On second thoughts, the idea for letting clang to emit these control variables might not work for HIP and OpenCL. The reason is that to support per-TU control variables, these variables need to be internal or private linkage, however, that means they cannot be used by other device library functions which are expecting non-internal linkage for them. Those device library functions will end up using control variables from device library bitcode any way.
>
> For OpenMP, it may be necessary to emit them as linkonce_odr, otherwise device library functions may not find them.
> On second thoughts, the idea for letting clang to emit these control variables might not work for HIP and OpenCL. The reason is that to support per-TU control variables, these variables need to be internal or private linkage, however, that means they cannot be used by other device library functions which are expecting non-internal linkage for them. Those device library functions will end up using control variables from device library bitcode any way.
Right now we include each file per-TU using `-mlink-builtin-bitcode` which converts `linkonce_odr` to `private` linkage. Shouldn't this be equivalent? It may be possible to make some test showing a user of these constants to verify they get picked up correctly. If you're worried about these getting removed we may be able to stash them in `compiler.used`, that shouldn't impede the necessary constant propagation.
Side note, OpenCL seems to optimize these out without `-disable-llvm-optzns` while HIP will not. Does OpenCL use some mandatory passes to ensure that these control variables get handled? This method of using control constants in general is somewhat problematic as it hides invalid code behind some mandatory CP and DCE passes. For OpenMP right now we just generate one version for each architecture, which is wasteful but somewhat easier to work with.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130096/new/
https://reviews.llvm.org/D130096
More information about the cfe-commits
mailing list