[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 1 10:46:01 PDT 2022


jhuber6 added inline comments.


================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:9436
+        CGM.getModule(), Type, true,
+        llvm::GlobalValue::LinkageTypes::LinkOnceODRLinkage,
+        llvm::ConstantInt::get(Type, Value), Name, nullptr,
----------------
yaxunl wrote:
> jhuber6 wrote:
> > yaxunl wrote:
> > > jhuber6 wrote:
> > > > yaxunl wrote:
> > > > > yaxunl wrote:
> > > > > > jhuber6 wrote:
> > > > > > > yaxunl wrote:
> > > > > > > > This does not support per-TU control variables. Probably should use internal linkage.
> > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > > > > 
> > > > > > clang uses  -mlink-builtin-bitcode to link these device libraries for HIP and OpenCL. Then the linkage of these variables becomes internal linkage. That's why it works for per-TU control.
> > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was the most appropriate here. It should mean that we can have multiple identical definitions and they don't clash. There's also no requirement for these to be emitted as symbols AFAIK.
> > > > > > 
> > > > > > clang uses  -mlink-builtin-bitcode to link these device libraries for HIP and OpenCL. Then the linkage of these variables becomes internal linkage. That's why it works for per-TU control.
> > > > > 
> > > > > You may let HIP and OpenCL use internal linkage and C/C++/OpenMP use linkonce_odr since only HIP and OpenCL toolchain use -mlink-builtin-bitcode to link these device libraries 
> > > > I see, `linkonce_odr` implies that these should all have the same value which isn't necessarily true after linking. I'll change it to use private linkage.
> > > > 
> > > > OpenMP right now links everything late which means that we don't allow these to be defined differently per-TU. This may be incorrect given this new method as each TU will have different things set. I can change OpenMP to use the `mlink` method after this patch which may be more strictly correct.
> > > > I see, `linkonce_odr` implies that these should all have the same value which isn't necessarily true after linking. I'll change it to use private linkage.
> > > > 
> > > > OpenMP right now links everything late which means that we don't allow these to be defined differently per-TU. This may be incorrect given this new method as each TU will have different things set. I can change OpenMP to use the `mlink` method after this patch which may be more strictly correct.
> > > 
> > > On second thoughts, the idea for letting clang to emit these control variables might not work for HIP and OpenCL. The reason is that to support per-TU control variables, these variables need to be internal or private linkage, however, that means they cannot be used by other device library functions which are expecting non-internal linkage for them. Those device library functions will end up using control variables from device library bitcode any way.
> > > 
> > > For OpenMP, it may be necessary to emit them as linkonce_odr, otherwise device library functions may not find them.
> > > On second thoughts, the idea for letting clang to emit these control variables might not work for HIP and OpenCL. The reason is that to support per-TU control variables, these variables need to be internal or private linkage, however, that means they cannot be used by other device library functions which are expecting non-internal linkage for them. Those device library functions will end up using control variables from device library bitcode any way.
> > 
> > Right now we include each file per-TU using `-mlink-builtin-bitcode` which converts `linkonce_odr` to `private` linkage. Shouldn't this be equivalent? It may be possible to make some test showing a user of these constants to verify they get picked up correctly. If you're worried about these getting removed we may be able to stash them in `compiler.used`, that shouldn't impede the necessary constant propagation.
> > 
> > Side note, OpenCL seems to optimize these out without `-disable-llvm-optzns` while HIP will not. Does OpenCL use some mandatory passes to ensure that these control variables get handled? This method of using control constants in general is somewhat problematic as it hides invalid code behind some mandatory CP and DCE passes. For OpenMP right now we just generate one version for each architecture, which is wasteful but somewhat easier to work with.
> >  
> > > On second thoughts, the idea for letting clang to emit these control variables might not work for HIP and OpenCL. The reason is that to support per-TU control variables, these variables need to be internal or private linkage, however, that means they cannot be used by other device library functions which are expecting non-internal linkage for them. Those device library functions will end up using control variables from device library bitcode any way.
> > 
> > Right now we include each file per-TU using `-mlink-builtin-bitcode` which converts `linkonce_odr` to `private` linkage. Shouldn't this be equivalent? It may be possible to make some test showing a user of these constants to verify they get picked up correctly. If you're worried about these getting removed we may be able to stash them in `compiler.used`, that shouldn't impede the necessary constant propagation.
> > 
> 
> Let's assume the main program calls `foo()` and `foo()` uses a control variable `bar`. `foo()` is in a bitcode linked in by -mlink-builtin-bitcode. clang emits the control variable `bar` with private linkage in the main module. When clang tries to link `foo()`, it needs to resolve `bar`, but it cannot use the `bar` in the main module because `bar` has private linkage. Then `bar` becomes unresolved.
> 
> > Side note, OpenCL seems to optimize these out without `-disable-llvm-optzns` while HIP will not. Does OpenCL use some mandatory passes to ensure that these control variables get handled? This method of using control constants in general is somewhat problematic as it hides invalid code behind some mandatory CP and DCE passes. For OpenMP right now we just generate one version for each architecture, which is wasteful but somewhat easier to work with.
> >  
> 
> Are you using clang -cc1 without other options? There are LLVM passes by default, but they should not depend on language. You can see which pass is removing them by -mllvm -print-after-all.
> 
>Let's assume the main program calls foo() and foo() uses a control variable bar. foo() is in a bitcode linked in by -mlink-builtin-bitcode. clang emits the control variable bar with private linkage in the main module. When clang tries to link foo(), it needs to resolve bar, but it cannot use the bar in the main module because bar has private linkage. Then bar becomes unresolved.

This is a good point, we link only used definitions when using `-mlink-builtin-bitcode`. I think we link via `-mlink-builtin-bitcode` prior to running the backend, so this will be after we create these definitions. In this case we will import a definition like the following:
```
@__oclc_wavefrontsize64 = external local_unnamed_addr addrspace(4) constant i8, align 1
```
which has the same name, but cannot bind to the `private` variable. I think this is what `linkonce` linkage is supposed to provide, but I'm not overly familiar with the semantics.

> Are you using clang -cc1 without other options? There are LLVM passes by default, but they should not depend on language. You can see which pass is removing them by -mllvm -print-after-all.

The sample tests in this patch show the `-x hip` version does not require `-disable-llvm-optzns` while the `-x cl` version does.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130096/new/

https://reviews.llvm.org/D130096



More information about the cfe-commits mailing list