[PATCH] D84068: AMDGPU/clang: Search resource directory for device libraries

Mon Aug 10 14:15:03 PDT 2020

tra added a subscriber: echristo.
tra added a comment.

In D84068#2204713 <https://reviews.llvm.org/D84068#2204713>, @arsenm wrote:

>> If we ship them with clang, who/where/how builds them?
>> If they come from ROCm packages, how would those packages add stuff into *clang* install directory? Resource dir is a rather awkward location if contents may be expected to change routinely.
>
> Symlinks. I've been building the device libraries as part of LLVM_EXTERNAL_PROJECTS, and think this should be the preferred way to build and package the libraries. This is how compiler-rt is packaged on linux distributions. The compiler-rt binaries are a separate package symlinked into the resource directory locations. I'm not sure what you mean exactly by change routinely, the libraries should be an implementation detail invisible to users, not something they should be directly relying on. Only clang actually knows how to use them correctly and every other user is buggy
>
>> What if I have multiple ROCm versions installed? Which one should provide the bitcode in the resource dir?
>
> These should be treated as an integral part of clang, and not something to mix and match. Each rocm version should have its own copy of the device libraries. It only happens to work most of the time if you mismatch these, and this isn't a guaranteed property.

I'm still not sure how that's going to work. We have an `M clang versions`:`N ROCm versions` relationship here.
If I have one clang version, but want to do two different builds, one with ROCm-X and one with ROCm-Y, how would I do that? It sounds like I'll need to have multiple clang installation variants.

Similarly, if I have multiple clang versions installed, how would ROCm know which of those clang installations must be updated?

What if I install yet another clang version *after* ROCm has been installed? How will the ROCm package know that it needs to update yet another clang installation?

This will get rather unmanageable as soon as you get beyond the "I only have one clang and one ROCm version" scenario.

I think it would make much more sense for clang to treat ROCm's bits as an external dependency, similar to CUDA. By definition clang/llvm does not control anything outside of its own packages. While ROCm is AMD's package, I'm willing to bet that eventually various Linux distros will start shuffling its bits around, the same way it happened with CUDA.

>> As long as explicitly specified `--hip-device-lib-path` can still point to the right path, it's probably OK, but it all adds some confusion about who controls which parts of the HIP compilation and how it all is supposed to work in cases that deviate from the default assumptions.
>
> Long term I would rather get rid of --hip-device-lib-path, and only use the standard -resource_dir flags

Please, please, please keep the explicit path option. There are real use cases where you cannot expect ROCm to be installed anywhere 'standard'. Or at all.
Imagine people embedding libclang into their GUI/tools. There's no resource directory. There may be no ROCm installation, or installing one may not be possible due to lack of privileges.
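
To make the lookup order I'm arguing for concrete, here is a rough sketch. It is not what this patch or clang's driver implements: `find_device_lib_dir`, `BITCODE_SUBDIR`, and the directory layout are all made up for illustration. The only point is that an explicitly specified path wins and is never silently overridden by whatever happens to be in the resource directory.

```python
import os
from typing import Iterable, Optional

# Hypothetical layout, assumed for illustration only.
BITCODE_SUBDIR = os.path.join("amdgcn", "bitcode")


def find_device_lib_dir(explicit_path: Optional[str],
                        resource_dir: Optional[str],
                        rocm_roots: Iterable[str] = ()) -> Optional[str]:
    """Return the directory to load device bitcode from, or None."""
    # An explicit path is authoritative: if the user gave one, use it or
    # fail, but never fall back to another location behind their back.
    if explicit_path is not None:
        return explicit_path if os.path.isdir(explicit_path) else None

    # Otherwise try the resource directory first, then any ROCm roots.
    candidates = []
    if resource_dir is not None:
        candidates.append(os.path.join(resource_dir, "lib", BITCODE_SUBDIR))
    candidates.extend(os.path.join(root, BITCODE_SUBDIR) for root in rocm_roots)

    for candidate in candidates:
        if os.path.isdir(candidate):
            return candidate
    return None
```

With something along these lines, an embedded libclang with no resource directory at all would pass `resource_dir=None` and still work as long as the caller supplies the explicit path.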

In short, I think that tightly coupling clang's expectations to a non-clang project is not a good idea.
Summoning @echristo for a second opinion.

>> It would help if the requirements would be documented somewhere.
>
> Documentation would be good, but the problem I always have is deciding where "somewhere" is

clang/docs ?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84068/new/

https://reviews.llvm.org/D84068