[libc-commits] [PATCH] D152486: [libc] Begin implementing a 'libmgpu.a' for math on the GPU
Joseph Huber via Phabricator via libc-commits
libc-commits at lists.llvm.org
Thu Jun 8 17:34:21 PDT 2023
jhuber6 added a comment.
In D152486#4407279 <https://reviews.llvm.org/D152486#4407279>, @arsenm wrote:
> Specifically for ocml functions, we're really close to not requiring the internalization. D149715 <https://reviews.llvm.org/D149715> is the main piece I need to remove the last subtarget features.
>
> The wavesize is still a bit problematic. Ideally we would have separate wave32 and wave64 builds, and not allow mixing wavesizes in a single module. With un-wavesized library IR, you can kind of get away with relying on the global subtarget (which I guess is the same problem you have with target-cpu).
Fortunately, I haven't seen the wavesize used in `ocml`. If we fix the requirement to perform attribute propagation, the second issue is that all the device functions in the ROCm device libraries have `protected` visibility, so LTO can't optimize them out without some hacks. Being able to link these in regularly would be nice, since the only way I could think of to perform the correct attribute propagation was to re-run clang with the merged LTO bitcode. Long term, however, it would be nice to have a `libm` on the GPU that lived upstream in this repository. We could potentially port the OpenCL in the ROCm device libraries, but I don't know how popular that would be internally at AMD.
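As a rough illustration of the visibility point, here is a minimal sketch in plain C++. It is not code from the ROCm device libraries: the function names and bodies are hypothetical stand-ins, and the comments describe the general expectation for each visibility rather than a guarantee about any particular linker.

```cpp
// Hypothetical stand-in for a device library function (name is made up).
// With protected visibility the symbol is still exported from the linked
// image, so the linker generally has to keep it even if nothing in the
// merged module calls it.
extern "C" __attribute__((visibility("protected")))
double example_protected_fn(double x) {
  return x * 2.0;
}

// Same body with hidden visibility (also a made-up name). The symbol is not
// exported, so LTO is free to internalize it and drop it when it is unused.
extern "C" __attribute__((visibility("hidden")))
double example_hidden_fn(double x) {
  return x * 2.0;
}
```

Both functions behave identically when called; the difference only shows up in which symbols the linker is allowed to discard.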
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152486/new/
https://reviews.llvm.org/D152486