[libc-commits] [PATCH] D152486: [libc] Begin implementing a 'libmgpu.a' for math on the GPU

Thu Jun 8 17:34:21 PDT 2023

jhuber6 added a comment.

In D152486#4407279 <https://reviews.llvm.org/D152486#4407279>, @arsenm wrote:

> Specifically for ocml functions, we're really close to not requiring the internalization. D149715 <https://reviews.llvm.org/D149715> is the main piece I need to remove the last subtarget features.
>
> The wavesize is still a bit problematic. Ideally we would have separate wave32 and wave64 builds, and not allow mixing wavesizes in a single module. With un-wavesized library IR, you can kind of get away with relying on the global subtarget (which I guess is the same problem you have with target-cpu).

Fortunately, I haven't seen the wavesize used in `ocml`. Once we remove the requirement to perform attribute propagation, the second issue is that all the device functions in the ROCm device libraries have `protected` visibility, so LTO can't optimize them out without some hacks. Being able to link these in normally would be nice, since the only way I could think of to perform the correct attribute propagation was to re-run clang on the merged LTO bitcode. Long term, however, it would be nice to have a `libm` on the GPU that lives upstream in this repository. We could potentially port the OpenCL code in the ROCm device libraries, but I don't know how popular that would be internally at AMD.
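
As a rough illustration of the visibility point, here is a minimal sketch in plain C. It is not code from the ROCm device libraries; the function names are hypothetical, and it assumes an ELF target and a compiler that accepts the GNU-style `visibility` attribute (clang and GCC do). The assumption is that the final link produces a shared-object-style module, where `protected` symbols remain exported.

```c
/* Hypothetical stand-ins; not actual ROCm device library functions. */

/* Protected: the symbol is still exported from the linked ELF module,
   so the linker has to assume it may be used from outside and keep it,
   even if nothing in the module calls it. */
__attribute__((visibility("protected")))
double scale_protected(double x) { return 2.0 * x; }

/* Hidden: the symbol is not exported, so LTO is free to internalize it
   and drop it when nothing in the module calls it. */
__attribute__((visibility("hidden")))
double scale_hidden(double x) { return 2.0 * x; }
```

Both functions behave identically when called; the difference only matters to the linker, which is why an unused math routine with `protected` visibility tends to survive LTO while a `hidden` one can be removed.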

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152486/new/

https://reviews.llvm.org/D152486