[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

Joseph Huber via Phabricator via cfe-commits <cfe-commits at lists.llvm.org>
Tue Aug 29 12:37:47 PDT 2023


jhuber6 added a comment.

In D158778#4625892 <https://reviews.llvm.org/D158778#4625892>, @tra wrote:

> In D158778#4624408 <https://reviews.llvm.org/D158778#4624408>, @ABataev wrote:
>
>> Just checks removal should be fine
>
> Looks like OpenMP handles long double and __float128 differently -- it always insists on using the host's FP format for both.
> https://github.com/llvm/llvm-project/blob/d037445f3a2c6dc1842b5bfc1d5d81988c2f223d/clang/lib/AST/ASTContext.cpp#L1674
>
> This creates a divergence between what clang thinks and what LLVM can handle.
> I'm not quite sure how it's supposed to work with NVPTX or AMDGPU, where we demote those types to double and can't generate code for the actual types.
>
> @jhuber6 what does OpenMP expect to happen for those types on the GPU side?

That's a good question; I'm not entirely sure what the expectation would be. We obviously need to keep things coherent across D2H and H2D memcpys, so the host and device types need to be the same size. I'm pretty sure our handling of this is just wrong right now. A simple example (https://godbolt.org/z/Y3E58PKMz) shows that for NVPTX we error out (as I would expect), but for AMDGPU we emit code using the x86 80-bit long double format. My guess is that we should diagnose this more explicitly, considering that both vendors explicitly state that quad precision is not available on the GPU, unless we want to implement some software floats.
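
For anyone reading along without following the link, a sketch of a comparable reproducer (the exact snippet and flags behind the godbolt link are not reproduced here, so treat both as assumptions):

  // Compile flags are a guess at the godbolt setup:
  //   clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda repro.c
  //   clang -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa repro.c
  long double mul(long double x, long double y) {
    long double r;
  #pragma omp target map(to : x, y) map(from : r)
    r = x * y; // NVPTX: backend error; AMDGPU: lowered with the host's
               // x86 80-bit format (on an x86-64 host), which the GPU
               // cannot actually execute
    return r;
  }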


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158778/new/

https://reviews.llvm.org/D158778


