[PATCH] D98143: [HIP] Diagnose aggregate args containing half types

Thu Sep 30 15:22:56 PDT 2021

tra added a comment.

In D98143#3034672 <https://reviews.llvm.org/D98143#3034672>, @yaxunl wrote:

> On gcc11 and below, since gcc does not support fp16, it is common practice to use short to pass fp16 in struct. Then gcc and clang has different ABI: https://godbolt.org/z/zqhT7x7qo
>
> Basically if one compiles a function with struct type arg containing _Float16 with clang, then compiles a caller with struct type arg containing short with gcc, it will not work.

If both compilers use the same type they both agree on, then there's no problem. If they pass data using different types but identical names, that's an ODR violation and it's not specific to fp16.

Given that fp16 ABI is not stable, the dignostics provided by the patch may be a useful safeguard to have.

That said, I'm not quite sure how one would use it in practice. Is that purely to check that fp16 is not passed around anywhere in the host code? 
If we compile everything with the check enabled that would potentially produce a lot of irrelevant/false-positive results. 
E.g. there may be legitimate host-only code passing fp16 parameters and used only from clang-compiled files.
Perhaps the check should be limited to externally visible functions only, and, maybe, make it a warning? 
We can not conclusively tell whether use of fp16 by the callee will be a problem if we don't know anything about the caller.

Are there specific cases where this particular issue popped up?
I think I did run into an issue with some ROCm headers that were using different fp16 types depending on how a file was compiled, even though in both cases compiler was clang. It's been a while ago, so I do not recall the exact details.
It might've been this code: https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/c560f436cea24a7c5b775042464bc4c2989744ca/library/include/internal/rocblas-types.h#L73

================
Comment at: clang/include/clang/Basic/LangOptions.def:256-258
+LANGOPT(HIPAllowHalfArg, 1, 1, "Allow half precision types or aggregate types "
+                               "containing half precision types as host "
+                               "function parameter and return types for HIP")
----------------
Nit: Maybe simplify it to `Allow using half precision types in host function parameter and return types.`

Also, this issue would technically affect CUDA, too, so GPUAllowHalfArg might work better.

================
Comment at: clang/lib/Sema/SemaDecl.cpp:8916-8917
+static void checkHIPFunctionParameters(Sema &S, FunctionDecl *FD) {
+  if (S.getLangOpts().HIPAllowHalfArg || FD->hasAttr<CUDADeviceAttr>() ||
+      FD->hasAttr<CUDAGlobalAttr>())
+    return;
----------------
I'd split it into separate early exits for the flag and for attribute checks, with an added comment that we only check host functions.

That brings the question of how we should handle HD functions. Right now they would not checked, which makes it possible to run into the issue you're trying to solve if that function is used on the host.

================
Comment at: clang/test/SemaCUDA/half-arg.cu:9-10
+
+// Check _Float16/__fp16 or structs containing them are not allowed as function
+// parameter in HIP host functions.
+
----------------
What happens if I have a lambda that captures an fp16 value? 

Granted, it will be inadvisable to pass across compilers, but, nevertheless it's another kind of an aggregate we may end up passing to a function.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98143/new/

https://reviews.llvm.org/D98143