[clang] 53e9698 - [NVPTX] Enable the _Float16 type for NVPTX compilation (#82436)

Tue Feb 20 16:12:31 PST 2024

Author: Joseph Huber
Date: 2024-02-20T18:12:27-06:00
New Revision: 53e96984b6dbb9d8ff55d2ccd0c27ffc1d27315f

URL: https://github.com/llvm/llvm-project/commit/53e96984b6dbb9d8ff55d2ccd0c27ffc1d27315f
DIFF: https://github.com/llvm/llvm-project/commit/53e96984b6dbb9d8ff55d2ccd0c27ffc1d27315f.diff

LOG: [NVPTX] Enable the _Float16 type for NVPTX compilation (#82436)

Summary:
The PTX target supports the f16 type natively and we already have a few
LLVM backend tests that support the LLVM-IR. We should be able to enable
this for generic use. This is done prior to the f16 math functions being
written in the GPU libc case.

Added: 
    

Modified: 
    clang/docs/LanguageExtensions.rst
    clang/lib/Basic/Targets/NVPTX.cpp
    clang/test/SemaCUDA/float16.cu

Removed: 
    


################################################################################
diff  --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index fb4d7a02dd086f..711baf45f449a0 100644

--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -833,6 +833,7 @@ to ``float``; see below for more information on this emulation.
   * 32-bit ARM (natively on some architecture versions)
   * 64-bit ARM (AArch64) (natively on ARMv8.2a and above)
   * AMDGPU (natively)
+  * NVPTX (natively)
   * SPIR (natively)
   * X86 (if SSE2 is available; natively if AVX512-FP16 is also available)
   * RISC-V (natively if Zfh or Zhinx is available)

diff  --git a/clang/lib/Basic/Targets/NVPTX.cpp b/clang/lib/Basic/Targets/NVPTX.cpp
index a8efae3a1ce388..b47c399fef6042 100644
--- a/clang/lib/Basic/Targets/NVPTX.cpp
+++ b/clang/lib/Basic/Targets/NVPTX.cpp
@@ -61,6 +61,10 @@ NVPTXTargetInfo::NVPTXTargetInfo(const llvm::Triple &Triple,
   NoAsmVariants = true;
   GPU = CudaArch::UNUSED;
 
+  // PTX supports f16 as a fundamental type.
+  HasLegalHalfType = true;
+  HasFloat16 = true;
+
   if (TargetPointerWidth == 32)
     resetDataLayout("e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64");
   else if (Opts.NVPTXUseShortPointers)

diff  --git a/clang/test/SemaCUDA/float16.cu b/clang/test/SemaCUDA/float16.cu
index a9cbe87f32c100..bb5ed606438491 100644
--- a/clang/test/SemaCUDA/float16.cu
+++ b/clang/test/SemaCUDA/float16.cu
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 -fsyntax-only -triple x86_64 -aux-triple amdgcn -verify %s
+// RUN: %clang_cc1 -fsyntax-only -triple x86_64 -aux-triple nvptx64 -verify %s
 // expected-no-diagnostics
 #include "Inputs/cuda.h"