[PATCH] D144911: adding bf16 support to NVPTX

Mon Jun 26 16:49:45 PDT 2023

tra updated this revision to Diff 534787.
tra added a comment.

Changed clang builtins signatures to use __bf16.

Fixed an ancient error in lowering f16/bf16 -- the requirement on 4-byte minimum
argument and return type size only applies to the non-ABI `.reg` arguments which
we do not actually use. While at that, lower f16/bf16 arguments as a byte array,
to match what NVCC and clang generate for CUDA's __half and __nv_bfloat16 types.

While technically `.b16` may be the right choice for `__bf16`, Most of the CUDA
code would rely on CUDA-provided __half which is an aggregate and is lowered as
a byte array. While there should be no functional difference beetween `.b16` and
`.b6 .align 2 array[2]`, keeping PTX function signatures identical to the code
which may be generated by NVCC, is useful to keep ABI between PTX generated from
CUDA sources and PTX generated from LLVM IR in sync.

More tests for bf16 uses on pre-sm_90 GPUs.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144911/new/

https://reviews.llvm.org/D144911

Files:
  clang/include/clang/Basic/BuiltinsNVPTX.def
  clang/test/CodeGen/builtins-nvptx.c
  clang/test/CodeGenCUDA/bf16.cu
  llvm/include/llvm/IR/IntrinsicsNVVM.td
  llvm/lib/IR/AutoUpgrade.cpp
  llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp
  llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
  llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.h
  llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
  llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
  llvm/lib/Target/NVPTX/NVPTXMCExpr.cpp
  llvm/lib/Target/NVPTX/NVPTXMCExpr.h
  llvm/lib/Target/NVPTX/NVPTXSubtarget.cpp
  llvm/lib/Target/NVPTX/NVPTXSubtarget.h
  llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
  llvm/test/CodeGen/NVPTX/bf16-instructions.ll
  llvm/test/CodeGen/NVPTX/convert-sm80.ll
  llvm/test/CodeGen/NVPTX/f16-instructions.ll
  llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-autoupgrade.ll
  llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70.ll
  llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72-autoupgrade.ll
  llvm/test/CodeGen/NVPTX/math-intrins-sm86-ptx72.ll
  llvm/test/CodeGen/NVPTX/param-load-store.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D144911.534787.patch
Type: text/x-patch
Size: 117662 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20230626/16d07efb/attachment-0001.bin>