[all-commits] [llvm/llvm-project] e7ad07: [libclc] Move fma to the CLC library (#126052)
Fraser Cormack via All-commits
all-commits at lists.llvm.org
Mon Feb 24 02:11:13 PST 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: e7ad07ffb846a9812d9567b8d4b680045dce5b28
https://github.com/llvm/llvm-project/commit/e7ad07ffb846a9812d9567b8d4b680045dce5b28
Author: Fraser Cormack <fraser at codeplay.com>
Date: 2025-02-24 (Mon, 24 Feb 2025)
Changed paths:
M libclc/CMakeLists.txt
A libclc/clc/include/clc/internal/math/clc_sw_fma.h
A libclc/clc/include/clc/math/clc_fma.h
M libclc/clc/include/clc/math/math.h
A libclc/clc/lib/clspv/SOURCES
A libclc/clc/lib/clspv/math/clc_sw_fma.cl
M libclc/clc/lib/generic/SOURCES
A libclc/clc/lib/generic/math/clc_fma.cl
A libclc/clc/lib/generic/math/clc_fma.inc
A libclc/clc/lib/generic/math/clc_sw_fma.cl
A libclc/clc/lib/spirv/SOURCES
A libclc/clc/lib/spirv/math/clc_runtime_has_hw_fma32.cl
M libclc/clspv/lib/math/fma.cl
R libclc/generic/include/math/clc_fma.h
M libclc/generic/lib/SOURCES
M libclc/generic/lib/math/clc_exp10.cl
R libclc/generic/lib/math/clc_fma.cl
M libclc/generic/lib/math/clc_fmod.cl
M libclc/generic/lib/math/clc_hypot.cl
M libclc/generic/lib/math/clc_pow.cl
M libclc/generic/lib/math/clc_pown.cl
M libclc/generic/lib/math/clc_powr.cl
M libclc/generic/lib/math/clc_remainder.cl
M libclc/generic/lib/math/clc_remquo.cl
M libclc/generic/lib/math/clc_rootn.cl
M libclc/generic/lib/math/fma.cl
R libclc/generic/lib/math/fma.inc
M libclc/generic/lib/math/sincos_helpers.cl
M libclc/spirv/lib/SOURCES
M libclc/spirv/lib/math/fma.cl
R libclc/spirv/lib/math/fma.inc
Log Message:
-----------
[libclc] Move fma to the CLC library (#126052)
This builtin is a little more involved than others, as targets handle
fma in different ways.
Fundamentally, the CLC __clc_fma builtin compiles to
__builtin_elementwise_fma, which compiles to the @llvm.fma intrinsic.
However, in the case of fp32 fma some targets call the __clc_sw_fma
function, which provides a software implementation of the builtin. This
in principle is controlled by the __CLC_HAVE_HW_FMA32 macro and may be a
runtime decision, depending on how the target defines that macro.
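The dispatch described above can be sketched in plain C. This is only an illustration under stated assumptions, not the libclc source: every sketch_* name is hypothetical, the boolean stands in for __CLC_HAVE_HW_FMA32, and both arithmetic bodies are placeholders. In libclc the fast path is __builtin_elementwise_fma and the slow path is the bit-level __clc_sw_fma routine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical runtime toggle, playing the role of __CLC_HAVE_HW_FMA32. */
static bool sketch_have_hw_fma32 = true;

/* Counts software-path calls so the chosen path is observable. */
static int sketch_sw_calls = 0;

/* Placeholder for the fused hardware path. Widening to double keeps the
   product exact, but the sum is rounded in double and again when narrowed
   to float, so this only approximates a true fused operation. */
static float sketch_hw_fma(float a, float b, float c) {
    return (float)((double)a * (double)b + (double)c);
}

/* Placeholder for the software fallback. It reuses the arithmetic above
   and records that it was called; it is not the real algorithm. */
static float sketch_sw_fma(float a, float b, float c) {
    ++sketch_sw_calls;
    return (float)((double)a * (double)b + (double)c);
}

/* Hypothetical analogue of __clc_fma for fp32: check the toggle, then
   pick the fast or the slow path. */
static float sketch_clc_fma(float a, float b, float c) {
    return sketch_have_hw_fma32 ? sketch_hw_fma(a, b, c)
                                : sketch_sw_fma(a, b, c);
}

/* Usage: the result is the same either way; only the path differs. */
void sketch_dispatch_usage(void) {
    sketch_have_hw_fma32 = true;
    assert(sketch_clc_fma(2.0f, 3.0f, 1.0f) == 7.0f);
    assert(sketch_sw_calls == 0);

    sketch_have_hw_fma32 = false;
    assert(sketch_clc_fma(2.0f, 3.0f, 1.0f) == 7.0f);
    assert(sketch_sw_calls == 1);
}
```

The call counter exists only to make the selected path visible in the sketch.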
All targets build the CLC fma functions for all types. This is so that
the CLC library can have a reliable internal implementation for its own
purposes.
For AMD/NVPTX targets there are no meaningful changes to the generated
LLVM bitcode. Some blocks of code have moved around, which confounds
llvm-diff.
For the clspv and SPIR-V/Mesa targets, only fp32 fma is of interest. Its
use in libclc is tightly controlled by checking __CLC_HAVE_HW_FMA32
first. This can either be a compile-time constant (1, for clspv) or a
runtime function for SPIR-V/Mesa.
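As a rough illustration of how one macro can be either a compile-time constant or a runtime query, the following C sketch shows both definitions side by side. All names are hypothetical: SKETCH_TARGET_CLSPV is an assumed selector for a clspv-like build, and sketch_runtime_has_hw_fma32 merely stands in for whatever function a SPIR-V/Mesa-style target would supply.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef SKETCH_TARGET_CLSPV
/* Compile-time constant: the compiler can fold the check away. */
#define SKETCH_HAVE_HW_FMA32() (1)
#else
/* Runtime query: the answer is only known when the kernel runs.
   The flag below is a hypothetical stand-in for a device property. */
static bool sketch_device_has_fma = false;
static bool sketch_runtime_has_hw_fma32(void) { return sketch_device_has_fma; }
#define SKETCH_HAVE_HW_FMA32() sketch_runtime_has_hw_fma32()
#endif

enum sketch_path { SKETCH_PATH_SW, SKETCH_PATH_HW };

/* Callers test the macro the same way in both configurations. */
static enum sketch_path sketch_select_fma_path(void) {
    return SKETCH_HAVE_HW_FMA32() ? SKETCH_PATH_HW : SKETCH_PATH_SW;
}

/* Usage: works whether or not SKETCH_TARGET_CLSPV is defined. */
void sketch_macro_usage(void) {
#ifdef SKETCH_TARGET_CLSPV
    assert(sketch_select_fma_path() == SKETCH_PATH_HW);
#else
    sketch_device_has_fma = false;
    assert(sketch_select_fma_path() == SKETCH_PATH_SW);
    sketch_device_has_fma = true;
    assert(sketch_select_fma_path() == SKETCH_PATH_HW);
#endif
}
```

Because the call site is identical in both cases, the choice between a constant and a runtime check stays a per-target detail.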
The SPIR-V/Mesa target only provided fp32 fma in the OpenCL layer. It
unconditionally mapped that to the __clc_sw_fma builtin, even though the
generic version in theory had a runtime toggle through
__CLC_HAVE_HW_FMA32 specifically for that target. Callers of fma,
though, would end up using the ExtInst fma, *not* calling the _Z3fmafff
function provided by libclc.
This commit keeps this system in place in the OpenCL layer, by mapping
fma to __clc_sw_fma. Where other builtins would previously call fma
(i.e., result in the ExtInst), they now call __clc_fma. This function
checks the __CLC_HAVE_HW_FMA32 runtime toggle, which selects between the
slow version and the quick version. The quick version is the LLVM fma
intrinsic, which llvm-spirv translates to the ExtInst.
The clspv target had its own software implementation of fp32 fma, which
it called unconditionally - even though __CLC_HAVE_HW_FMA32 is 1 for
that target. This is potentially just so its library ships a software
version which it can fall back on. In the OpenCL layer, the target
doesn't provide fp64 fma, and maps fp16 fma to fp32 mad.
This commit keeps this system roughly in place: in the OpenCL layer it
maps fp32 fma to __clc_sw_fma, and fp16 fma to mad. Where builtins would
previously call into fma, they now call __clc_fma, which compiles to the
LLVM intrinsic. If this goes through a translation to SPIR-V it will
become the fma ExtInst, or the intrinsic could be replaced by the
_Z3fmafff software implementation.
The clspv and SPIR-V/Mesa targets could potentially be cleaned up later,
depending on their needs.