[all-commits] [llvm/llvm-project] bc37be: LangRef: Add "dynamic" option to "denormal-fp-math"

Sat Apr 29 06:30:19 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bc37be1855773c1dcf8c6bf577a096a81fd58652
      https://github.com/llvm/llvm-project/commit/bc37be1855773c1dcf8c6bf577a096a81fd58652
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2023-04-29 (Sat, 29 Apr 2023)

  Changed paths:
    M clang/lib/CodeGen/CGCall.cpp
    M clang/lib/CodeGen/CodeGenAction.cpp
    M clang/lib/CodeGen/CodeGenModule.h
    M clang/test/CodeGen/denormalfpmode-f32.c
    M clang/test/CodeGen/denormalfpmode.c
    A clang/test/CodeGenCUDA/Inputs/ocml-sample.cl
    A clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu
    M clang/test/Driver/denormal-fp-math.c
    M llvm/docs/LangRef.rst
    M llvm/include/llvm/ADT/FloatingPointMode.h
    M llvm/include/llvm/Analysis/ConstantFolding.h
    M llvm/include/llvm/IR/Attributes.td
    M llvm/include/llvm/IR/Function.h
    M llvm/lib/Analysis/ConstantFolding.cpp
    M llvm/lib/CodeGen/CommandFlags.cpp
    M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
    M llvm/lib/IR/Attributes.cpp
    M llvm/lib/IR/Function.cpp
    M llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.h
    A llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll
    M llvm/test/CodeGen/X86/sqrt-fastmath.ll
    M llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll
    M llvm/test/Transforms/InstSimplify/canonicalize.ll
    M llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll
    M llvm/unittests/ADT/FloatingPointMode.cpp
    M llvm/utils/TableGen/Attributes.cpp

  Log Message:
  -----------
  LangRef: Add "dynamic" option to "denormal-fp-math"

This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone; I can change it in a
future patch.

There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flush-to-zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp against
zero.

Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.

Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
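
The two equivalences above can be sketched in plain C++. This is a
model, not LLVM code: Mode, MaskZero, MaskSubnormal, isFPClass and
fcmpEqZero are hypothetical names, the compare is assumed to be an
ordered equality against 0.0, and "preserve-sign" is assumed to flush
denormal inputs to a same-signed zero before comparing. It also assumes
the host itself evaluates float compares with denormals intact.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-ins for two "denormal-fp-math" modes; these are not
// LLVM's real types.
enum class Mode { IEEE, PreserveSign };

// Illustrative class-mask bits (the values are not LLVM's fc* encoding).
constexpr unsigned MaskZero = 1u << 0;
constexpr unsigned MaskSubnormal = 1u << 1;

// Models llvm.is.fpclass: classifies the value as-is, independent of mode.
bool isFPClass(float X, unsigned Mask) {
  int C = std::fpclassify(X);
  return ((Mask & MaskZero) && C == FP_ZERO) ||
         ((Mask & MaskSubnormal) && C == FP_SUBNORMAL);
}

// Models "fcmp oeq x, 0.0", assuming preserve-sign flushes denormal
// inputs to a zero of the same sign before comparing.
bool fcmpEqZero(float X, Mode M) {
  if (M == Mode::PreserveSign && std::fpclassify(X) == FP_SUBNORMAL)
    X = std::copysign(0.0f, X);
  return X == 0.0f;
}

// Asserts both equivalences over a few representative inputs.
void checkEquivalences() {
  const float Samples[] = {0.0f, -0.0f, 1.0f,
                           std::numeric_limits<float>::denorm_min(),
                           -std::numeric_limits<float>::denorm_min(),
                           std::numeric_limits<float>::infinity(),
                           std::numeric_limits<float>::quiet_NaN()};
  for (float X : Samples) {
    // "ieee": the compare matches the fcZero class test.
    assert(fcmpEqZero(X, Mode::IEEE) == isFPClass(X, MaskZero));
    // "preserve-sign": the compare matches fcSubnormal|fcZero.
    assert(fcmpEqZero(X, Mode::PreserveSign) ==
           isFPClass(X, MaskZero | MaskSubnormal));
  }
}
```

Neither rewrite is valid under the other mode, which is why a function
whose mode is not known at compile time cannot use either one.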

I would like to compile and distribute some math library functions in
a mode where they are callable from code both with and without
denormals enabled, which requires not changing the compares with
denormals or zeroes.

If an IEEE function transforms an llvm.is.fpclass call into an fcmp
against 0, it is no longer possible to call the function from code that
flushes denormals, or to write an optimization that moves the function
into a denormal-flushing mode. In the original function, if x was a
denormal, the class test would evaluate to false. If the function
compiled with IEEE denormal handling is converted to, or called from, a
preserve-sign function, the fcmp now evaluates to true.
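
A minimal C++ sketch of this hazard, under stated assumptions: the names
originalIsZero, foldedIsZero and foldChangedResult are hypothetical, the
class test is fcZero, and a preserve-sign compare is assumed to flush
denormal inputs to a same-signed zero. It models the behavior described
above rather than anything LLVM actually emits.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of the mode the code ends up running in.
enum class Mode { IEEE, PreserveSign };

// What the library function originally computed: is.fpclass(x, fcZero).
// A class test inspects the value itself, so no mode is involved.
bool originalIsZero(float X) { return std::fpclassify(X) == FP_ZERO; }

// The same function after an IEEE-only fold to "fcmp oeq x, 0.0". The
// compare is assumed to flush denormal inputs in preserve-sign mode.
bool foldedIsZero(float X, Mode RunMode) {
  if (RunMode == Mode::PreserveSign && std::fpclassify(X) == FP_SUBNORMAL)
    X = std::copysign(0.0f, X);
  return X == 0.0f;
}

// True when the fold changed the observable result for this input and mode.
bool foldChangedResult(float X, Mode RunMode) {
  return originalIsZero(X) != foldedIsZero(X, RunMode);
}

// The fold is invisible under IEEE but observable once the folded
// function runs in a preserve-sign context.
void demonstrateHazard() {
  const float D = std::numeric_limits<float>::denorm_min();
  assert(!foldChangedResult(D, Mode::IEEE));
  assert(foldChangedResult(D, Mode::PreserveSign));
}
```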

This could also be of use for strictfp handling, where code may be
changing the denormal mode.

An alternative name could be "unknown".

Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
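
One way to read this rule, as a rough C++ sketch and not the code in this
patch: treat a mode as an (output, input) pair and allow inlining when
each callee component either matches the caller's or is dynamic. Kind,
SketchMode, componentCompatible and canInline are hypothetical names,
and the pair layout is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical model; LLVM's real types live in
// llvm/ADT/FloatingPointMode.h and differ in detail.
enum class Kind { IEEE, PreserveSign, Dynamic };

// Assumed layout: separate handling for denormal outputs and inputs.
struct SketchMode {
  Kind Output;
  Kind Input;
};

// A callee component is compatible if it matches the caller's component
// or makes no assumption at all (dynamic).
bool componentCompatible(Kind Caller, Kind Callee) {
  return Callee == Caller || Callee == Kind::Dynamic;
}

// Conservative check: every component must be compatible.
bool canInline(SketchMode Caller, SketchMode Callee) {
  return componentCompatible(Caller.Output, Callee.Output) &&
         componentCompatible(Caller.Input, Callee.Input);
}

void checkInlineRule() {
  const SketchMode IEEE{Kind::IEEE, Kind::IEEE};
  const SketchMode Flush{Kind::PreserveSign, Kind::PreserveSign};
  const SketchMode Dyn{Kind::Dynamic, Kind::Dynamic};
  assert(canInline(IEEE, Dyn));    // dynamic callee fits any caller
  assert(canInline(Flush, Dyn));
  assert(canInline(Flush, Flush)); // exact match
  assert(!canInline(Flush, IEEE)); // mismatched fixed modes
  assert(!canInline(Dyn, IEEE));   // callee assumes more than caller knows
}
```

The asymmetry is deliberate in this sketch: a dynamic callee was compiled
without mode-specific folds, so it stays correct in any caller, while a
callee compiled for a fixed mode may already contain such folds.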