[all-commits] [llvm/llvm-project] 71be91: HIP: Directly call rint builtins

Tue Jul 25 04:54:27 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 71be91eba96d80d15689e4f516141c533c3c086d
      https://github.com/llvm/llvm-project/commit/71be91eba96d80d15689e4f516141c533c3c086d
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2023-07-25 (Tue, 25 Jul 2023)

  Changed paths:
    M clang/lib/Headers/__clang_hip_math.h
    M clang/test/Headers/__clang_hip_math.hip

  Log Message:
  -----------
  HIP: Directly call rint builtins

  Commit: 47b3ada432f8afee9723a4b3d27b3efbef34dedf
      https://github.com/llvm/llvm-project/commit/47b3ada432f8afee9723a4b3d27b3efbef34dedf
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2023-07-25 (Tue, 25 Jul 2023)

  Changed paths:
    M llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll
    M llvm/test/CodeGen/AMDGPU/rsq.f64.ll

  Log Message:
  -----------
  AMDGPU: Add more sqrt f64 lowering tests

Almost all permutations of the flags are potentially relevant.

  Commit: e3fd8f83a801b1918508c7c0a71cc31bc95ad4d2
      https://github.com/llvm/llvm-project/commit/e3fd8f83a801b1918508c7c0a71cc31bc95ad4d2
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2023-07-25 (Tue, 25 Jul 2023)

  Changed paths:
    M llvm/docs/AMDGPUUsage.rst
    M llvm/docs/ReleaseNotes.rst
    M llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
    M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
    M llvm/lib/Target/AMDGPU/SIISelLowering.h
    M llvm/lib/Target/AMDGPU/VOP1Instructions.td
    M llvm/test/Analysis/CostModel/AMDGPU/arith-fp.ll
    M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir
    M llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll
    M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
    M llvm/test/CodeGen/AMDGPU/rsq.f64.ll

  Log Message:
  -----------
  AMDGPU: Correctly expand f64 sqrt intrinsic

rocm-device-libs and llpc avoided the f64 sqrt intrinsic
in favor of their own expansions. Port the expansion into
the backend. Both of these users should be updated to call
the intrinsic instead.

The library and llpc expansions are slightly different.
llpc uses an ldexp to do the scale; the library uses a multiply.

Use ldexp to do the scale instead of the multiply.
I believe v_ldexp_f64 and v_mul_f64 are always the same number of
cycles, but it's cheaper to materialize the 32-bit integer constant
than the 64-bit double constant.
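
As a rough host-side illustration of the two scaling choices (not the
backend code), the C++ sketch below scales a double by a power of two
both ways. The function names and the exponent 256 are assumptions made
for the example; the point is only that ldexp takes a 32-bit integer
operand while the multiply needs a 64-bit double constant, and that the
results agree when the factor is a power of two and nothing overflows.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale by 2^256 by adjusting the exponent.
// The scale operand is a small 32-bit integer.
double scale_with_ldexp(double x) {
  const std::int32_t exponent = 256; // assumed value, for illustration only
  return std::ldexp(x, exponent);
}

// Hypothetical helper: scale by 2^256 with a multiply.
// The scale operand is a full 64-bit double constant.
double scale_with_mul(double x) {
  const double factor = std::ldexp(1.0, 256); // 2^256 as a double
  return x * factor;
}
```

Multiplying by a power of two only changes the exponent, so both
helpers return the same value for inputs that stay in range.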

The libraries have another fast version of sqrt which will
be handled separately.

I am tempted to do this in an IR expansion instead. In the IR
we could take advantage of computeKnownFPClass to avoid
the 0-or-inf argument check.
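
The general shape of a scaled expansion with a 0-or-inf argument check
might look like the C++ sketch below. It is not the code from this
commit: core_sqrt is a hypothetical stand-in (here simply std::sqrt)
for whatever refinement sequence the backend emits, and the threshold
2^-767 and the exponents 256 and -128 are assumed values chosen for
illustration. The final select returns the input unchanged for zero or
+infinity; if the input were known never to be in those classes, that
select could be dropped.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for the backend's refinement sequence.
static double core_sqrt(double x) { return std::sqrt(x); }

double scaled_sqrt(double x) {
  // Assumed constants, for illustration only.
  const double small_threshold = std::ldexp(1.0, -767);
  const int scale_up = 256;    // even, so the square root halves it exactly
  const int scale_down = -128; // undoes sqrt(2^256) = 2^128

  // Scale small inputs up before taking the root, then scale the result
  // back down by half the exponent.
  const bool needs_scale = x < small_threshold;
  const double scaled = std::ldexp(x, needs_scale ? scale_up : 0);
  const double root = std::ldexp(core_sqrt(scaled), needs_scale ? scale_down : 0);

  // 0-or-inf argument check: pass +/-0 and +inf through unchanged.
  const bool zero_or_inf =
      x == 0.0 || x == std::numeric_limits<double>::infinity();
  return zero_or_inf ? x : root;
}
```

Because the scale-up exponent is even, scaling the result by half of it
recovers sqrt(x) without an extra rounding step for inputs whose root
is representable.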

  Commit: 395cd33ba850989209834a2e332d21b42168cfaf
      https://github.com/llvm/llvm-project/commit/395cd33ba850989209834a2e332d21b42168cfaf
  Author: Matt Arsenault <Matthew.Arsenault at amd.com>
  Date:   2023-07-25 (Tue, 25 Jul 2023)

  Changed paths:
    M llvm/docs/AMDGPUUsage.rst

  Log Message:
  -----------
  AMDGPU: Remove trailing whitespace from documentation

Compare: https://github.com/llvm/llvm-project/compare/d031ff38779b...395cd33ba850