[all-commits] [llvm/llvm-project] 71be91: HIP: Directly call rint builtins
Matt Arsenault via All-commits
all-commits at lists.llvm.org
Tue Jul 25 04:54:27 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 71be91eba96d80d15689e4f516141c533c3c086d
https://github.com/llvm/llvm-project/commit/71be91eba96d80d15689e4f516141c533c3c086d
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2023-07-25 (Tue, 25 Jul 2023)
Changed paths:
M clang/lib/Headers/__clang_hip_math.h
M clang/test/Headers/__clang_hip_math.hip
Log Message:
-----------
HIP: Directly call rint builtins
Commit: 47b3ada432f8afee9723a4b3d27b3efbef34dedf
https://github.com/llvm/llvm-project/commit/47b3ada432f8afee9723a4b3d27b3efbef34dedf
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2023-07-25 (Tue, 25 Jul 2023)
Changed paths:
M llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll
M llvm/test/CodeGen/AMDGPU/rsq.f64.ll
Log Message:
-----------
AMDGPU: Add more sqrt f64 lowering tests
Almost all permutations of the flags are potentially relevant.
Commit: e3fd8f83a801b1918508c7c0a71cc31bc95ad4d2
https://github.com/llvm/llvm-project/commit/e3fd8f83a801b1918508c7c0a71cc31bc95ad4d2
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2023-07-25 (Tue, 25 Jul 2023)
Changed paths:
M llvm/docs/AMDGPUUsage.rst
M llvm/docs/ReleaseNotes.rst
M llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.h
M llvm/lib/Target/AMDGPU/VOP1Instructions.td
M llvm/test/Analysis/CostModel/AMDGPU/arith-fp.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir
M llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll
M llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
M llvm/test/CodeGen/AMDGPU/rsq.f64.ll
Log Message:
-----------
AMDGPU: Correctly expand f64 sqrt intrinsic
rocm-device-libs and llpc avoided the f64 sqrt intrinsic
in favor of their own expansions. Port the expansion into
the backend. Both of these users should be updated to call
the intrinsic instead.
The library and llpc expansions are slightly different.
llpc uses an ldexp to do the scale; the library uses a multiply.
Use ldexp to do the scale instead of the multiply.
I believe v_ldexp_f64 and v_mul_f64 always take the same number of
cycles, but it's cheaper to materialize the 32-bit integer constant
than the 64-bit double constant.
The libraries have another fast version of sqrt which will
be handled separately.
I am tempted to do this in an IR expansion instead. In the IR
we could take advantage of computeKnownFPClass to avoid
the 0-or-inf argument check.
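The scaling step described above can be sketched in portable C++. This is
an illustration only, not the backend code: the threshold and the scale
exponents are assumed values chosen for the example, the helper names are
hypothetical, and std::sqrt stands in for the hardware estimate plus
refinement that the real expansion performs. It shows that the scale can
be applied with ldexp (a small integer exponent operand) or with a multiply
(a 64-bit double constant operand), and that the result is rescaled by
half the exponent afterwards.

```cpp
#include <cmath>

// Assumed values for illustration; not taken from the commit.
static const double kScaleThreshold = std::ldexp(1.0, -767);
static const int kScaleUpExp = 256;    // input is multiplied by 2^256
static const int kScaleDownExp = -128; // result is multiplied by 2^-128

// Scale with ldexp: the constant operand is a small integer exponent.
double scale_by_ldexp(double x) { return std::ldexp(x, kScaleUpExp); }

// Scale with a multiply: the constant operand is a full double (2^256).
double scale_by_multiply(double x) {
  return x * std::ldexp(1.0, kScaleUpExp);
}

// sqrt(x * 2^(2k)) * 2^(-k) == sqrt(x), so very small inputs can be moved
// into a comfortable range before the square root and moved back after.
double scaled_sqrt(double x) {
  const bool needs_scale = x < kScaleThreshold;
  const double input = needs_scale ? scale_by_ldexp(x) : x;
  // Stand-in for the hardware estimate and its refinement steps.
  const double root = std::sqrt(input);
  return needs_scale ? std::ldexp(root, kScaleDownExp) : root;
}
```

Both scaling helpers produce the same value for inputs below the
threshold, since multiplying by a power of two only adjusts the exponent;
the difference discussed in the log message is the cost of materializing
the constant operand.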
Commit: 395cd33ba850989209834a2e332d21b42168cfaf
https://github.com/llvm/llvm-project/commit/395cd33ba850989209834a2e332d21b42168cfaf
Author: Matt Arsenault <Matthew.Arsenault at amd.com>
Date: 2023-07-25 (Tue, 25 Jul 2023)
Changed paths:
M llvm/docs/AMDGPUUsage.rst
Log Message:
-----------
AMDGPU: Remove trailing whitespace from documentation
Compare: https://github.com/llvm/llvm-project/compare/d031ff38779b...395cd33ba850