[all-commits] [llvm/llvm-project] 88a239: [AMDGPU] Adopt new lowering sequence for `fdiv16` ...
Shilei Tian via All-commits
all-commits at lists.llvm.org
Tue Oct 8 06:49:44 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 88a239d292da80f260788c0817a07cbc0a8ac758
https://github.com/llvm/llvm-project/commit/88a239d292da80f260788c0817a07cbc0a8ac758
Author: Shilei Tian <i at tianshilei.me>
Date: 2024-10-08 (Tue, 08 Oct 2024)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/frem.ll
M llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fdiv.mir
M llvm/test/CodeGen/AMDGPU/fdiv.f16.ll
M llvm/test/CodeGen/AMDGPU/fold-int-pow2-with-fmul-or-fdiv.ll
M llvm/test/CodeGen/AMDGPU/frem.ll
Log Message:
-----------
[AMDGPU] Adopt new lowering sequence for `fdiv16` (#109295)
The current lowering of `fdiv16` can generate incorrectly rounded results
in some cases. The new sequence, provided by the HW team, is shown below
in C++.
```
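half fdiv(half a, half b) {
  // Widen both operands to f32.
  float a32 = float(a);
  float b32 = float(b);
  // Initial quotient estimate from the reciprocal of b.
  float r32 = 1.0f / b32;
  float q32 = a32 * r32;
  // First refinement: compute the residual and correct the quotient.
  float e32 = -b32 * q32 + a32;
  q32 = e32 * r32 + q32;
  // Second residual; only the sign and exponent of its correction are kept.
  e32 = -b32 * q32 + a32;
  float tmp = e32 * r32;
  uint32_t tmp32 = std::bit_cast<uint32_t>(tmp);
  tmp32 = tmp32 & 0xff800000;
  tmp = std::bit_cast<float>(tmp32);
  q32 = tmp + q32;
  // Narrow to f16 and apply the division fixup.
  half q16 = half(q32);
  q16 = div_fixup_f16(q16);
  return q16;
}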
```
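For readers who want to experiment with the f32 part of the sequence on a host machine, the following is a minimal sketch, not the code from the commit. The names `keep_sign_and_exponent` and `fdiv_core_f32` are hypothetical. It assumes that `std::fma` is an acceptable stand-in for the multiply-add steps and that plain `1.0f / b32` is an acceptable stand-in for the reciprocal. The final conversion to `half` and the `div_fixup_f16` step are omitted because standard C++ has no portable half type.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: apply the 0xff800000 mask from the sequence above.
// The mask keeps the sign bit and the 8 exponent bits of an IEEE-754
// binary32 value and clears all 23 mantissa bits.
float keep_sign_and_exponent(float x) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits)); // portable alternative to std::bit_cast
  bits &= 0xff800000u;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}

// Hypothetical helper: the f32 portion of the sequence, returning q32
// before it would be narrowed to half and passed through the fixup.
float fdiv_core_f32(float a32, float b32) {
  float r32 = 1.0f / b32;               // assumption: exact division as the reciprocal
  float q32 = a32 * r32;                // initial quotient estimate
  float e32 = std::fma(-b32, q32, a32); // residual a - b*q (assumption: fused)
  q32 = std::fma(e32, r32, q32);        // first correction
  e32 = std::fma(-b32, q32, a32);       // residual after the first correction
  float tmp = keep_sign_and_exponent(e32 * r32);
  return tmp + q32;                     // second, truncated correction
}
```

For example, `keep_sign_and_exponent(1.5f)` yields `1.0f` and `keep_sign_and_exponent(-6.0f)` yields `-4.0f`, since clearing the mantissa leaves only the signed power of two.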
Fixes SWDEV-477608.