[PATCH] D154517: AMDGPU: Always use v_rcp_f16 and v_rsq_f16
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 14 04:08:46 PDT 2023
arsenm added a comment.
In D154517#4478712 <https://reviews.llvm.org/D154517#4478712>, @arsenm wrote:
> In D154517#4476671 <https://reviews.llvm.org/D154517#4476671>, @foad wrote:
>
>>> Brute force produces identical values compared to a reference host implementation for all values.
>>
>> Have you tested v_sqrt_f16 or any other f16 trans instructions?
>
> Haven't gotten there yet
v_sqrt_f16 is identical.
v_log_f16 is identical.
v_exp_f16 differs for a single value: ref=0x1p+0 inst=0x1.004p+0
I'm also comparing these against host float implementations by casting to float; maybe a proper f16 implementation would have rounded these differences differently?
I think I'm doing something wrong with the pre-scaling for sin/cos; those results just seem totally wrong.
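
For readers who want to reproduce this kind of exhaustive check, here is a minimal host-side sketch of the comparison described above. It is not the harness used for this review. It walks all 65536 f16 bit patterns, computes a reference in a wider float type, rounds back to f16, and reports every input where a candidate disagrees. The helper names (bits_to_f16, f16_to_bits, ref_sqrt_bits, find_mismatches) are hypothetical, and the candidate callback stands in for whatever mechanism reads results back from the hardware instruction. It also assumes that any two NaN results count as equal.

```python
import math
import struct


def bits_to_f16(bits):
    """Reinterpret a 16-bit pattern as an IEEE half (returned as a Python float)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def f16_to_bits(value):
    """Round a Python float to the nearest half and return its bit pattern."""
    try:
        return struct.unpack("<H", struct.pack("<e", value))[0]
    except OverflowError:
        # struct rejects finite values beyond the half range; map them to +/-inf.
        return 0xFC00 if value < 0 else 0x7C00


def is_nan_bits(bits):
    """True if the bit pattern encodes any half-precision NaN."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def ref_sqrt_bits(bits):
    """Reference sqrt: compute in double on the host, then round once to half."""
    x = bits_to_f16(bits)
    if math.isnan(x) or x < 0.0:
        return 0x7E00  # quiet NaN; -0.0 is not < 0.0, so it falls through
    return f16_to_bits(math.sqrt(x))


def find_mismatches(reference, candidate):
    """Compare two bits->bits functions over every half-precision input."""
    mismatches = []
    for bits in range(0x10000):
        want, got = reference(bits), candidate(bits)
        if is_nan_bits(want) and is_nan_bits(got):
            continue  # assumption: NaN payload and sign are not compared
        if want != got:
            mismatches.append((bits, want, got))
    return mismatches
```

In use, candidate would return the bits produced by the instruction under test, and each mismatch tuple is (input bits, reference bits, candidate bits). Because the reference is computed in a wider type and rounded afterwards, it can only help separate a genuine instruction difference from a rounding artifact of the reference itself if the wider computation is accurate enough for the function being tested.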
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154517/new/
https://reviews.llvm.org/D154517