[PATCH] D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set

Thu Jan 23 11:51:18 PST 2020

cfang marked an inline comment as done.
cfang added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/fdiv.f16.ll:253
+
+!0 = !{float 2.500000e+00}
----------------
arsenm wrote:
> arsenm wrote:
> > I don't know what ulp the f16 rcp instruction provides. This test change looks incomplete if there isn't already a case without !fpmath
> I found a document stating this provides "~0.5ulp", so I guess check that value for f16?
Currently the DAG lowering does "1/x -> rcp(x)" for fp16 without checking the fpmath accuracy.
In fact it always does "1/x -> rcp(x)" for fp16, because v_rcp_f16 supports denormals.

We need to revisit that logic in DAG lowering, but I would rather do that in a follow-up patch.
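
For reference, a minimal sketch of the two f16 cases under discussion, one with !fpmath and one without. The function names are hypothetical and not taken from the patch or the existing test file; the metadata value is the one added at line 253. Per the description above, both forms would presumably be lowered to v_rcp_f16 today, since the fp16 path does not consult the metadata.

```llvm
; Illustrative only: function names are assumptions, not part of D71293.

; 1/x on f16 with a relaxed accuracy requirement (2.5 ulp).
define half @rcp_f16_fpmath(half %x) {
  %rcp = fdiv half 1.0, %x, !fpmath !0
  ret half %rcp
}

; The same division without !fpmath, i.e. the default accuracy requirement.
define half @rcp_f16_no_fpmath(half %x) {
  %rcp = fdiv half 1.0, %x
  ret half %rcp
}

!0 = !{float 2.500000e+00}
```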

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71293/new/

https://reviews.llvm.org/D71293