[PATCH] D28508: [NVPTX] Lower to sqrt.approx and rsqrt.approx under more circumstances.

Thu Jan 12 19:10:28 PST 2017

jlebar added a comment.

> Technically I don't think it is correct for your patch to lower llvm.sqrt (with the FMF) to PTX sqrt.approx, because "The maximum absolute error for sqrt.f32 is TBD."

The patch only does this transformation when fastmath is enabled (or when you pass a special flag to LLVM that specifically asks for it):

  defm FSQRT_f32_approx_ftz :
    FSQRT_f32<"approx.ftz.", [doF32FTZ, do_SQRTF32_APPROX]>;
  defm FSQRT_f32_approx : FSQRT_f32<"approx.", [do_SQRTF32_APPROX]>;
  defm FSQRT_f32_ftz : FSQRT_f32<"rn.ftz.", [doF32FTZ]>;
  defm FSQRT_f32_noftz : FSQRT_f32<"rn.", []>;

Surely fastmath implies we should be lowering to the approx instruction, no?

When fastmath is disabled, we lower to PTX sqrt.rn.f32, which is spec'ed to be correctly rounded (IEEE 754 compliant).
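
For readers who don't want to decode the predicate lists, here is a rough C++ sketch of the selection they appear to encode. It is not code from the patch: the function name selectSqrtF32 and its two boolean parameters are hypothetical stand-ins for the doF32FTZ and do_SQRTF32_APPROX predicates, and it assumes the patterns are tried in the order listed and that the string fragment is spliced in as sqrt.<fragment>f32.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, NOT part of the patch: models which PTX sqrt
// variant the four FSQRT_f32 definitions would pick for an f32 sqrt.
//   useF32FTZ  ~ the doF32FTZ predicate (flush denormals to zero)
//   sqrtApprox ~ the do_SQRTF32_APPROX predicate (fastmath or the flag)
// Assumption: patterns are tried in the order they are listed.
std::string selectSqrtF32(bool useF32FTZ, bool sqrtApprox) {
  if (useF32FTZ && sqrtApprox)
    return "sqrt.approx.ftz.f32"; // FSQRT_f32_approx_ftz
  if (sqrtApprox)
    return "sqrt.approx.f32";     // FSQRT_f32_approx
  if (useF32FTZ)
    return "sqrt.rn.ftz.f32";     // FSQRT_f32_ftz
  return "sqrt.rn.f32";           // FSQRT_f32_noftz, no predicates
}
```

The point is that the approx variants are only reachable when the approx predicate holds; with it off, both remaining choices are the rn forms.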

I agree the commit message should be clearer.  :)  I think what I was trying to say is that we will now, *under some circumstances*, emit sqrt.approx.f32 for llvm.sqrt.

https://reviews.llvm.org/D28508