[PATCH] D85709: [InstSimplify] Implement Instruction simplification for X/sqrt(X) to sqrt(X).

Thu Aug 13 00:34:34 PDT 2020

venkataramanan.kumar.llvm added a comment.

In D85709#2211204 <https://reviews.llvm.org/D85709#2211204>, @spatel wrote:

> After looking at the codegen, I'm not sure if we can do this transform in IR with the expected performance in codegen because the transform loses information:
> https://godbolt.org/z/7b84rG
>
> The codegen for the case of "sqrt(x)" has to account for a 0.0 input. Ie, we filter out a 0.0 (or potentially denorm) input to avoid the NAN answer that we would get from "0.0 / 0.0". But the codegen for the case of "x/sqrt(x)" does not have to do that - NAN is the correct answer for a 0.0 input, so the code has implicitly signaled to us that 0.0 is not a valid input when compiled with -ffast-math (we can ignore possible NANs).
>
> It might help to see the motivating code that produces the x/sqrt(x) pattern to see if there's something else we should be doing there.

Current AMD x86_64 targets don't have a reciprocal sqrt instruction for double-precision types,
so x/sqrt(x) ends up as "vsqrtsd" followed by "vdivsd". This transform is meant to improve efficiency by removing the divide.
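
As a rough illustration of the pattern under discussion, here is a minimal C++ sketch. The helper names are hypothetical and not taken from the patch, and the behavior described assumes default IEEE semantics, i.e. compiled without -ffast-math. It shows the two source forms and the 0.0 input where they disagree:

```cpp
#include <cmath>

// Hypothetical helper: the source pattern that lowers to a sqrt followed by
// a divide when no double-precision reciprocal sqrt instruction is available.
double div_by_sqrt(double x) { return x / std::sqrt(x); }

// Hypothetical helper: the form the proposed simplification would produce.
double plain_sqrt(double x) { return std::sqrt(x); }
```

For positive inputs the two agree up to rounding (both give 2.0 for x = 4.0). For x = 0.0 the first form computes 0.0 / 0.0, which is NaN under IEEE rules, while the second returns 0.0. That is the difference @spatel points out above; under -ffast-math the compiler is allowed to ignore the NaN case.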

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D85709/new/

https://reviews.llvm.org/D85709