The -fp-rsqrt flag, together with the -enable-unsafe-fp-math flag, controls generation of the X86 rsqrt instruction.
Using rsqrt introduces minor precision differences, as the examples below illustrate. The generated rsqrt and
multiplication instructions also make some derived optimizations (e.g. more FMA generation) possible.

  -fp-rsqrt                           -   Enable rsqrt ops
    =off                              -   No rsqrt
    =on                               -   y/sqrt(x) => y * rsqrt(x)
    =advance                          -   Same as =on, plus sqrt(x) => x * rsqrt(x)
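
The rewrites in the table above can be modelled outside the compiler. The sketch below is only an
illustration under stated assumptions: f32, approx_rsqrt, div_by_sqrt and plain_sqrt are hypothetical
helper names, and approx_rsqrt stands in for the hardware estimate by truncating the significand of
1/sqrt(x). It does not reproduce the bits returned by the real instruction, so its results will
not match the listings in the examples.

```python
import math
import struct


def f32(value):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def approx_rsqrt(x, kept_bits=12):
    """Hypothetical stand-in for a hardware rsqrt estimate (an assumption).

    Computes 1/sqrt(x) and keeps only `kept_bits` significant bits.
    """
    exact = 1.0 / math.sqrt(x)
    mantissa, exponent = math.frexp(exact)   # exact == mantissa * 2**exponent
    scale = 1 << kept_bits
    truncated = math.floor(mantissa * scale) / scale
    return f32(math.ldexp(truncated, exponent))


def div_by_sqrt(y, x, mode):
    """Model of y/sqrt(x) for mode in {"off", "on", "advance"}."""
    if mode == "off":
        return f32(y / f32(math.sqrt(x)))    # sqrt followed by divide
    return f32(y * approx_rsqrt(x))          # =on and =advance: y * rsqrt(x)


def plain_sqrt(x, mode):
    """Model of sqrt(x) for mode in {"off", "on", "advance"}."""
    if mode == "advance":
        return f32(x * approx_rsqrt(x))      # =advance only: x * rsqrt(x)
    return f32(math.sqrt(x))                 # =off and =on keep the sqrt


if __name__ == "__main__":
    for mode in ("off", "on", "advance"):
        print(mode, div_by_sqrt(2.0, 3.0, mode), plain_sqrt(2.0, mode))
```

In this model =on and =advance agree on y/sqrt(x), and only =advance changes a bare sqrt(x), which
is the behaviour the table describes.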

Generate the LLVM IR with:
gfortran -S -O1 -o - -fplugin=dragonegg.so rsqrt_on.f -fplugin-arg-dragonegg-emit-ir
Generate the .s file from the .ll file with:
llc -O1 -enable-unsafe-fp-math -fp-rsqrt=off/on/advance/fda rsqrt_on.ll -filetype=asm


Example 1.

Source

      real*4     x
      real*4     y
      real*4     r
      r = y/sqrt(x)

LLVM IR

  %0 = load float* %y, align 4
  %1 = load float* %x, align 4
  %2 = tail call float @sqrtf(float %1) nounwind readnone
  %3 = fdiv float %0, %2
  store float %3, float* %r, align 4

-fp-rsqrt=off

        vmovss  (%rdi), %xmm0
        vsqrtss %xmm0, %xmm0, %xmm0
        vmovss  (%rsi), %xmm1
        vdivss  %xmm0, %xmm1, %xmm0
        vmovss  %xmm0, (%rdx)


-fp-rsqrt=on/advance

        vmovss  (%rdi), %xmm0
        vrsqrtss        %xmm0, %xmm0, %xmm0
        vmulss  (%rsi), %xmm0, %xmm0
        vmovss  %xmm0, (%rdx)

Input
-------
x = 3.0
y = 2.0

Output
---------
without rsqrt :: r =    1.15470052
with rsqrt    :: r =    1.15469360

Example 2.

Source

      real*4     x
      real*4     y
      real*4     r
      r           = sqrt(x)

LLVM IR

  %0 = load float* %x, align 4
  %1 = tail call float @sqrtf(float %0) nounwind readnone
  store float %1, float* %r, align 4

-fp-rsqrt=off/on

        vmovss  (%rdi), %xmm0
        vsqrtss %xmm0, %xmm0, %xmm0
        vmovss  %xmm0, (%rdx)

-fp-rsqrt=advance

        vmovss  (%rdi), %xmm0
        vrsqrtss        %xmm0, %xmm0, %xmm1
        vmulss  %xmm1, %xmm0, %xmm0
        vmovss  %xmm0, (%rdx)

Input
-------
x = 2.0

Output
---------
without rsqrt :: r =    1.41421354
with rsqrt    :: r =    1.41419983
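
To put the "minor precision variations" in Examples 1 and 2 into numbers, the sketch below computes
the relative error between the reported results. The value pairs are copied from the Output
listings above. The bound 1.5 * 2^-12 is the maximum relative error Intel documents for the
RSQRTSS estimate. REPORTED, relative_error and summarize are names chosen here for illustration.

```python
# (without rsqrt, with rsqrt) pairs copied from the Output listings.
REPORTED = {
    "Example 1: y/sqrt(x), x = 3.0, y = 2.0": (1.15470052, 1.15469360),
    "Example 2: sqrt(x), x = 2.0": (1.41421354, 1.41419983),
}

# Maximum relative error documented by Intel for the RSQRTSS estimate.
RSQRT_MAX_REL_ERROR = 1.5 * 2.0 ** -12


def relative_error(reference, approximate):
    """Relative difference of `approximate` from `reference`."""
    return abs(approximate - reference) / abs(reference)


def summarize(reported=REPORTED):
    """Map each example name to the relative error of its rsqrt result."""
    return {
        name: relative_error(reference, approximate)
        for name, (reference, approximate) in reported.items()
    }


if __name__ == "__main__":
    for name, error in summarize().items():
        print(f"{name}: relative error {error:.2e} "
              f"(documented bound {RSQRT_MAX_REL_ERROR:.2e})")
```

Both reported differences are below 1e-5 in relative terms, well inside the documented bound for
the estimate.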

Example 3.

Source

      real*4     x
      real*4     y
      real*4     z
      real*4     t
      real*4     r
      t           = x+y/sqrt(z)
      r           = x+sqrt(t)


LLVM IR

  %0 = load float* %x, align 4
  %1 = load float* %y, align 4
  %2 = load float* %z, align 4
  %3 = tail call float @sqrtf(float %2) nounwind readnone
  %4 = fdiv float %1, %3
  %5 = fadd float %0, %4
  store float %5, float* %t, align 4
  %6 = load float* %x, align 4
  %7 = tail call float @sqrtf(float %5) nounwind readnone
  %8 = fadd float %6, %7
  store float %8, float* %r, align 4

-fp-rsqrt=on/advance

        vmovss  (%rdx), %xmm0
        vrsqrtss        %xmm0, %xmm0, %xmm0
        vmovss  (%rsi), %xmm1
        vfmadd213ss     (%rdi), %xmm1, %xmm0
        vmovss  %xmm0, (%rcx)
        vxorps  %xmm1, %xmm1, %xmm1
        vrsqrtss        %xmm0, %xmm0, %xmm1
        vfmadd213ss     (%rdi), %xmm0, %xmm1
        vmovss  %xmm1, (%r8)

Input
------
x = 1.0
y = 2.0
z = 3.0

Output
--------
-fp-rsqrt=off        :   t =    2.15469360      r =    2.46788836
-fp-rsqrt=on/advance :   t =    2.15469360      r =    2.46788836
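
The dataflow of the two vfmadd213ss instructions in the Example 3 listing can be mirrored in a
few lines. This is a rough sketch, not what the compiler does internally: f32, fma32 and
example3_fused are hypothetical helper names, the fused multiply-add is emulated in double
precision, and the rsqrt estimates are supplied by the caller because the hardware estimate
itself is not modelled here.

```python
import math
import struct


def f32(value):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def fma32(a, b, c):
    """Emulated single-precision fused multiply-add: a*b + c.

    The product of two single-precision values is exact in double precision,
    so the only rounding to single happens in the final conversion (apart
    from a rare double-rounding case in the double-precision addition).
    """
    return f32(f32(a) * f32(b) + f32(c))


def example3_fused(x, y, rsqrt_z, rsqrt_of):
    """Mirror the Example 3 rsqrt listing.

    rsqrt_z  : estimate of 1/sqrt(z), supplied by the caller
    rsqrt_of : callable returning an estimate of 1/sqrt(v) for a value v
    """
    t = fma32(y, rsqrt_z, x)        # first vfmadd213ss:  t = y * rsqrt(z) + x
    r = fma32(t, rsqrt_of(t), x)    # second vfmadd213ss: r = t * rsqrt(t) + x
    return t, r


if __name__ == "__main__":
    # Example 1 reports y * rsqrt(3.0) = 1.15469360 for y = 2.0, so the
    # implied estimate of 1/sqrt(3.0) is that value divided by 2.0.
    implied_rsqrt_z = 1.15469360 / 2.0
    # Assumption: an exact 1/sqrt stands in for the second estimate.
    t, r = example3_fused(1.0, 2.0, implied_rsqrt_z, lambda v: 1.0 / math.sqrt(v))
    print(t, r)
```

Fed with the estimate implied by Example 1, the first step reproduces the reported t to within
single-precision rounding. The second step uses an exact reciprocal square root in place of the
hardware estimate, so r only approximates the reported value.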