[PATCH] D28314: Change sqrt partial inlining to depend on sqrt argument rather than result.

Thu Jan 5 04:04:25 PST 2017

sdardis added a comment.

This change increases both branch density and register pressure for MIPS in the supplied test case, as LLVM now has to synthesise 0.0 into a floating-point register. This in turn also decreases code density for MIPS, since unlike x86 we can't load 0.0 in a single instruction in all cases.

================
Comment at: lib/Transforms/Scalar/PartiallyInlineLibCalls.cpp:45-48
   // v0 = sqrt_noreadmem(src) # native sqrt instruction.
-  // if (v0 is a NaN)
+  // if (src < 0)
   //   v1 = sqrt(src)         # library call.
   // dst = phi(v0, v1)
----------------
Shouldn't this be:

   // if (src > 0)
   //   v0 = sqrt_noreadmem(src) # native sqrt instruction
   // else
   //   v1 = sqrt(src) # library call
   // dst = phi(v0, v1)
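
For reference, the control flow suggested above can be sketched in plain C++. This is only an illustration of the shape, not the code in PartiallyInlineLibCalls.cpp: `nativeSqrt`, `librarySqrt`, `SqrtPath`, `SqrtResult` and `partialSqrt` are hypothetical names, and both stand-ins simply call `std::sqrt` so the example runs anywhere.

```cpp
#include <cmath>

// Hypothetical tag recording which branch produced the result.
enum class SqrtPath { Native, Library };

struct SqrtResult {
  double value;
  SqrtPath path;
};

// Stand-in for the native sqrt instruction (assumed not to touch errno).
static double nativeSqrt(double x) { return std::sqrt(x); }

// Stand-in for the libm call, which may set errno on a domain error.
static double librarySqrt(double x) { return std::sqrt(x); }

// Mirrors the suggested comment: take the fast path only when src > 0.
// NaN compares false against 0.0, so NaN inputs (and, under this literal
// reading, zero) fall through to the library call.
SqrtResult partialSqrt(double src) {
  if (src > 0.0)
    return {nativeSqrt(src), SqrtPath::Native};
  return {librarySqrt(src), SqrtPath::Library};
}
```

With this shape the native instruction is only executed on the fast path, rather than unconditionally before the check.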

================
Comment at: test/CodeGen/Mips/optimize-fp-math.ll:7
+; 32: c.ult.s $f[[R0:[0-9]+]], $f[[R1:[0-9]+]]
+; 32: sqrt.s $f[[R1]], $f[[R0]]
 ; 64-LABEL: test_sqrtf_float_:
----------------
This should be:

; 32-LABEL: test_sqrtf_float_:
; 32: mtc1 $zero, $f[[R0:[0-9]+]]
; 32: c.ult.s $f12, $f[[R0]]
; 32: bc1t $BB0_[[BB0:[0-9]+]]
; 32: sqrt.s $f0, $f12
; 32: $BB0_[[BB0]]:
; 32: jal sqrtf

Similarly for the 64 case.

================
Comment at: test/CodeGen/Mips/optimize-fp-math.ll:21
 ; 32-LABEL: test_sqrt_double_:
-; 32: sqrt.d $f[[R0:[0-9]+]], $f{{[0-9]+}}
-; 32: c.un.d $f[[R0]], $f[[R0]]
+; 32: c.ult.d $f[[R0:[0-9]+]], $f[[R1:[0-9]+]]
+; 32: sqrt.d $f[[R1]], $f[[R0]]
----------------
Similar to my comment above, except only the first mtc1 has to be matched.

https://reviews.llvm.org/D28314