[PATCH] Optimize square root squared (PR21126)

Sanjay Patel spatel at rotateright.com
Thu Oct 2 14:08:47 PDT 2014


================
Comment at: test/Transforms/InstCombine/fmul.ll:130
@@ +129,3 @@
+declare double @llvm.sqrt.f64(double)
+attributes #0 = { "unsafe-fp-math"="true" }
+
----------------
hfinkel wrote:
> spatel wrote:
> > hfinkel wrote:
> > > spatel wrote:
> > > > hfinkel wrote:
> > > > > We don't need the "unsafe-fp-math"="true" because you're just checking the 'fast' on the fmul, right? If you don't need it, please remove it.
> > > > Thanks, Hal! Yes - the 'fast' alone is enough to do this optimization. I'm not a fan of the instruction-level fast-math flags...because it leads to another question. What should happen in the 2nd test case if it looks like this:
> > > > 
> > > > define double @sqrt_squared2(double %f) #0 {
> > > >   %sqrt = call double @llvm.sqrt.f64(double %f)
> > > >   %mul1 = fmul fast double %sqrt, %sqrt
> > > >   %mul2 = fmul double %mul1, %sqrt
> > > >   ret double %mul2
> > > > }
> > > > 
> > > > When the %mul1 operand is replaced in the 2nd fmul instruction, does that fmul instruction now become fast too? Is fast infectious?
> > > I think that you should change your mind ;)
> > > 
> > > The motivation behind the fast-math flags, in addition to providing client more-precise control over what instructions might be folded, was to create a "fast math" system that could survive inlining. As a result, the answer to your last question is no, it is not infectious, it is designed so that inlined "fast math" functions can have their instructions retain that property without affecting operations from the caller (which might not be "fast math" -- a particular problem when you consider LTO).
> > > 
> > I'd love to love this. :)
> > 
> > ..but, you see what happens in this example:
> > 
> > 1. There's no function-level attribute for unsafe-math. 
> > 
> > 2. When we kill the first fmul, we eliminate the only trace of 'fast'-ness in this function (ignore for the moment that the llvm.sqrt intrinsic isn't IEEE-754 compliant).
> > 
> > 3. After the optimization, we have a seemingly safe sqrt and a safe fmul. But the entirety of the code that will be produced for this function is decidedly unsafe. We just deleted the knowledge that the original function was not safe.
> > 
> > So in fact, fast is infectious, but silently so! LTO users should be aware that linking any unsafe code produces an overall unsafe program?
> > 
> > 
> Sure, but that's normally what the user wants! They want fast semantics for the routines in one module, and precise semantics for routines in another, then they want LTO to inline one into the other.
> 
> The fast flags are not meant to preserve a record of non-strict semantics -- that's not the optimizers job ;) --  but rather are intended to allow optimization control.
Ok, sold. :)

I was just pretending to be an FP wonk. If the real FP wonks complain, please protect this optimization!

I'll fix up the testcase and get this checked in.

http://reviews.llvm.org/D5584






More information about the llvm-commits mailing list