[PATCH] D57359: [GlobalISel] Introduce a G_FSQRT generic instruction

Amara Emerson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 28 16:41:33 PST 2019


aemerson added inline comments.


================
Comment at: include/llvm/Target/GenericOpcodes.td:572-577
+// Floating point square root of a value.
+def G_FSQRT : GenericInstruction {
+  let OutOperandList = (outs type0:$dst);
+  let InOperandList = (ins type0:$src1);
+  let hasSideEffects = 0;
+}
----------------
arsenm wrote:
> aemerson wrote:
> > paquette wrote:
> > > arsenm wrote:
> > > > paquette wrote:
> > > > > arsenm wrote:
> > > > > > arsenm wrote:
> > > > > > > Needs a comment about the < 0 behavior. One particularly awful detail of SelectionDAG is ISD::FSQRT 's behavior differs from the llvm.sqrt intrinsic.
> > > > > > To be more specific, the IR claims this is undefined, which is the wrong answer. This should return a NAN
> > > > > Hmm, actually, I'm thinking maybe this should be G_INTRINSIC_SQRT, so that we can be clear which cases in which this ought to be used.
> > > > > 
> > > > > Reading up on this a bit more, I found this (12-year-old) thread: http://lists.llvm.org/pipermail/llvm-dev/2007-August/010253.html
> > > > > 
> > > > > "Also, this makes llvm.sqrt side-effect free, where sqrt potentially has side-effects (making it harder to CSE, hoist out of loops etc)."
> > > > > 
> > > > > If that's true today, then I guess the llvm.sqrt case should be handled separately.
> > > > I think having both would be helpful. Maybe say G_FSQRT is side effect free and defined for NAN, but G_INTRINSIC_SQRT is undefined for NAN?
> > > I think I need to think about this a little bit more, but it sounds like a reasonable solution to me.
> > > 
> > > Wouldn't G_INTRINSIC_SQRT be the side-effect-free one though?
> > Do you have a source reference what the semantics of ISD::FSQRT is then?
> I was saying both will be side effect free. For the AMDGPU usage I want a no-errno-ever-normal-instruction with no side effects that returns NaN, which legalizations will use. We'll also pattern match to it from the intrinsic.
> 
> We currently do silly things like recognize x < 0 ? nan : sqrt(x) -> sqrt(x) in DAGCombiner. It would be slightly less concerning looking if we were matching to a different opcode.
> 
> This probably needs more thought, but I hope we can improve the global-isel situation from the current confusion. The current situation in the IR that SelectionDAGBuilder recognizes are:
> 
>   - llvm.sqrt
>   - lib call to sqrt that is readnone
>   - lib call to sqrt that is not readnone
> 
> The intrinsic and readnone lib call seem to go through the same path on x86 and use the chainless ISD::FSQRT. The not-readnone lib call somehow end up producing a conditional branch on x86, but also involves the no-side effect ISD::FSQRT. I've never been able to keep this entire situation straight, so I'm not sure what exactly is going on there for errno handling. Given that there's no chain on ISD::FSQRT there seems to be no expectation of errno setting, and somewhere x86 avoids it by inserting the branch.
> 
So the readnone lib calls come from the PartiallyInlineLibCalls pass, and that pass emits two lib calls with the readnone function attribute one being selected into an instruction, and the vanilla lib call remaining a function call. The idea is that the common case of the param >= 0 uses the fast path of a native instruction, followed by a check and conditional branch if the result shows that a real lib call is actually needed. I think this optimisation is orthogonal to the discussion since it happens at the IR level.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57359/new/

https://reviews.llvm.org/D57359





More information about the llvm-commits mailing list