[PATCH] D57359: [GlobalISel] Introduce a G_FSQRT generic instruction

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 28 16:33:46 PST 2019


arsenm added inline comments.


================
Comment at: include/llvm/Target/GenericOpcodes.td:572-577
+// Floating point square root of a value.
+def G_FSQRT : GenericInstruction {
+  let OutOperandList = (outs type0:$dst);
+  let InOperandList = (ins type0:$src1);
+  let hasSideEffects = 0;
+}
----------------
aemerson wrote:
> paquette wrote:
> > arsenm wrote:
> > > paquette wrote:
> > > > arsenm wrote:
> > > > > arsenm wrote:
> > > > > > Needs a comment about the < 0 behavior. One particularly awful detail of SelectionDAG is ISD::FSQRT 's behavior differs from the llvm.sqrt intrinsic.
> > > > > To be more specific, the IR claims this is undefined, which is the wrong answer. This should return a NAN
> > > > Hmm, actually, I'm thinking maybe this should be G_INTRINSIC_SQRT, so that we can be clear which cases in which this ought to be used.
> > > > 
> > > > Reading up on this a bit more, I found this (12-year-old) thread: http://lists.llvm.org/pipermail/llvm-dev/2007-August/010253.html
> > > > 
> > > > "Also, this makes llvm.sqrt side-effect free, where sqrt potentially has side-effects (making it harder to CSE, hoist out of loops etc)."
> > > > 
> > > > If that's true today, then I guess the llvm.sqrt case should be handled separately.
> > > I think having both would be helpful. Maybe say G_FSQRT is side effect free and defined for NAN, but G_INTRINSIC_SQRT is undefined for NAN?
> > I think I need to think about this a little bit more, but it sounds like a reasonable solution to me.
> > 
> > Wouldn't G_INTRINSIC_SQRT be the side-effect-free one though?
> Do you have a source reference what the semantics of ISD::FSQRT is then?
I was saying both will be side effect free. For the AMDGPU usage I want a no-errno-ever-normal-instruction with no side effects that returns NaN, which legalizations will use. We'll also pattern match to it from the intrinsic.

We currently do silly things like recognize x < 0 ? nan : sqrt(x) -> sqrt(x) in DAGCombiner. It would be slightly less concerning looking if we were matching to a different opcode.

This probably needs more thought, but I hope we can improve the global-isel situation from the current confusion. The current situation in the IR that SelectionDAGBuilder recognizes are:

  - llvm.sqrt
  - lib call to sqrt that is readnone
  - lib call to sqrt that is not readnone

The intrinsic and readnone lib call seem to go through the same path on x86 and use the chainless ISD::FSQRT. The not-readnone lib call somehow end up producing a conditional branch on x86, but also involves the no-side effect ISD::FSQRT. I've never been able to keep this entire situation straight, so I'm not sure what exactly is going on there for errno handling. Given that there's no chain on ISD::FSQRT there seems to be no expectation of errno setting, and somewhere x86 avoids it by inserting the branch.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57359/new/

https://reviews.llvm.org/D57359





More information about the llvm-commits mailing list