[PATCH] D135447: [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU

Fri Nov 11 07:42:06 PST 2022

arsenm added a comment.

For this patch I'd like to drop all the attempts to handle legalizing the f16 case and move that to a separate patch. It's a much more complicated edge case that doesn't have much in common with the base handling

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2332
+        .addImm(TestMaskValue.getZExtValue())
+        .addImm((unsigned)APFloat::SemanticsToEnum(FpSem))
+        .setMIFlag(MachineInstr::NoFPExcept);
----------------
JanekvO wrote:
> arsenm wrote:
> > JanekvO wrote:
> > > arsenm wrote:
> > > > Do you really need the float type operand? I know bfloat16 isn't going to work without it, but I thought the plan was to introduce FP types to LLT
> > > I believe it's not necessary for amdgpu but required for the `G_IS_FPCLASS` target opcode. Leaving it out results in verifier errors (I also am unaware about introducing FP types and LLT).
> > What do you mean verifier errors?
> Sorry, I meant that the MachineVerifier will fail for the `G_IS_FPCLASS` instruction.
Right, it's there in the operand list. I mean more abstractly, why is it there?

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2450
+    Observer.changingInstr(MI);
+    widenScalarSrc(MI, WideTy, 1, TargetOpcode::G_FPEXT);
+    Observer.changedInstr(MI);
----------------
JanekvO wrote:
> arsenm wrote:
> > arsenm wrote:
> > > This will do the wrong thing for snans and also denormals inputs are flushed 
> > I also don't see the corresponding DAG legalization. It's such a special case I think this should be split into a separate patch anyway.
> > I also don't see the corresponding DAG legalization.
> I put the corresponding SelectionDAG type widening code for `IS_FPCLASS` is in target custom function `LowerIS_FPCLASS` as I couldn't bypass expansion in SelectionDAGBuilder.cpp when marking the action for the instruction with f16 as `promote` (i.e., it would call `IS_FPCLASS` expansion code even when trying to promote).
> 
> > It's such a special case I think this should be split into a separate patch anyway.
> As in, the widening code, or `IS_FPCLASS` support for amdgpu gfx7?
OK, there are several issues here. None of this should be done in target code.

I also don't approve of doing this expansion in the DAG builder, but see that's a pre-existing issue. GlobalISel does need to do the same expansion. 

================
Comment at: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:6492-6493
     // legalizing these types prior to selection.
-    if (!TLI.isOperationLegalOrCustom(ISD::IS_FPCLASS, ArgVT)) {
+    if (!TLI.isOperationLegal(ISD::IS_FPCLASS, ArgVT) &&
+        !TLI.isOperationCustom(ISD::IS_FPCLASS, ArgVT)) {
       SDValue Result = TLI.expandIS_FPCLASS(DestVT, Op, Test, Flags, sdl, DAG);
----------------
JanekvO wrote:
> foad wrote:
> > Why did this change?
> `isOperationLegalOrCustom` will return false if the type is considered illegal regardless of whether the instruction's type is marked legal or custom whereas `isOperationCustom` won't explicitly check for type legality and returns whether the action was set to custom. I basically just wanted it to go through to target custom code.
> 
> (May revert this in favor of using the expand code for f16 in case there is no f16 fp class instruction for amdgpu)
For the no f16 case, I think we need to do software expansion to get correct results for denormal values

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:318-320
+      {MVT::v2f16, MVT::v3f16, MVT::v4f16, MVT::v16f16, MVT::v2f32, MVT::v3f32,
+       MVT::v4f32, MVT::v5f32, MVT::v6f32, MVT::v7f32, MVT::v8f32, MVT::v16f32,
+       MVT::v2f64, MVT::v3f64, MVT::v4f64, MVT::v8f64, MVT::v16f64},
----------------
foad wrote:
> It seems annoying to have such a long list of types here - it'll need updating whenever we introduce a new one. Can you use something like FloatVectorTypes instead?
This should be unnecessary, we have no vector class instructions. These should just expand into scalars

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2778
+
+  SDValue FpExtend = DAG.getFPExtendOrRound(Src, DL, MVT::f32);
+  SDValue LegalIsFpClass = DAG.getNode(ISD::IS_FPCLASS, DL, Op.getValueType(), {FpExtend, Op.getOperand(1)}, Op->getFlags());
----------------
This doesn't work correctly for denormals. The f16 denormal value won't be denormal after casting to f32 (if it wasn't flushed to zero under DAZ or FTZ modes)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135447/new/

https://reviews.llvm.org/D135447