[PATCH] D135447: [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU

Tue Nov 8 07:48:19 PST 2022

JanekvO added inline comments.

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2320
+  case Intrinsic::is_fpclass: {
+    unsigned Flags = MachineInstr::copyFlagsFromInstruction(CI);
+
----------------
arsenm wrote:
> This should get an IRTranslator test to make sure the flags are passed through
I'm not sure my added test fully hits the mark: it seemed that not all flags can be applied (e.g., the `nnan` flag didn't work because it requires an FP return type). For now I've added flag-related tests that explicitly check that `nofpexcept` is added. Let me know if something is missing, or whether the `copyFlagsFromInstruction` call is better omitted.

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2332
+        .addImm(TestMaskValue.getZExtValue())
+        .addImm((unsigned)APFloat::SemanticsToEnum(FpSem))
+        .setMIFlag(MachineInstr::NoFPExcept);
----------------
arsenm wrote:
> Do you really need the float type operand? I know bfloat16 isn't going to work without it, but I thought the plan was to introduce FP types to LLT
I believe it's not necessary for AMDGPU, but it is required by the `G_IS_FPCLASS` opcode: leaving it out results in verifier errors. (I'm also not aware of the plan to introduce FP types to LLT.)

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:870
+
+bool AMDGPUInstructionSelector::selectG_IS_FPCLASS(MachineInstr &I) const {
+  MachineBasicBlock *BB = I.getParent();
----------------
arsenm wrote:
> I don't see why you need to manually select this (maybe sharing the pattern between the existing intrinsic is annoying because the new intrinsic uses immarg?)
I did look into whether I could reuse some of the existing TableGen patterns, but I couldn't get them into the right shape to match. As you mentioned, `llvm.is.fpclass` requires the mask to be an immarg, so materializing the immediate into a register anywhere before this function results in a verifier error.
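
For context on what the immediate mask encodes, here is a rough standalone sketch of the `llvm.is.fpclass` test bits as documented in the LangRef (bit 0 is signaling NaN through bit 9 is positive infinity). It is not code from this patch: `classifyF32` and `isFPClass` are hypothetical helpers, the enumerator names are chosen for illustration only, and the sketch assumes `float` is IEEE-754 binary32 with the quiet bit in the top mantissa bit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Test-mask bit layout of llvm.is.fpclass, per the LangRef.
// Enumerator names are illustrative, not taken from the patch.
enum FPClassBits : unsigned {
  fcSNan         = 1u << 0,
  fcQNan         = 1u << 1,
  fcNegInf       = 1u << 2,
  fcNegNormal    = 1u << 3,
  fcNegSubnormal = 1u << 4,
  fcNegZero      = 1u << 5,
  fcPosZero      = 1u << 6,
  fcPosSubnormal = 1u << 7,
  fcPosNormal    = 1u << 8,
  fcPosInf       = 1u << 9,
};

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Hypothetical helper: map a binary32 value to exactly one class bit.
inline unsigned classifyF32(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  const bool Neg = (Bits >> 31) != 0;
  const std::uint32_t Exp = (Bits >> 23) & 0xFFu;
  const std::uint32_t Frac = Bits & 0x7FFFFFu;

  if (Exp == 0xFFu) {
    if (Frac == 0)
      return Neg ? fcNegInf : fcPosInf;
    // Assumption: the top mantissa bit set means a quiet NaN.
    return (Frac & 0x400000u) ? fcQNan : fcSNan;
  }
  if (Exp == 0) {
    if (Frac == 0)
      return Neg ? fcNegZero : fcPosZero;
    return Neg ? fcNegSubnormal : fcPosSubnormal;
  }
  return Neg ? fcNegNormal : fcPosNormal;
}

// Hypothetical helper: true if X falls in any class selected by Mask.
inline bool isFPClass(float X, unsigned Mask) {
  return (classifyF32(X) & Mask) != 0;
}
```

In the IR the mask is always a compile-time constant; the sketch takes it as an ordinary argument only to keep the example short. For instance, `isFPClass(1.0f, fcPosNormal | fcPosInf)` selects the "positive normal or +inf" classes.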

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN:  llc -global-isel=0 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s  | FileCheck --check-prefix=GFX9SELDAG %s
+; RUN:  llc -global-isel=1 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s  | FileCheck --check-prefix=GFX9GLISEL %s
----------------
arsenm wrote:
> Should use some share prefixes, a lot of these functions are the same. Also needs a gfx7 and 8 run lines for the half promotion 
I'm not well versed in how gfx7 should do half promotion. I suspect that either the gfx7 SelectionDAG or the gfx7 GlobalISel half-promotion test output is incorrect (and if not, the SelectionDAG version does seem suboptimal).
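
One thing that may matter for the promoted case, sketched below as standalone C++: a half subnormal is no longer subnormal once it is widened to f32, so a class test on the promoted value cannot reuse the subnormal/normal mask bits unchanged. This is only an illustration of that arithmetic fact, not a claim about what either selector currently emits; `halfBitsToFloat`, `isHalfSubnormal`, and `isSubnormalAfterPromotion` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: widen IEEE-754 binary16 bits to float (exact).
inline float halfBitsToFloat(std::uint16_t H) {
  const int Sign = (H >> 15) & 1;
  const int Exp = (H >> 10) & 0x1F;
  const int Frac = H & 0x3FF;

  float Mag;
  if (Exp == 0x1F)
    Mag = Frac ? NAN : INFINITY;
  else if (Exp == 0)
    // Zero or subnormal: Frac * 2^-24.
    Mag = std::ldexp(static_cast<float>(Frac), -24);
  else
    // Normal: (1 + Frac/1024) * 2^(Exp-15) == (1024 + Frac) * 2^(Exp-25).
    Mag = std::ldexp(static_cast<float>(Frac + 1024), Exp - 25);
  return Sign ? -Mag : Mag;
}

// Subnormal when classified directly on the f16 bit pattern.
inline bool isHalfSubnormal(std::uint16_t H) {
  return ((H >> 10) & 0x1F) == 0 && (H & 0x3FF) != 0;
}

// Subnormal when classified after widening to f32.
inline bool isSubnormalAfterPromotion(std::uint16_t H) {
  return std::fpclassify(halfBitsToFloat(H)) == FP_SUBNORMAL;
}
```

The smallest half subnormal (bit pattern `0x0001`, value 2^-24) is a perfectly normal f32 value, so the two predicates disagree on it.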

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135447/new/

https://reviews.llvm.org/D135447