[PATCH] D135447: [AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU
Janek van Oirschot via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 8 07:48:19 PST 2022
JanekvO added inline comments.
================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2320
+ case Intrinsic::is_fpclass: {
+ unsigned Flags = MachineInstr::copyFlagsFromInstruction(CI);
+
----------------
arsenm wrote:
> This should get an IRTranslator test to make sure the flags are passed through
I'm not sure my added test completely hits the mark: as far as I can tell, not every flag can be attached here (e.g., the `nnan` flag didn't work because it requires an FP return type, and `llvm.is.fpclass` returns `i1`). For now I've added flag-related tests that explicitly check that `nofpexcept` is added. Do let me know if something is missing, or whether this `copyFlagsFromInstruction` call is better omitted.
================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2332
+ .addImm(TestMaskValue.getZExtValue())
+ .addImm((unsigned)APFloat::SemanticsToEnum(FpSem))
+ .setMIFlag(MachineInstr::NoFPExcept);
----------------
arsenm wrote:
> Do you really need the float type operand? I know bfloat16 isn't going to work without it, but I thought the plan was to introduce FP types to LLT
I believe it's not necessary for AMDGPU, but it is required by the `G_IS_FPCLASS` target opcode; leaving it out results in verifier errors. (I'm also not aware of the plan to introduce FP types to LLT.)
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:870
+
+bool AMDGPUInstructionSelector::selectG_IS_FPCLASS(MachineInstr &I) const {
+ MachineBasicBlock *BB = I.getParent();
----------------
arsenm wrote:
> I don't see why you need to manually select this (maybe sharing the pattern between the existing intrinsic is annoying because the new intrinsic uses immarg?)
I did look into whether I could reuse some of the existing TableGen patterns, but I couldn't get them into the right shape to match. As you mentioned, `llvm.is.fpclass` requires the mask to be an immarg, so materializing the immediate into a register anywhere before this function results in a verifier error.
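
For readers less familiar with the mask operand discussed here, the following is a minimal standalone sketch of what an `llvm.is.fpclass` test computes for a 32-bit float. It is not code from this patch: the bit positions follow the LLVM LangRef description of the intrinsic, while the enumerator names and the helpers `classifyFloat` and `isFPClass` are hypothetical and only illustrate the semantics (classify the value, then AND the resulting class bit with the constant mask).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Test-mask bits of llvm.is.fpclass as listed in the LLVM LangRef.
// The enumerator names are illustrative, not taken from the patch.
enum FPClassBit : uint32_t {
  SNan = 1u << 0,         // signaling NaN
  QNan = 1u << 1,         // quiet NaN
  NegInf = 1u << 2,       // negative infinity
  NegNormal = 1u << 3,    // negative normal
  NegSubnormal = 1u << 4, // negative subnormal
  NegZero = 1u << 5,      // negative zero
  PosZero = 1u << 6,      // positive zero
  PosSubnormal = 1u << 7, // positive subnormal
  PosNormal = 1u << 8,    // positive normal
  PosInf = 1u << 9,       // positive infinity
  AllClasses = (1u << 10) - 1,
};

// Hypothetical helper: return the single class bit that describes an
// IEEE-754 binary32 value.
inline uint32_t classifyFloat(float X) {
  uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  const bool Neg = (Bits >> 31) != 0;
  switch (std::fpclassify(X)) {
  case FP_NAN:
    // A quiet NaN has the most significant mantissa bit set.
    return (Bits & 0x00400000u) ? QNan : SNan;
  case FP_INFINITE:
    return Neg ? NegInf : PosInf;
  case FP_ZERO:
    return Neg ? NegZero : PosZero;
  case FP_SUBNORMAL:
    return Neg ? NegSubnormal : PosSubnormal;
  default:
    return Neg ? NegNormal : PosNormal;
  }
}

// Hypothetical helper: true if X belongs to any class selected by Mask.
// In the intrinsic the mask is a compile-time constant (immarg).
inline bool isFPClass(float X, uint32_t Mask) {
  assert((Mask & ~AllClasses) == 0 && "only the low 10 bits are defined");
  return (classifyFloat(X) & Mask) != 0;
}
```

Because the mask is always a constant in the IR, a call such as `isFPClass(X, PosInf | NegInf)` corresponds to a fixed "is infinite" query, which is why the mask can be carried as an immediate operand rather than in a register.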
================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefix=GFX9SELDAG %s
+; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx908 -verify-machineinstrs < %s | FileCheck --check-prefix=GFX9GLISEL %s
----------------
arsenm wrote:
> Should use some shared prefixes, a lot of these functions are the same. Also needs gfx7 and gfx8 run lines for the half promotion
I'm not that well versed in how gfx7 should do half promotion. I suspect that either the gfx7 SelectionDAG or the gfx7 GlobalISel half promotion tests are incorrect (and if not, the SelectionDAG version does seem suboptimal).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135447/new/
https://reviews.llvm.org/D135447