[PATCH] D134354: [AMDGPU][GlobalISel] Support mad/fma_mix selection

Wed Sep 21 06:36:14 PDT 2022

arsenm added a comment.

I'd prefer to split the legalization changes from the patch to enable FMA handling

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:599-601
+    return MI.getOpcode() == AMDGPU::G_BUILD_VECTOR
+               ? false
+               : selectImpl(MI, *CoverageInfo);
----------------
Weird ternary usage. Should use some returns? Also should just consolidate the G_BUILD_VECTOR handling above

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3695
+  // TODO: Should be a combine instead
+  if (MI && MI->getOpcode() == AMDGPU::G_FSUB) {
+    MachineInstr *LHS = getDefIgnoringCopies(MI->getOperand(1).getReg(), *MRI);
----------------
MI cannot be null. We should remove the null checks elsewhere here

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:5070
+  if (Inst->getOpcode() != AMDGPU::G_TRUNC &&
+      Inst->getOpcode() != AMDGPU::G_FPTRUNC)
+    return nullptr;
----------------
FPTRUNC would be ineligible

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:5079-5082
+    MachineInstr *SrlAmount =
+        getDefIgnoringCopies(TruncOp->getOperand(2).getReg(), MRI);
+    if (SrlAmount->getOpcode() == AMDGPU::G_CONSTANT &&
+        SrlAmount->getOperand(1).getCImm()->getZExtValue() == 16) {
----------------
Should use the constant matcher here. There shouldn't be any copies of the constant (although we're still missing a regbank constant optimization)

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:1531
+        .minScalarOrElt(0, S16)
+        // Widen source elements and produce a G_BUILD_VECTOR_TRUNC
+        .minScalar(1, S16);
----------------
Comment needs updating

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1749-1756
+/*
+foreach ext = [sext, zext, anyext] in {
+  def : GCNPat <
+    (i32 (shl (i32 (ext i16:$src)), (i32 16))),
+    (V_LSHLREV_B32_e64 (i32 16), $src)
+  >;
+}
----------------
Commented out code. Plus there's a better way to handle this using PatFrags if we really need this

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mad-mix.ll:1-4
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX900,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx906 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX906,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,VI %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=hawaii -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,CI %s
----------------
This is a copy pasted version of the existing test. I'd assume they can share the same runlines in the same file

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mad-mix.ll:388
+
+; FIXME: Should abe able to select in thits case
+; All sources are converted from f16, so it doesn't matter
----------------
Typo thits

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134354/new/

https://reviews.llvm.org/D134354