[PATCH] D134354: [AMDGPU][GlobalISel] Support mad/fma_mix selection
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 21 06:36:14 PDT 2022
arsenm added a comment.
I'd prefer to split the legalization changes from the patch to enable FMA handling
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:599-601
+ return MI.getOpcode() == AMDGPU::G_BUILD_VECTOR
+ ? false
+ : selectImpl(MI, *CoverageInfo);
----------------
Weird ternary usage. Should use some returns? Also should just consolidate the G_BUILD_VECTOR handling above
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3695
+ // TODO: Should be a combine instead
+ if (MI && MI->getOpcode() == AMDGPU::G_FSUB) {
+ MachineInstr *LHS = getDefIgnoringCopies(MI->getOperand(1).getReg(), *MRI);
----------------
MI cannot be null. We should remove the null checks elsewhere here
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:5070
+ if (Inst->getOpcode() != AMDGPU::G_TRUNC &&
+ Inst->getOpcode() != AMDGPU::G_FPTRUNC)
+ return nullptr;
----------------
FPTRUNC would be ineligible
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:5079-5082
+ MachineInstr *SrlAmount =
+ getDefIgnoringCopies(TruncOp->getOperand(2).getReg(), MRI);
+ if (SrlAmount->getOpcode() == AMDGPU::G_CONSTANT &&
+ SrlAmount->getOperand(1).getCImm()->getZExtValue() == 16) {
----------------
Should use the constant matcher here. There shouldn't be any copies of the constant (although we're still missing a regbank constant optimization)
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:1531
+ .minScalarOrElt(0, S16)
+ // Widen source elements and produce a G_BUILD_VECTOR_TRUNC
+ .minScalar(1, S16);
----------------
Comment needs updating
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1749-1756
+/*
+foreach ext = [sext, zext, anyext] in {
+ def : GCNPat <
+ (i32 (shl (i32 (ext i16:$src)), (i32 16))),
+ (V_LSHLREV_B32_e64 (i32 16), $src)
+ >;
+}
----------------
Commented out code. Plus there's a better way to handle this using PatFrags if we really need this
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mad-mix.ll:1-4
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX900,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx906 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX906,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,VI %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=hawaii -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,CI %s
----------------
This is a copy pasted version of the existing test. I'd assume they can share the same runlines in the same file
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mad-mix.ll:388
+
+; FIXME: Should abe able to select in thits case
+; All sources are converted from f16, so it doesn't matter
----------------
Typo thits
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134354/new/
https://reviews.llvm.org/D134354
More information about the llvm-commits
mailing list