[PATCH] D134354: [AMDGPU][GlobalISel] Support mad/fma_mix selection
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 23 00:13:23 PDT 2022
Pierre-vh added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mad-mix.ll:1-4
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx900 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX900,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=gfx906 -verify-machineinstrs -show-mc-encoding < %s | FileCheck -enable-var-scope -check-prefixes=GCN,GFX906,GFX9 %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,VI %s
+; RUN: llc -global-isel -march=amdgcn -mcpu=hawaii -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN,CIVI,CI %s
----------------
Pierre-vh wrote:
> arsenm wrote:
> > Pierre-vh wrote:
> > > arsenm wrote:
> > > > This is a copy pasted version of the existing test. I'd assume they can share the same runlines in the same file
> > > Some run lines are different unfortunately, see FIXMEs in the test
> > I'd work to eliminate those differences. If it turns out to be difficult, I would probably switch to generated checks and have them share the same file
> The remaining difference fall in the following categories:
> - isCanonicalized has different behaviour between GISel/DAG, so there's v_mac instead of v_mad in a few places since SIFoldOperand doesn't fold
> - there's also v_madak_f32 that's no longer present, I think it's the same cause but I haven't looked into it yet
> - op_sel is on the second v_mad_mix instead of the first, seems like a harmless difference due to how DAG/GISel work?
> - A G_LSHR + G_SHUFFLEVECTOR (1,0) pair isn't folded out. I think those operations negate each other, perhaps a combine should be added for that?
> - Some unfinished things like -[v0| not being picked up yet.
>
> The last 2 definitely have to be fixed, I'll look into them ASAP, but are the first 2 important as well? I'm not sure of what to do with `isCanonicalized`, is there a place where I can find the list of operations that should go in there?
For `-|v0|`, the gMIR looks like this:
```
bb.1 (%ir-block.0):
liveins: $vgpr0, $vgpr1, $vgpr2
%0:_(s32) = COPY $vgpr0
%3:_(s32) = COPY $vgpr1
%1:_(s16) = G_TRUNC %3:_(s32)
%4:_(s32) = COPY $vgpr2
%2:_(s16) = G_TRUNC %4:_(s32)
%8:_(s16) = G_FCONSTANT half 0xH8000
%7:_(<2 x s16>) = G_BUILD_VECTOR %8:_(s16), %8:_(s16)
%5:_(<2 x s16>) = G_BITCAST %0:_(s32)
%6:_(<2 x s16>) = G_FABS %5:_
%18:_(<2 x s16>) = G_FNEG %6:_
%9:_(<2 x s16>) = G_FADD %7:_, %18:_
%19:_(s32) = G_BITCAST %9:_(<2 x s16>)
%20:_(s32) = G_CONSTANT i32 16
%21:_(s32) = G_LSHR %19:_, %20:_(s32)
%17:_(s16) = G_TRUNC %21:_(s32)
%12:_(s32) = G_FPEXT %17:_(s16)
%13:_(s32) = G_FPEXT %1:_(s16)
%14:_(s32) = G_FPEXT %2:_(s16)
%15:_(s32) = G_FMA %12:_, %13:_, %14:_
$vgpr0 = COPY %15:_(s32)
SI_RETURN implicit $vgpr0
```
Could we add a combine to fold `G_FADD (+-)0.0, x` into just `x`?
If we add that and another one to fold ` G_LSHR + G_SHUFFLEVECTOR (1,0)`, it should address most of the remaining differences.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134354/new/
https://reviews.llvm.org/D134354
More information about the llvm-commits
mailing list