[PATCH] D54882: [AMDGPU] Add sdwa support for ADD|SUB U64 decomposed Pseudos

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 26 08:49:27 PST 2018


rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:882
+  int Opc = MI.getOpcode();
+  Opc = AMDGPU::getVOPe32(Opc);
+  if (Opc == -1)
----------------
Having a VOP2 pseudo does not necessarily mean there is target VOP2 instruction. I would suggest calling pseudoToMCOpcode() in addition on that VOP2 opcode.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:897
+  Opc = MISucc.getOpcode();
+  Opc = AMDGPU::getVOPe32(Opc);
+  if (Opc == -1)
----------------
Same here.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:900
+    return false;
+  const MCInstrDesc &SDWADesc = TII->get(Opc);
+  MachineOperand *Vdst = TII->getNamedOperand(MISucc, AMDGPU::OpName::vdst);
----------------
This is not SDWADesc, it is just VOP2Desc.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:922
+  // Make the new instruction.
+  auto NewInst =
+    BuildMI(*MISucc.getParent(), MISucc, MISucc.getDebugLoc(), SDWADesc);
----------------
You need to check for modifiers which you are going to drop by this conversion. For instance incoming instruction can be VOP3 OpSel. If it is OpSel you need to check that OpSel modifiers are trivial (e.g. have no effect and equivalent to VOP2). The same about VOP3 modifiers abs and neg.

You also need a mir test to check all negative cases when you cannot fold.


================
Comment at: lib/Target/AMDGPU/SIPeepholeSDWA.cpp:961-962
 
+  } else if (MI.getOpcode() == AMDGPU::V_ADD_I32_e64 ||
+             MI.getOpcode() == AMDGPU::V_SUB_I32_e64) {
+    if (!opConvertedToVOP2(MI, ST))
----------------
arsenm wrote:
> You have code checking for the carry ins, but you don't handle those here
Right. It does not seem to be specific to just these instructions. In general any VOP3 can go through it.


================
Comment at: test/CodeGen/AMDGPU/sdwa-op64-test.ll:1
+; RUN: llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX9,GCN %s
+
----------------
You need to add fiji run line.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54882/new/

https://reviews.llvm.org/D54882





More information about the llvm-commits mailing list