[PATCH] D150390: [AMDGPU] Introduce and use the new PRED_COPY opcode

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 17 05:20:45 PDT 2023


Pierre-vh added a comment.

I haven't followed this closely, but how will this interact with GlobalISel? Do we plan to introduce `PRED_COPY` as well after RegBankSelect there, or is it strictly going to be a Post-ISel thing?
If it's not making it in GISel anytime soon, is it somehow possible to ensure it cannot be found in generic MIR? We have a lot of places that check for `::COPY` and if we start having multiple COPY opcodes, I think it could break stuff (e.g. `IgnoreCopies` pattern bit).



================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2362-2364
+  if (MI.getOpcode() == AMDGPU::COPY || MI.getOpcode() == AMDGPU::PRED_COPY) {
+    return DestSourcePair{MI.getOperand(0), MI.getOperand(1)};
+  }
----------------



================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:7903
+                                        : TRI->getPhysRegBaseClass(Reg);
+  // const TargetRegisterClass *RC = MRI.getRegClass(Reg);
+  return SIRegisterInfo::isSGPRClass(RC) ? AMDGPU::COPY : AMDGPU::PRED_COPY;
----------------
leftover?


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:3372-3374
+  // PRED_COPY is anyway lowered to COPY so this flag is not necessary
+  // but it messes up with MI.getNumOperands() not returning 2
+  // let Uses = [EXEC];
----------------
Do you have more detail on the failure that's happening when this is used?
Out of correctness it might be nice to have the `implicit $exec`

(note: no strong opinion either so if other reviewers think it's fine as-is, then it's good for me too)


================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:2
+//===-- SISimplifyPredicatedCopies.cpp - Simplify Copies after regalloc
+//--------------===//
+//
----------------
formatting


================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:11-15
+/// Simplify the predicated COPY (PRED_COPY) instructions for various register
+/// classes. AMDGPU target generates PRED_COPY instruction to differentiate WWM
+/// copy from COPY. This pass generates the necessary exec mask manipulation
+/// instructions to replicate 'Whole Wave Mode' and lowers PRED_COPY back to
+/// COPY.
----------------
Small high-level nit: I think "Lowering" is the right term here. It's not simplifying them, it's completely replacing them with something else right?


================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:60-63
+INITIALIZE_PASS_BEGIN(SISimplifyPredicatedCopies, DEBUG_TYPE,
+                      "SI Simplify Predicated Copies", false, false)
+INITIALIZE_PASS_END(SISimplifyPredicatedCopies, DEBUG_TYPE,
+                    "SI Simplify Predicated Copies", false, false)
----------------



================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:80-81
+    for (MachineInstr &MI : MBB) {
+      unsigned Opcode = MI.getOpcode();
+      if (Opcode == AMDGPU::PRED_COPY) {
+
----------------
also nit: you could also use `MI.getOpcode()` directly unless you need to check the opcode again later. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150390/new/

https://reviews.llvm.org/D150390



More information about the llvm-commits mailing list