[PATCH] D150390: [AMDGPU] Introduce and use the new PRED_COPY opcode
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 17 05:20:45 PDT 2023
Pierre-vh added a comment.
I haven't followed this closely, but how will this interact with GlobalISel? Do we plan to introduce `PRED_COPY` as well after RegBankSelect there, or is it strictly going to be a Post-ISel thing?
If it's not making it in GISel anytime soon, is it somehow possible to ensure it cannot be found in generic MIR? We have a lot of places that check for `::COPY` and if we start having multiple COPY opcodes, I think it could break stuff (e.g. `IgnoreCopies` pattern bit).
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2362-2364
+ if (MI.getOpcode() == AMDGPU::COPY || MI.getOpcode() == AMDGPU::PRED_COPY) {
+ return DestSourcePair{MI.getOperand(0), MI.getOperand(1)};
+ }
----------------
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:7903
+ : TRI->getPhysRegBaseClass(Reg);
+ // const TargetRegisterClass *RC = MRI.getRegClass(Reg);
+ return SIRegisterInfo::isSGPRClass(RC) ? AMDGPU::COPY : AMDGPU::PRED_COPY;
----------------
leftover?
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:3372-3374
+ // PRED_COPY is anyway lowered to COPY so this flag is not necessary
+ // but it messes up with MI.getNumOperands() not returning 2
+ // let Uses = [EXEC];
----------------
Do you have more detail on the failure that's happening when this is used?
Out of correctness it might be nice to have the `implicit $exec`
(note: no strong opinion either so if other reviewers think it's fine as-is, then it's good for me too)
================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:2
+//===-- SISimplifyPredicatedCopies.cpp - Simplify Copies after regalloc
+//--------------===//
+//
----------------
formatting
================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:11-15
+/// Simplify the predicated COPY (PRED_COPY) instructions for various register
+/// classes. AMDGPU target generates PRED_COPY instruction to differentiate WWM
+/// copy from COPY. This pass generates the necessary exec mask manipulation
+/// instructions to replicate 'Whole Wave Mode' and lowers PRED_COPY back to
+/// COPY.
----------------
Small high-level nit: I think "Lowering" is the right term here. It's not simplifying them, it's completely replacing them with something else right?
================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:60-63
+INITIALIZE_PASS_BEGIN(SISimplifyPredicatedCopies, DEBUG_TYPE,
+ "SI Simplify Predicated Copies", false, false)
+INITIALIZE_PASS_END(SISimplifyPredicatedCopies, DEBUG_TYPE,
+ "SI Simplify Predicated Copies", false, false)
----------------
================
Comment at: llvm/lib/Target/AMDGPU/SISimplifyPredicatedCopies.cpp:80-81
+ for (MachineInstr &MI : MBB) {
+ unsigned Opcode = MI.getOpcode();
+ if (Opcode == AMDGPU::PRED_COPY) {
+
----------------
also nit: you could also use `MI.getOpcode()` directly unless you need to check the opcode again later.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D150390/new/
https://reviews.llvm.org/D150390
More information about the llvm-commits
mailing list