[PATCH] D128252: [AMDGPU] VGPR to SGPR copies lowering
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 12 03:02:19 PDT 2022
foad added a comment.
> [AMDGPU] VGPR to SGPR copies lowering
Needs a better subject line. Previously moving all VGPR uses from SALU to VALU was required for correctness. Now it is not, so this pass is a heuristic that decides whether to move SALU to VALU or to insert a readfirstlane instruction.
In future I would like VGPR-to-SGPR copies to be legal, and always implemented as a readfirstlane, so that this whole pass could be truly optional and we could skip it at -O0.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:97
- bool runOnMachineFunction(MachineFunction &MF) override;
+ bool runOnMachineFunction(MachineFunction& MF) override;
+ void lowerVGPR2SGPRCopies(MachineFunction &MF);
----------------
`&` should be after space, not before. Doesn't clang-format fix this? Please run `git clang-format @^` on the patch anyway.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:99
+ void lowerVGPR2SGPRCopies(MachineFunction &MF);
+ bool LowerSpecialCase(MachineInstr& MI);
----------------
Likewise.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:574
+ // Is kept aside to process V2S copies before the rest of the stuff
+ lowerVGPR2SGPRCopies(MF);
----------------
I don't understand this comment.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1115
+ MachineBasicBlock *MBB = MI->getParent();
+ // We decide to turn V2S copy to v_readfirstlanre_b32
+ // remove it from the V2SCopies and remove it from all its siblings
----------------
Typo readfirstlane
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1119
+ " Score: " << C.second.Score << "\n");
+ uint16_t SubRegs[4] = {AMDGPU::sub0, AMDGPU::sub1, AMDGPU::sub2,
+ AMDGPU::sub3};
----------------
Should be const. But why is 4 enough? Isn't there some way you can get this programmatically from SIRegisterInfo?
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1126
+ const TargetRegisterClass *SrcRC = TRI->getRegClassForReg(*MRI, SrcReg);
+ if (IsSubReg)
+ SrcRC = TRI->getSubRegClass(SrcRC, SubReg);
----------------
Don't need the "if", you can call getSubRegClass unconditionally.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1131
+ TII->get(AMDGPU::V_READFIRSTLANE_B32), DstReg);
+ if (IsSubReg)
+ MIB.addReg(SrcReg, 0, SubReg);
----------------
Don't need the "if", you can use the subreg version in all cases.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128252/new/
https://reviews.llvm.org/D128252
More information about the llvm-commits
mailing list