[PATCH] D128252: [AMDGPU] VGPR to SGPR copies lowering

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 12 03:02:19 PDT 2022


foad added a comment.

> [AMDGPU] VGPR to SGPR copies lowering

Needs a better subject line. Previously moving all VGPR uses from SALU to VALU was required for correctness. Now it is not, so this pass is a heuristic that decides whether to move SALU to VALU or to insert a readfirstlane instruction.

In future I would like VGPR-to-SGPR copies to be legal, and always implemented as a readfirstlane, so that this whole pass could be truly optional and we could skip it at -O0.



================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:97
 
-  bool runOnMachineFunction(MachineFunction &MF) override;
+  bool runOnMachineFunction(MachineFunction& MF) override;
+  void lowerVGPR2SGPRCopies(MachineFunction &MF);
----------------
`&` should be after space, not before. Doesn't clang-format fix this? Please run `git clang-format @^` on the patch anyway.


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:99
+  void lowerVGPR2SGPRCopies(MachineFunction &MF);
+  bool LowerSpecialCase(MachineInstr& MI);
 
----------------
Likewise.


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:574
 
+  // Is kept aside to process V2S copies before the rest of the stuff
+  lowerVGPR2SGPRCopies(MF);
----------------
I don't understand this comment.


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1115
+    MachineBasicBlock *MBB = MI->getParent();
+    // We decide to turn V2S copy to v_readfirstlanre_b32
+    // remove it from the V2SCopies and remove it from all its siblings
----------------
Typo readfirstlane


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1119
+      " Score: " << C.second.Score << "\n");
+    uint16_t SubRegs[4] = {AMDGPU::sub0, AMDGPU::sub1, AMDGPU::sub2,
+                            AMDGPU::sub3};
----------------
Should be const. But why is 4 enough? Isn't there some way you can get this programmatically from SIRegisterInfo?


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1126
+    const TargetRegisterClass *SrcRC = TRI->getRegClassForReg(*MRI, SrcReg);
+    if (IsSubReg)
+      SrcRC = TRI->getSubRegClass(SrcRC, SubReg);
----------------
Don't need the "if", you can call getSubRegClass unconditionally.


================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1131
+                          TII->get(AMDGPU::V_READFIRSTLANE_B32), DstReg);
+      if (IsSubReg)
+        MIB.addReg(SrcReg, 0, SubReg);
----------------
Don't need the "if", you can use the subreg version in all cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128252/new/

https://reviews.llvm.org/D128252



More information about the llvm-commits mailing list