[llvm] [AMDGPU] Generate waterfall for calls with SGPR(inreg) argument (PR #146997)

Thu Jul 24 02:30:39 PDT 2025

================
@@ -1128,6 +1169,45 @@ void SIFixSGPRCopies::lowerVGPR2SGPRCopies(MachineFunction &MF) {
   }
 }
 
+void SIFixSGPRCopies::lowerPysicalSGPRInsts(MachineFunction &MF) {
+  for (auto &Entry : WaterFalls) {
+    MachineInstr *MI = Entry.first;
+    const V2PhysSCopyInfo &Info = Entry.second;
+    assert((Info.MOs.size() != 0 && Info.SGPRs.size() == Info.MOs.size()) &&
+           "Error in MOs or SGPRs size.");
+
+    if (MI->getOpcode() == AMDGPU::SI_CALL_ISEL) {
----------------
jmmartinez wrote:

I don't think `moveToVALUImpl` would be the right place. From my understanding, `moveToVALUImpl` takes a scalar instruction (which produces a uniform value, i.e. the same value for the whole wave, as all scalar instructions do) and replaces it with the equivalent vector instruction. The instruction becomes a vector instruction, but the result is still uniform.

In our case here, we have a `@foo(sgpr)` function, but we're passing it a vector of values. We have to somehow apply `@foo` to every value in the vector. We could iterate over every lane of the VGPR, but the more efficient way is to use a waterfall loop.
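
To make the difference from a per-lane loop concrete, here is a minimal host-side sketch of the waterfall idea. It is only a simulation under stated assumptions, not code from this PR or from the AMDGPU backend: `WaveSize`, `waterfallCalls`, and the 8-lane wave are hypothetical names and sizes chosen for illustration. Each iteration takes the value held by the first active lane, handles every active lane holding that same value at once, and then drops those lanes from the exec mask, so the callee runs once per distinct value rather than once per lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical wave size for illustration; real waves have 32 or 64 lanes.
constexpr int WaveSize = 8;

// Simulates a waterfall loop over a per-lane (VGPR-like) value.
// Returns the uniform values the callee would be invoked with, in order.
std::vector<uint32_t>
waterfallCalls(const std::array<uint32_t, WaveSize> &Vgpr, uint32_t Exec) {
  assert((Exec >> WaveSize) == 0 && "exec mask wider than the wave");
  std::vector<uint32_t> Calls;
  while (Exec != 0) {
    // Analogue of a read-first-lane: value of the lowest active lane.
    int First = 0;
    while (((Exec >> First) & 1u) == 0)
      ++First;
    uint32_t Uniform = Vgpr[First];

    // Collect every active lane that holds the same value.
    uint32_t Match = 0;
    for (int Lane = 0; Lane < WaveSize; ++Lane)
      if (((Exec >> Lane) & 1u) && Vgpr[Lane] == Uniform)
        Match |= 1u << Lane;

    // The call would execute here with only the matching lanes enabled,
    // so the argument is uniform across them.
    Calls.push_back(Uniform);

    // Retire the handled lanes and continue with the rest.
    Exec &= ~Match;
  }
  return Calls;
}
```

With lane values `{7, 7, 3, 7, 3, 9, 9, 7}` and all eight lanes active, the sketch performs three iterations (for 7, 3, and 9), whereas a naive per-lane loop would perform eight.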

---

> legalizeOperands......... If we want to do this, we will have to pass extra parameters to indicate that the CALL need to be modified.

What do you mean by extra parameters? The list of physregs? In `SIInstrInfo::legalizeOperands`, can't we iterate over the registers, check which arguments are scalar copies from vector registers, and call `loadMBUFScalarOperandsFromVGPR`?

https://github.com/llvm/llvm-project/pull/146997