[llvm] [AMDGPU] Compiler should synthesize private buffer resource descriptor from flat_scratch_init (PR #79586)

Wed Jan 31 10:09:28 PST 2024

================
@@ -829,11 +834,24 @@ void SIFrameLowering::emitEntryFunctionScratchRsrcRegSetup(
       .addImm(Rsrc23 >> 32)
       .addReg(ScratchRsrcReg, RegState::ImplicitDefine);
   } else if (ST.isAmdHsaOrMesa(Fn)) {
-    assert(PreloadedScratchRsrcReg);
 
-    if (ScratchRsrcReg != PreloadedScratchRsrcReg) {
-      BuildMI(MBB, I, DL, TII->get(AMDGPU::COPY), ScratchRsrcReg)
-          .addReg(PreloadedScratchRsrcReg, RegState::Kill);
+    if (FlatScratchInit) {
+      I = BuildMI(MBB, I, DL, TII->get(AMDGPU::COPY),
+                  TRI->getSubReg(ScratchRsrcReg, AMDGPU::sub0_sub1))
+              .addReg(FlatScratchInit)
+              .addReg(ScratchRsrcReg, RegState::ImplicitDefine);
+      I = BuildMI(MBB, I, DL, TII->get(AMDGPU::S_MOV_B64),
+                  TRI->getSubReg(ScratchRsrcReg, AMDGPU::sub2_sub3))
+              .addImm(0xf0000000)
----------------
alex-t wrote:

From our email conversation:
My question: We have a specialized function, SIInstrInfo::getScratchRsrcWords23, that forms bits 48-127 of the RSRC. Why is it not used for amdhsa? In other words, do we really always have the same constant value in bits 48-127?

Your answer: It is a bit historical. Theoretically we were supposed to be able to control a few of the fields in there, but the runtime never implemented it.

I stepped through this function in the debugger with "-mtriple=amdgcn-amd-amdhsa" and it returns something different from 0xf0000000. At the same time, I have passed PSDB with 0xf0000000 hardcoded.

So, my question is: if it is known that the high bits are ignored, could I just call getScratchRsrcWords23() without revising its logic?
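
To make the question concrete, here is a minimal standalone sketch of how the two halves of the descriptor fit together. It only illustrates the bit packing implied by the hunk above (the low 64 bits copied from flat_scratch_init, the high 64 bits taken from a constant). The names packScratchRsrc and checkExample are hypothetical and are not LLVM API, and the base address is a made-up value. The Words23 argument could be either the hardcoded 0xf0000000 or whatever getScratchRsrcWords23() returns.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): models the 128-bit scratch
// resource descriptor as four 32-bit words, sub0..sub3.
// sub0_sub1 come from the 64-bit flat_scratch_init value,
// sub2_sub3 come from a 64-bit "words 2-3" constant.
std::array<uint32_t, 4> packScratchRsrc(uint64_t FlatScratchInit,
                                        uint64_t Words23) {
  return {static_cast<uint32_t>(FlatScratchInit),        // sub0
          static_cast<uint32_t>(FlatScratchInit >> 32),  // sub1
          static_cast<uint32_t>(Words23),                // sub2
          static_cast<uint32_t>(Words23 >> 32)};         // sub3
}

// Usage example with an assumed, made-up base address and the constant
// that is hardcoded in the patch.
void checkExample() {
  const std::array<uint32_t, 4> Rsrc =
      packScratchRsrc(0x0000123456789abcULL, 0xf0000000ULL);
  assert(Rsrc[0] == 0x56789abcu);
  assert(Rsrc[1] == 0x00001234u);
  assert(Rsrc[2] == 0xf0000000u);
  assert(Rsrc[3] == 0u);
}
```

In this model, swapping the hardcoded constant for the function's return value changes only sub2 and sub3. Whether that difference matters depends on whether the hardware and runtime really ignore those fields.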

https://github.com/llvm/llvm-project/pull/79586