[PATCH] D23364: AMDGPU: Set sizes of spill pseudos

Thu Aug 11 03:29:26 PDT 2016

nhaehnle added a subscriber: nhaehnle.

================
Comment at: lib/Target/AMDGPU/SIInstructions.td:1962-1963
@@ -1959,1 +1961,4 @@
+
+      // (2 * 4) + (8 * num_subregs) bytes maximum
+      let Size = !add(!shl(!srl(vgpr_class.Size, 5), 3), 8);
     }
----------------
I just took a look, and for some reason the reloads tend to look like

        buffer_load_dword v3, off, s[72:75], s70 offset:1444 ; 16-byte Folded Reload
                                        ; encoding: [0xa4,0x05,0x30,0xe0,0x00,0x03,0x12,0x46]
        s_waitcnt vmcnt(0)              ; encoding: [0x70,0x0f,0x8c,0xbf]
        buffer_load_dword v4, off, s[72:75], s70 offset:1448 ; 16-byte Folded Reload
                                        ; encoding: [0xa8,0x05,0x30,0xe0,0x00,0x04,0x12,0x46]
        s_waitcnt vmcnt(0)              ; encoding: [0x70,0x0f,0x8c,0xbf]

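etc., so you actually get 12 bytes per dword: 8 for the buffer_load_dword plus 4 for the s_waitcnt that follows it, rather than the 8 bytes per subregister assumed in the comment above. Not sure if that's a problem, especially since those waits are really wrong anyway (perhaps the wait insertion gets confused by the register/subregister relationship?).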
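
To make the arithmetic concrete, here is a rough sketch in Python (not part of the patch). It compares the size given by the TableGen expression above with the size you get if every dword access is followed by a 4-byte s_waitcnt, as in the listing. The function names are made up for illustration, and the "observed" figure assumes the pattern in the listing holds for every subregister and that nothing else is emitted.

```python
def declared_size_bytes(reg_size_bits: int) -> int:
    """Mirror of the TableGen expression !add(!shl(!srl(Size, 5), 3), 8)."""
    num_subregs = reg_size_bits >> 5   # one subregister per 32 bits
    return (num_subregs << 3) + 8      # 8 bytes per subregister, plus 2 * 4 bytes


def observed_size_bytes(reg_size_bits: int) -> int:
    """Assumption: each dword access is an 8-byte load plus a 4-byte s_waitcnt."""
    num_subregs = reg_size_bits >> 5
    return num_subregs * (8 + 4)


if __name__ == "__main__":
    # Register class sizes in bits; 128 corresponds to the 16-byte reload above.
    for bits in (32, 64, 128, 256, 512):
        print(bits, declared_size_bytes(bits), observed_size_bytes(bits))
```

Under those assumptions, the declared size stops being an upper bound once the register is wider than 64 bits: for the 16-byte reload above it is 40 bytes declared against 48 bytes emitted.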


https://reviews.llvm.org/D23364