[PATCH] D84354: [AMDGPU/MemOpsCluster] Clean-up fixme's around mem ops clustering logic

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 29 04:04:56 PDT 2020


hsmhsm added a comment.

In D84354#2180481 <https://reviews.llvm.org/D84354#2180481>, @rampitec wrote:

> This is fine in principle, but will create too big clusters for sub-dword loads. In case of byte loads you could cluster 32 instructions and that will for sure stall because you would overflow a 4-bit counter. Moreover, a sub-dword load will likely occupy a whole VGPR anyway, so in that scenario you will even consume 32 registers, while you are looking into consuming not more than 8.
>
> In addition to the byte size you also need to limit a number of loads. Our previous experience showed visible benefits up to 5-6 loads. Also note the case of 16 bit loads. Assume you have limited num-loads to 6. Now you would likely need 12 registers to hold it which is still more than 8 you had in mind. How about this:
>
>   LoadSize = NumBytes / NumLoads;
>   NumDWORDs = (LoadSize + 3) / 4;
>   return NumDWORDs <= 8;
>
> You would still have the same logic for wide loads but limit it to 8 loads for sub-dword case.

Thanks for the further insights here, yes, I agree with you. However, I think, the proposed above heuristic still have some problems. Let's take below example. Let's assume there are N load operations, each loads 16 bytes, are put into clustering process. Then below is the scenario:

| NumLoads | NumBytes | LoadSize | NumDWORDs | NumDWORDs <= 8 | Decision |
| 2        | 32       | 16       | 4         | 4 <= 8         | cluster  |
| 3        | 48       | 16       | 4         | 4 <= 8         | cluster  |
| 4        | 64       | 16       | 4         | 4 <= 8         | cluster  |
| N        | 64 * N   | 16       | 4         | 4 <= 8         | cluster  |
|

So, we land-up clustering all the N loads. Is it that you meant `NumDWORDs = ((LoadSize + 3) / 4) * NumLoads` instead of   `NumDWORDs = (LoadSize + 3) / 4`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84354/new/

https://reviews.llvm.org/D84354



More information about the llvm-commits mailing list