[PATCH] D79792: [AMDGPU] New SIInsertHardClauses pass

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 13 01:02:41 PDT 2020


rampitec added a comment.

In D79792#2033282 <https://reviews.llvm.org/D79792#2033282>, @foad wrote:

> In D79792#2033138 <https://reviews.llvm.org/D79792#2033138>, @rampitec wrote:
>
> > Actually depending on shouldCluster is a problem. First soft clauses are formed and then you turn them into hard clauses. Soft clauses impose higher register pressure but then you simply break them here, essentially wasting registers. It is either both passes should be guided by the same heuristic or none.
>
>
> I don't understand what you're suggesting. In this pass we have to impose a limit of 64 instructions for correctness, not performance, because that's how the s_clause instruction's operand is encoded.
>
> Perhaps we should teach shouldCluster not to bother clausing more than 64 loads, because the s_clause won't be able to honor it, but it already has a much lower limit than that, so I don't think there's any need to change it.


In fact it seems clauses are broken every 4 instructions. Which means there is no reason to form up to 15 instructions into a soft clause which will then be broken here. So what I propose: either do not call shallClustetMemOps here, or also call it in SIFormMemoryClauses. I do no really have a preference here, but disagreement results in registers used by soft clauses were used for nothing. Does it make sense? I.e. consider you also have xnack enabled.

JBTW what happens to counters if hard clause exceeds 16 instructions? Every instruction increases the counter and there are only 4 bits available. Do I get it wrong?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79792/new/

https://reviews.llvm.org/D79792





More information about the llvm-commits mailing list