[PATCH] D79792: [AMDGPU] New SIInsertHardClauses pass
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 13 00:30:13 PDT 2020
foad added a comment.
In D79792#2033138 <https://reviews.llvm.org/D79792#2033138>, @rampitec wrote:
> Actually depending on shouldCluster is a problem. First soft clauses are formed and then you turn them into hard clauses. Soft clauses impose higher register pressure but then you simply break them here, essentially wasting registers. It is either both passes should be guided by the same heuristic or none.
I don't understand what you're suggesting. In this pass we have to impose a limit of 64 instructions for correctness, not performance, because that's how the s_clause instruction's operand is encoded.
Perhaps we should teach shouldCluster not to bother clausing more than 64 loads, because an s_clause can't honor a longer clause, but shouldCluster already has a much lower limit than that, so I don't think there's any need to change it.
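To illustrate the correctness limit described above, here is a minimal standalone C++ sketch. It is not the code from the patch: the names kMaxHardClauseLength and splitIntoHardClauses are hypothetical, and it only models the idea that a run of consecutive clausable instructions has to be cut into pieces of at most 64, whatever the scheduler's clustering heuristic produced.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical constant: the 64-instruction limit mentioned in the comment.
constexpr std::size_t kMaxHardClauseLength = 64;

// Split a run of `runLength` consecutive clausable instructions into
// hard-clause lengths, none longer than kMaxHardClauseLength.
// (Illustrative helper only; not an LLVM API.)
std::vector<std::size_t> splitIntoHardClauses(std::size_t runLength) {
  std::vector<std::size_t> clauses;
  while (runLength > 0) {
    std::size_t len = runLength < kMaxHardClauseLength ? runLength
                                                       : kMaxHardClauseLength;
    clauses.push_back(len);
    runLength -= len;
  }
  return clauses;
}
```

Under these assumptions a run of 130 loads would be emitted as three clauses, the first two at the 64-instruction cap and the last holding the remainder, regardless of how the loads were clustered earlier.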
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79792/new/
https://reviews.llvm.org/D79792