[PATCH] D79792: [AMDGPU] New SIInsertHardClauses pass

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 13 09:44:35 PDT 2020


rampitec added a comment.

In D79792#2033414 <https://reviews.llvm.org/D79792#2033414>, @foad wrote:

> Using gfx10 terminology: SIFormMemoryClauses deals with restartable //groups// but SIInsertHardClauses deals with //hard clauses//. They are similar but not the same. That's what I tried to explain in the big comment at the top of SIInsertHardClauses.cpp. Hard clauses are all about performance. Groups are all about correctness in the presence of XNACK.


SIFormMemoryClauses is an optimization pass. If it does not run, nothing breaks; the clauses will simply be broken up by the hazard recognizer. So in fact it does the same thing.

> shouldClusterMemOps is a heuristic to decide whether loads should be claused for performance. We have to call it from SIInsertHardClauses, so it can tell us not to bother clausing loads that have a different base address.

Actually, the very small limit set by shouldClusterMemOps is driven by register pressure. There seems to be no reason to call it here at all, because this pass runs after RA.
Note that SIFormMemoryClauses does not call it, but uses the RP tracker to maintain occupancy instead.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79792/new/

https://reviews.llvm.org/D79792




