[PATCH] D42854: [AMDGPU] Suppress redundant waitcnt instrs

Mark Searles via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 4 12:02:41 PST 2018


msearles added a comment.

In https://reviews.llvm.org/D42854#997187, @t-tye wrote:

> Can the pass update its internal state while walking the control flow to factor in the consequences of the original waitcnts? That way a decision as to whether a waitcnt is required will take into account these original waitcnts. This means the benefit is obtained regardless of whether the waitcnts are adjacent or separated (even in different BBs).
>
> It seems that a separate pass could be done after the final waitcnts have been decided to collapse adjacent waitcnts into a single one if possible. Or perhaps it would be better to postpone inserting the waitcnts until after the dataflow iteration has found a fixed point, at which time any original/deduced waitcnts can be merged if adjacent.


Yes, all of that can be done. It has been on the TODO list for several months (see the TODO at line 1531). My intent for this patch was only to grab the low-hanging fruit: redundant waitcnt instrs.
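
To make the "collapse adjacent waitcnts" idea from the quoted comment concrete, here is a minimal sketch. It is not code from this patch or from the waitcnt insertion pass. The `Waitcnt` struct and the `mergeAdjacent` and `isRedundant` helpers are hypothetical names introduced only for illustration. The sketch assumes a wait of N on a counter means "wait until that counter is at most N", so a smaller value is a stricter wait, and that no counter-incrementing instruction sits between the two waits being compared.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of an s_waitcnt operand (not the pass's real data
// structure). Each field means "wait until this counter is <= the value",
// so a smaller value is a stricter wait.
struct Waitcnt {
  unsigned VmCnt;
  unsigned ExpCnt;
  unsigned LgkmCnt;
};

// Collapse two adjacent waits into a single wait that is at least as
// strict as both: take the per-counter minimum.
inline Waitcnt mergeAdjacent(const Waitcnt &A, const Waitcnt &B) {
  return {std::min(A.VmCnt, B.VmCnt), std::min(A.ExpCnt, B.ExpCnt),
          std::min(A.LgkmCnt, B.LgkmCnt)};
}

// Returns true if Earlier already guarantees everything Later asks for.
// Assumption: nothing between the two waits increments any counter.
inline bool isRedundant(const Waitcnt &Earlier, const Waitcnt &Later) {
  return Earlier.VmCnt <= Later.VmCnt && Earlier.ExpCnt <= Later.ExpCnt &&
         Earlier.LgkmCnt <= Later.LgkmCnt;
}

// Small self-check showing the intended use of the two helpers.
inline void exampleUsage() {
  Waitcnt Original{0, 7, 3}; // e.g. a wait already present in the input
  Waitcnt Deduced{2, 1, 3};  // e.g. a wait the pass decided it needs
  Waitcnt Merged = mergeAdjacent(Original, Deduced);
  // The merged wait covers both inputs, so either one placed directly
  // after it would be redundant.
  assert(isRedundant(Merged, Original));
  assert(isRedundant(Merged, Deduced));
}
```

In this model, a wait that is redundant with respect to an immediately preceding one can simply be dropped, and two adjacent waits where neither covers the other can be replaced by their merge. Handling waits that are separated by other instructions or that sit in different BBs needs the dataflow state mentioned in the quoted comment, which this sketch does not attempt.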


https://reviews.llvm.org/D42854




