[PATCH] D42854: [AMDGPU] Suppress redundant waitcnt instrs

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 3 10:47:54 PST 2018


t-tye added a comment.

Can the pass update its internal state while walking the control flow to factor in the consequences of the original waitcnts? That way a decision as to whether a waitcnt is required will take into account these original waitcnts. This means the benefit is obtained regardless of whether the waitcnts are adjacent or separated (even in different BBs).
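To make the idea concrete, here is a minimal sketch of that kind of state update. It is not the pass's actual data structures: `Counts`, `applyWait`, `isRedundant` and `join` are hypothetical names, and the model assumes only that each counter (vmcnt, expcnt, lgkmcnt) can be tracked as an upper bound on the number of outstanding operations.

```cpp
#include <algorithm>

// Hypothetical model: known upper bounds on outstanding operations per counter.
struct Counts {
  unsigned vm, exp, lgkm;
};

// Effect of an original waitcnt on the tracked state: once it has executed,
// each counter is known to be no larger than the value waited for.
inline Counts applyWait(Counts known, Counts wait) {
  return {std::min(known.vm, wait.vm), std::min(known.exp, wait.exp),
          std::min(known.lgkm, wait.lgkm)};
}

// A waitcnt the pass would like to insert is redundant if the tracked bounds
// already satisfy it for every counter.
inline bool isRedundant(Counts known, Counts required) {
  return known.vm <= required.vm && known.exp <= required.exp &&
         known.lgkm <= required.lgkm;
}

// At a control-flow join, keep the most pessimistic bound from the predecessors.
inline Counts join(Counts a, Counts b) {
  return {std::max(a.vm, b.vm), std::max(a.exp, b.exp),
          std::max(a.lgkm, b.lgkm)};
}
```

With something like this, an original waitcnt in a predecessor block feeds into `join` like any other state change, so it can suppress a later waitcnt whether or not the two are adjacent.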

It seems that a separate pass could be run after the final waitcnts have been decided, to collapse adjacent waitcnts into a single one where possible. Or perhaps it would be better to postpone inserting the waitcnts until after the dataflow iteration has reached a fixed point, at which time any original/deduced waitcnts can be merged if adjacent.
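A rough sketch of the collapsing step, under the assumption that two back-to-back waitcnts are equivalent to one waitcnt carrying the smaller value of each counter. `Instr` and `collapseAdjacentWaits` are hypothetical and stand in for whatever representation the pass really uses; no LLVM API is involved here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical instruction record: either a waitcnt carrying three counter
// values, or any other instruction (counter fields are ignored in that case).
struct Instr {
  bool isWait;
  unsigned vm, exp, lgkm;
};

// Collapse each run of adjacent waitcnts into a single one. Waiting for the
// smaller value of each counter covers both of the original waits.
std::vector<Instr> collapseAdjacentWaits(const std::vector<Instr> &in) {
  std::vector<Instr> out;
  for (const Instr &i : in) {
    if (i.isWait && !out.empty() && out.back().isWait) {
      Instr &prev = out.back();
      prev.vm = std::min(prev.vm, i.vm);
      prev.exp = std::min(prev.exp, i.exp);
      prev.lgkm = std::min(prev.lgkm, i.lgkm);
    } else {
      out.push_back(i);
    }
  }
  return out;
}
```

Running this only after the fixed point is reached means the merge sees both the original and the deduced waitcnts in their final positions.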


https://reviews.llvm.org/D42854




