[PATCH] D72737: [AMDGPU] Bundle loads before post-RA scheduler
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 15 08:16:30 PST 2020
rampitec added a comment.
In D72737#1821378 <https://reviews.llvm.org/D72737#1821378>, @foad wrote:
> There are nice changes in a bunch of tests, where we're preserving clusters instead of breaking them apart.
>
> But there are also strange changes in some other tests, where the clustering hasn't changed, but some instructions that use the result of a load have moved around. Does this mean we're getting the latency of the load wrong now? (Or were we getting it wrong before?) For example:
>   insert_vector_elt
>   llvm.maxnum.f16.ll
>   saddo.ll
>   sign_extend.ll
We have moved uses of loaded values further from their loads, which is good. As far as I understand, these changes are induced by the removal of the artificial edges which were created by MemOpClusterMutation. These edges linked the successors of any load to all the nodes in a cluster, which restricted the scheduling.
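As a toy illustration of why extra edges restrict scheduling (this is not code from the patch and it does not model bundles, latencies, or the real scheduler heuristics), the sketch below counts the legal orderings of a four-node dependency graph with and without one artificial edge. The node names, the graph, and the exact shape of the artificial edge (a successor of one load made to wait for the other load in the cluster) are assumptions made for the example.

```python
from itertools import permutations


def count_valid_orders(num_nodes, edges):
    """Count orderings of nodes 0..num_nodes-1 that respect every (pred, succ) edge."""
    count = 0
    for order in permutations(range(num_nodes)):
        # Map each node to the slot it occupies in this candidate schedule.
        position = {node: slot for slot, node in enumerate(order)}
        if all(position[pred] < position[succ] for pred, succ in edges):
            count += 1
    return count


# Hypothetical toy DAG: two clustered loads and one use of each result.
LOAD_A, LOAD_B, USE_A, USE_B = range(4)

data_edges = [(LOAD_A, USE_A), (LOAD_B, USE_B)]  # true data dependencies
cluster_edges = [(LOAD_A, LOAD_B)]               # keep the cluster in order
# Assumed artificial edge: the successor of LOAD_A must also wait for LOAD_B.
artificial_edges = [(LOAD_B, USE_A)]

without_artificial = count_valid_orders(4, data_edges + cluster_edges)
with_artificial = count_valid_orders(
    4, data_edges + cluster_edges + artificial_edges
)

print("legal orders without artificial edges:", without_artificial)
print("legal orders with artificial edges:   ", with_artificial)
```

Each added edge can only shrink the set of legal orderings, so the scheduler has fewer placements to choose from for the instructions that consume the loaded values.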
In sign_extend.ll the change comes from store clustering: the v_ashrrev_i32_e32 producing v2 has moved past the v_ashrrev_i32_e32 producing v3, because the store cluster uses them in this order. Previously this was harder to do because of the artificial edges linking all predecessors to all stores.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D72737/new/
https://reviews.llvm.org/D72737