[PATCH] D78814: AMDGPU: Break read2/write2 search range on a memory fence
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 24 09:43:44 PDT 2020
arsenm created this revision.
arsenm added reviewers: tstellar, rampitec, foad, piotr.
Herald added subscribers: kerbowa, jfb, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl.
This is to fix performance regressions introduced by
86c944d790728891801778b8d98c2c65a83f36a5 <https://reviews.llvm.org/rG86c944d790728891801778b8d98c2c65a83f36a5>.
The old search collected every potentially mergeable instruction in
the entire block. In the regressing case, the same address is written
in multiple places in the block, on opposite sides of a fence. Once the
candidates were sorted by offset, the two unmergeable writes to the
identical address ended up next to each other and the merge gave up.
Break the search space when we encounter an instruction we won't be
able to merge across. This will keep the identical addresses in
different merge attempts.
This may also improve compile time by reducing the merge list size.
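
To illustrate the difference between the two search strategies, here is a
minimal standalone sketch. It is not the code from SILoadStoreOptimizer.cpp:
the Inst and Kind types and all function names below are hypothetical, and
instructions are reduced to a kind plus a byte offset from a shared base.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction model; these are not LLVM types.
enum class Kind { LdsWrite, Fence };

struct Inst {
  Kind kind;
  int offset; // byte offset from a shared base address; ignored for fences
};

using MergeList = std::vector<Inst>;

// Old strategy as described above: one candidate list for the whole block.
std::vector<MergeList> collectWholeBlock(const std::vector<Inst> &block) {
  MergeList list;
  for (const Inst &I : block)
    if (I.kind != Kind::Fence)
      list.push_back(I);
  std::vector<MergeList> lists;
  lists.push_back(list);
  return lists;
}

// New strategy: close the current candidate list whenever a fence is seen,
// so instructions on opposite sides of it never share a merge attempt.
std::vector<MergeList> collectSplitAtFences(const std::vector<Inst> &block) {
  std::vector<MergeList> lists;
  MergeList current;
  for (const Inst &I : block) {
    if (I.kind == Kind::Fence) {
      if (!current.empty())
        lists.push_back(current);
      current.clear();
      continue;
    }
    current.push_back(I);
  }
  if (!current.empty())
    lists.push_back(current);
  return lists;
}

// Returns true if, after sorting by offset, two neighbors share an offset.
// This stands in for the situation where the merge gives up.
bool hasAdjacentDuplicate(MergeList list) {
  std::stable_sort(list.begin(), list.end(),
                   [](const Inst &a, const Inst &b) { return a.offset < b.offset; });
  return std::adjacent_find(list.begin(), list.end(),
                            [](const Inst &a, const Inst &b) {
                              return a.offset == b.offset;
                            }) != list.end();
}

// Usage example: two writes, a fence, then writes to the same two offsets.
void runExample() {
  std::vector<Inst> block = {{Kind::LdsWrite, 0}, {Kind::LdsWrite, 4},
                             {Kind::Fence, 0},
                             {Kind::LdsWrite, 0}, {Kind::LdsWrite, 4}};

  // Whole-block search: offsets sort to 0, 0, 4, 4, so duplicates collide.
  assert(hasAdjacentDuplicate(collectWholeBlock(block)[0]));

  // Fence-split search: each list holds offsets 0 and 4 only.
  for (const MergeList &list : collectSplitAtFences(block))
    assert(!hasAdjacentDuplicate(list));
}
```

In this model the split also makes each list half the size of the
whole-block list, which is the kind of reduction the compile-time remark
above refers to.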
https://reviews.llvm.org/D78814
Files:
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
llvm/test/CodeGen/AMDGPU/fence-lds-read2-write2.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D78814.259908.patch
Type: text/x-patch
Size: 8340 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200424/a97c3650/attachment.bin>