[PATCH] D109131: [GlobalISel] Add a store-merging optimization pass and enable for AArch64.

Amara Emerson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 1 22:42:07 PDT 2021


aemerson created this revision.
aemerson added reviewers: paquette, arsenm, foad, Petar.Avramovic, jroelofs, qcolombet, gargaroff.
aemerson added a project: LLVM.
Herald added subscribers: steven.zhang, jfb, hiraditya, kristof.beyls, rovka, mgorny.
aemerson requested review of this revision.
Herald added a subscriber: wdng.

This is a first attempt at a pass that merges consecutive stores of constant values, a counterpart to the DAGCombiner's store merging optimization.

The high level goals of this pass:

1. Have a simple and efficient algorithm, as close to linear time as we can get. That means prioritizing the scalability of the algorithm over merging every corner case we can find. The DAGCombiner's store merging code has been a source of compile-time and complexity issues in the past, and I wanted to avoid that.
2. Don't introduce any new data structures for ordering memory operations. In MIR, we don't have the concept of chains like we do in the DAG, and the instruction order is stricter than enforcing ordering with graph edges. Although I considered adding something similar, I couldn't justify the overhead.
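
To illustrate the two goals above, here is a minimal standalone sketch of a single linear scan that relies only on program order, with no extra ordering data structure. It is not the code in this patch: `StoreInfo`, `Range` and `findMergeGroups` are hypothetical names, and the "same base, same size, adjacent offset" rule is an assumed simplification of what a real pass has to check.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of a store: which base pointer it uses, the byte
// offset from that base, and how many bytes it writes.
struct StoreInfo {
  int BaseId;
  int64_t Offset;
  unsigned SizeInBytes;
};

// Half-open index range [Begin, End) into the input vector.
struct Range {
  size_t Begin;
  size_t End;
};

// Walk the stores once, in program order, and report every run of two or
// more stores that share a base and size and sit at adjacent offsets.
std::vector<Range> findMergeGroups(const std::vector<StoreInfo> &Stores) {
  std::vector<Range> Groups;
  size_t Begin = 0;
  for (size_t I = 1; I <= Stores.size(); ++I) {
    bool Extends =
        I < Stores.size() &&
        Stores[I].BaseId == Stores[I - 1].BaseId &&
        Stores[I].SizeInBytes == Stores[I - 1].SizeInBytes &&
        Stores[I].Offset == Stores[I - 1].Offset + Stores[I - 1].SizeInBytes;
    if (Extends)
      continue;
    // The current run ended at I; keep it only if it has >= 2 stores.
    if (I - Begin >= 2)
      Groups.push_back({Begin, I});
    Begin = I;
  }
  return Groups;
}
```

Each store is visited exactly once, so the scan is linear in the number of stores. The price is that a run is abandoned as soon as one store does not fit, which is the kind of corner case this approach deliberately gives up on.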

The pass is currently split into 3 main parts. The main store merging code focuses on identifying candidate stores and managing the candidate group that's under consideration for merging. Analyzing the addressing of stores is potentially complex, and for now there's just a basic implementation to identify easy cases. Finally, the other main bit of complexity is the alias analysis, which tries to follow the same logic as the DAG's AA.
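
As a rough illustration of the "easy cases" for address and alias analysis, the sketch below assumes each access has already been decomposed into a base plus a constant byte offset. The `Access` type, the `AliasKind` enum and `aliasSameBase` are hypothetical and are not the API added by this patch. The only idea shown is that two accesses off the same base cannot alias when their byte ranges are disjoint, and that anything else is treated conservatively.

```cpp
#include <cstdint>

// Hypothetical decomposed memory access: base pointer identity, constant
// byte offset from that base, and access size in bytes.
struct Access {
  int BaseId;
  int64_t Offset;
  int64_t SizeInBytes;
};

enum class AliasKind { NoAlias, MayAlias };

// Conservative check: only prove NoAlias when both accesses use the same
// base and their half-open byte ranges [Offset, Offset + Size) are disjoint.
AliasKind aliasSameBase(const Access &A, const Access &B) {
  if (A.BaseId != B.BaseId)
    return AliasKind::MayAlias; // Unknown relationship between the bases.
  bool Overlap = A.Offset < B.Offset + B.SizeInBytes &&
                 B.Offset < A.Offset + A.SizeInBytes;
  return Overlap ? AliasKind::MayAlias : AliasKind::NoAlias;
}
```

A check like this matters when an intervening load or store sits between two merge candidates: the candidates can only be combined if the intervening access is known not to touch the merged range.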

Currently this implementation only supports merging of constant stores. Stores of arbitrary variables are technically possible with a very small change, but the DAG chooses not to do this. Doing so here makes most code worse since there's extra overhead in merging values into wider registers.
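
To make the constant case concrete, the following sketch shows why merging constant stores is cheap: the wide value can be computed up front, so no extra instructions are needed to assemble it. This is only an illustration under stated assumptions: `ConstStore` and `mergeConstants` are hypothetical names, the layout is assumed to be little-endian, and the merged width is assumed to be at most 8 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a constant store: SizeInBytes bytes of Value
// written at byte Offset from a common base pointer.
struct ConstStore {
  int64_t Offset;
  unsigned SizeInBytes;
  uint64_t Value;
};

// Fold stores that are sorted by offset and exactly adjacent into a single
// little-endian constant. Assumes the total width fits in 64 bits.
uint64_t mergeConstants(const std::vector<ConstStore> &Stores) {
  assert(!Stores.empty() && "need at least one store");
  uint64_t Wide = 0;
  unsigned ShiftBits = 0;
  int64_t ExpectedOffset = Stores.front().Offset;
  for (const ConstStore &S : Stores) {
    assert(S.Offset == ExpectedOffset && "stores must be consecutive");
    assert(ShiftBits + 8 * S.SizeInBytes <= 64 && "merged value too wide");
    // Keep only the bytes this store actually writes.
    uint64_t Mask = S.SizeInBytes >= 8
                        ? ~0ULL
                        : ((1ULL << (8 * S.SizeInBytes)) - 1);
    // Lower addresses land in the lower bits on a little-endian target.
    Wide |= (S.Value & Mask) << ShiftBits;
    ShiftBits += 8 * S.SizeInBytes;
    ExpectedOffset += S.SizeInBytes;
  }
  return Wide;
}
```

For example, four one-byte stores of 0x11, 0x22, 0x33 and 0x44 at offsets 0 through 3 fold into the single 32-bit constant 0x44332211. With non-constant values, the same combination would have to be done at run time with shifts and ORs, which is the extra overhead mentioned above.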

On AArch64 -Os, this optimization results in very minor savings on CTMark.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109131

Files:
  llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h
  llvm/include/llvm/InitializePasses.h
  llvm/lib/CodeGen/GlobalISel/CMakeLists.txt
  llvm/lib/CodeGen/GlobalISel/GlobalISel.cpp
  llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
  llvm/test/CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll
  llvm/test/CodeGen/AArch64/GlobalISel/store-merging.ll
  llvm/test/CodeGen/AArch64/GlobalISel/store-merging.mir
  llvm/unittests/CodeGen/GlobalISel/CMakeLists.txt
  llvm/unittests/CodeGen/GlobalISel/GISelAliasTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109131.370164.patch
Type: text/x-patch
Size: 80309 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210902/125de64c/attachment.bin>

