[PATCH] D103597: [AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 9 00:46:45 PDT 2021


SjoerdMeijer accepted this revision.
SjoerdMeijer added a comment.
This revision is now accepted and ready to land.

LGTM

Perhaps you can elaborate a little bit in the commit message:

> When the order of loads and stores has been optimized during ISel, it prevents LDPs/STPs from being generated, so we attempt to rename registers at an earlier stage so that the instructions can be recognised as a pair in this order.

It's not clear which loads/stores. Perhaps you can expand a little bit on this, and say something along the lines of:

Our initial motivating case was memcpys with an alignment > 16. Several places try to keep together the loads/stores that small memcpys expand to, so that we get a sequence like this for a 64-bit copy:

  LD w0
  LD w1
  ST w0
  ST w1

The load/store optimiser can fold this into `LDP/STP w0, w1` because the registers read/written are consecutive. In our case, however, the load and store chains have been optimised during ISel so that we end up with a sequence like this:

  LD w0
  ST w0
  LD w0
  ST w0

This instruction reordering/scheduling allows reuse of registers, and since the registers of the loads/stores are no longer consecutive (i.e. they are the same), it inhibits LDP/STP creation. The approach here is to perform renaming:

  LD w0
  ST w0
  LD w1
  ST w1

enabling the folding of the stores into an STP. We do not yet generate the LDP due to a limitation in the renaming implementation, but plan to look at that in a follow-up so that we fully support this case. And while this was initially motivated by certain memcpys, this is a general approach and thus beneficial for other cases too, as can be seen in some test changes.
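
To make the renaming step concrete, here is a toy sketch of the idea. It is not the code in AArch64LoadStoreOptimizer.cpp: the `Inst` struct, `renameReusedRegister`, `storesCanPair`, and the assumption that the caller already knows a free register are all hypothetical names and simplifications for illustration. It only models the four-instruction pattern shown above, with two 32-bit stores at offsets 4 bytes apart.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a memory instruction. NOT LLVM's MachineInstr.
struct Inst {
  bool isStore; // true for ST, false for LD
  int reg;      // n in "wn"
  int offset;   // byte offset from a common base address
};

// In this toy model, two stores can be folded into one STP when their
// offsets are adjacent (4 bytes for w registers) and no load between
// them overwrites the register read by the first store.
bool storesCanPair(const std::vector<Inst> &seq, std::size_t first,
                   std::size_t second) {
  if (!seq[first].isStore || !seq[second].isStore)
    return false;
  if (seq[second].offset - seq[first].offset != 4)
    return false;
  for (std::size_t i = first + 1; i < second; ++i)
    if (!seq[i].isStore && seq[i].reg == seq[first].reg)
      return false; // the stored value would be clobbered
  return true;
}

// For the pattern LD a; ST a; LD a; ST a, rewrite the second LD/ST to
// use freeReg (assumed by the caller to be unused in this region).
// Returns true if the sequence was changed.
bool renameReusedRegister(std::vector<Inst> &seq, int freeReg) {
  if (seq.size() != 4)
    return false;
  if (seq[0].isStore || !seq[1].isStore || seq[2].isStore || !seq[3].isStore)
    return false;
  if (seq[0].reg != seq[1].reg || seq[2].reg != seq[3].reg)
    return false;
  if (seq[0].reg != seq[2].reg)
    return false; // registers already differ, nothing to rename
  seq[2].reg = freeReg;
  seq[3].reg = freeReg;
  return true;
}
```

Applied to `LD w0; ST w0; LD w0; ST w0`, `storesCanPair` rejects the two stores before renaming, because the second load overwrites w0, and accepts them after `renameReusedRegister` has moved the second load/store to w1. The real pass additionally has to prove that the chosen register is free and respect aliasing and other ordering constraints, none of which is modelled here.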



================
Comment at: llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:1706
+        // TODO: This is currently supported for STPs, LDPs are not
+        // being generated yet
+        if (TRI->isSuperOrSubRegisterEq(Reg, getLdStRegOp(MI).getReg())) {
----------------
Nit: `yet` -> `yet.`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103597/new/

https://reviews.llvm.org/D103597


