[PATCH] D103597: [AArch64] Modified AArch64LoadStoreOptimizer to generate STP instructions for memcpys

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 3 02:14:22 PDT 2021


SjoerdMeijer added a comment.

Hi Meera,

Thanks for working on this.

Correct me if I am wrong, but I think the problem here is that the LDP/STP creation is very fragile. The load/store optimiser relies on the loads/stores that form a pair being in order, otherwise it won't recognise them. For loads/stores with a bigger alignment (than usual), there is an optimisation in ISel that optimises the chain and removes the TokenFactor that normally glues these things together, which is a sensible optimisation. There have been some attempts, in different places, to keep the loads/stores for memcpy together, but that is part of the fragile story. Your approach here is to perform renaming earlier for a sequence like this:

  LDR w1
  STR w1
  LDR w1
  STR w1
   

So that we get:

  LDR w1
  STR w1
  LDR w2
  STR w2

and can create a `LDP w1, w2` and `STP w1, w2`.
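
To make the renaming step above concrete, here is a toy sketch in Python. It is not the code in AArch64LoadStoreOptimizer: the function name `rename_reused_registers`, the `(opcode, register)` tuples and the list of spare registers are all hypothetical, and the model ignores addresses, offsets and any liveness outside the sequence. It only shows that giving the second load a fresh register removes the reuse of `w1` that blocks pairing.

```python
from typing import Dict, List, Set, Tuple

# Toy instruction: (opcode, register), e.g. ("LDR", "w1").
# Addresses and offsets are deliberately left out of this model.
Inst = Tuple[str, str]


def rename_reused_registers(insts: List[Inst], spare_regs: List[str]) -> List[Inst]:
    """Give every load whose destination was already used a fresh register.

    Hypothetical helper for illustration only. Each LDR is treated as the
    start of a new live range, and later STRs of the same original register
    are rewritten to the name picked for the most recent load.
    """
    taken: Set[str] = set()        # register names already handed out
    mapping: Dict[str, str] = {}   # original name -> current name
    out: List[Inst] = []
    for op, reg in insts:
        if op == "LDR":
            if reg in taken:
                # Destination is reused: pick the first spare register
                # that has not been handed out yet.
                new_reg = next(r for r in spare_regs if r not in taken)
            else:
                new_reg = reg
            taken.add(new_reg)
            mapping[reg] = new_reg
        out.append((op, mapping.get(reg, reg)))
    return out


def load_destinations_distinct(insts: List[Inst]) -> bool:
    """True if no two loads in the sequence write the same register."""
    dests = [reg for op, reg in insts if op == "LDR"]
    return len(dests) == len(set(dests))


before = [("LDR", "w1"), ("STR", "w1"), ("LDR", "w1"), ("STR", "w1")]
after = rename_reused_registers(before, spare_regs=["w2", "w3"])
```

In this model `before` fails the `load_destinations_distinct` check, while `after` passes it and matches the renamed sequence shown above. Whether the real pass can then form the pair also depends on things this sketch leaves out, such as the offsets being adjacent.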

I think it is worth clarifying some of these things, both in the description of the patch and in a code comment.

Ideally we want to generate a LDP and STP for the test cases that you added. Why do we not yet get the LDP?

Speaking about the tests, I think they should be MIR tests. Then we see better what's going on, like the order of instructions etc.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103597/new/

https://reviews.llvm.org/D103597


