[PATCH] D159480: [Clang][AArch64] Fine-grained ldp and stp policies.

Manos Anagnostakis via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Sep 8 00:54:41 PDT 2023


manosanag added a comment.

Hello Dave,

thanks for replying.

Yes, this is an optimization.

On some AArch64 cores, including Ampere's ampere1 architecture, which this patch targets, load/store pair instructions are faster than individual loads/stores only when the pair is aligned to at least twice the alignment of the individual element being accessed. Based on the performance of various benchmarks, emission of ldp/stp instructions was disabled in GCC at some point (see the discussion at https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615672.html). This patch improves on that by offering control over when the instructions are used.
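
To make the alignment condition concrete, here is a minimal sketch in C. The helper name pair_is_aligned and its parameters are hypothetical and are not taken from the patch; it only illustrates the "at least twice the element alignment" check, assuming power-of-two alignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: returns true when the address
   of a candidate load/store pair is aligned to at least twice the alignment
   of one element, which is the condition described above. */
static bool pair_is_aligned(uintptr_t addr, size_t elem_align)
{
    /* Assumption: alignments are non-zero powers of two (4, 8, 16, ...). */
    assert(elem_align != 0 && (elem_align & (elem_align - 1)) == 0);

    /* Aligned to 2 * elem_align means the low bits below that boundary are zero. */
    return (addr & (2 * elem_align - 1)) == 0;
}
```

Under this sketch, a pair of 8-byte elements at address 0x1000 satisfies the condition, while the same pair at 0x1008 does not, because 0x1008 is only 8-byte aligned.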

A similar patch with the same flags was recently submitted for review on the GCC mailing list (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628590.html).

I have a fix ready for the Fortran regressions shown by autotesting. I can include some of this information in the commit message of the diff.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159480/new/

https://reviews.llvm.org/D159480


