[PATCH] D159480: [Clang][AArch64] Fine-grained ldp and stp policies.
    Manos Anagnostakis via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Fri Sep  8 00:54:41 PDT 2023
    
    
  
manosanag added a comment.
Hello Dave,
thanks for replying.
Yes, this is an optimization.
On some AArch64 cores, including Ampere's ampere1, which this patch targets, load/store pair instructions are faster than individual loads/stores only when the alignment of the pair is at least twice that of the individual element being loaded or stored. Based on the results of various benchmarks, emission of ldp/stp instructions was disabled in GCC at some point (discussion: https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615672.html). This patch improves on that by offering control over when the instructions are used.
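To make the alignment condition above concrete, here is a minimal sketch of the kind of check such a policy implies. It is not the code from the diff: the enum `PairPolicy`, its values, and the function `shouldFormPair` are hypothetical names chosen for illustration, and the sketch assumes the compiler knows the element size and the alignment of the access in bytes.

```cpp
#include <cstdint>

// Hypothetical policy values, for illustration only; the flag names and
// values actually used by the patch are not reproduced here.
enum class PairPolicy { Always, Never, Aligned };

// Returns true if two adjacent accesses of elemSizeBytes each may be
// combined into one ldp/stp under the given policy.
// knownAlignBytes is the alignment the compiler can prove for the
// address of the first element.
constexpr bool shouldFormPair(PairPolicy policy, std::uint64_t elemSizeBytes,
                              std::uint64_t knownAlignBytes) {
  return policy == PairPolicy::Always  ? true
         : policy == PairPolicy::Never ? false
                                       // Aligned: the pair must be aligned to at
                                       // least twice the element size.
                                       : knownAlignBytes >= 2 * elemSizeBytes;
}

// Two 8-byte elements at a 16-byte-aligned address: pairing is allowed.
static_assert(shouldFormPair(PairPolicy::Aligned, 8, 16), "aligned pair");
// The same elements with only 8-byte alignment: keep separate accesses.
static_assert(!shouldFormPair(PairPolicy::Aligned, 8, 8), "unaligned pair");
```

Under this sketch, an "always" or "never" setting ignores alignment entirely, and only the alignment-sensitive setting applies the twice-the-element-size test.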
A similar patch with the same flags was recently submitted for review on the GCC mailing list (https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628590.html).
I have a fix ready for the Fortran regressions shown by autotesting. I can include some of this information in the commit message of the diff.
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159480/new/
https://reviews.llvm.org/D159480
    
    