[PATCH] D108988: [ARM] Simplify address calculation for NEON load/store

Thu Sep 2 02:27:30 PDT 2021

dmgreen added a comment.

My first thought was why can't this be handled by LSR, but I can see how that might not work very well trying to precisely match VLD offsets. And the tests you added have no loops :)

(I also had some thoughts about whether this was useful in general, or if a sufficiently powerful cpu would break these into microops in either case, leading to the same performance in the end. But the code does look cleaner now, I can see how it would improve things)

The way we handled this in MVE was to "distribute" the increments in the ARMLoadStoreOptimizer pass. The instructions in MVE are different, and that does involve checking through Machine Instructions for Adds that can be better distributed into postinc instructions. LSR got it mostly right, DAG Combine did an OKish job most of the time, and we fixed up what went wrong later in the pipeline.

It seems to have worked out OK as far as I can tell, is there a reason we can't do the same thing here? Adding the new pass seems fine if we need it, but I'm less sanguine about having to disable a lot of Add folds in DAGCombiner.

================
Comment at: llvm/test/CodeGen/ARM/arm-post-indexing-opt.ll:1-2
+; RUN: llc -o - < %s | FileCheck --check-prefix=ASM %s
+; RUN: opt --arm-post-indexing-opt -S -o - < %s | FileCheck --check-prefix=IR %s
+
----------------
It is best not to mix llc and opt test files. They are easier kept as separate tests, with autogenerated check lines in each.
It's also good to show before and after in the review by pre-committing the tests, which makes the review easier by making it obvious what has changed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108988/new/

https://reviews.llvm.org/D108988