[PATCH] D12116: [AArch64] Improve load/store optimizer to handle LDUR + LDR.

Chad Rosier via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 24 11:10:10 PDT 2015


mcrosier added a comment.

Hi Michael,
So, you're saying that performSTORECombine splits 16B stores that cross cache-line boundaries for performance reasons.  AFAICT, this combine is only enabled for Cyclone.  Later, the AArch64 load/store optimizer runs and recombines these split stores, undoing the optimization performed during ISelLowering.

If I understand things correctly, I think this makes sense.  However, I'd probably implement this in a separate patch and test it on other subtargets (e.g., A57).  Sound reasonable?
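
For context, here is a rough, purely illustrative sketch of the boundary condition that Cyclone combine is trying to avoid; the helper name and the 64-byte line size are my assumptions, not code from this patch:

  #include <cstdint>

  // Hypothetical helper: does an access of Size bytes starting at Addr
  // straddle a cache-line boundary?  Conceptually, this is the situation the
  // split in performSTORECombine is meant to avoid, and what the load/store
  // optimizer would reintroduce if it blindly re-paired the two halves.
  static bool crossesCacheLine(uint64_t Addr, unsigned Size,
                               unsigned LineSize = 64) {
    return (Addr / LineSize) != ((Addr + Size - 1) / LineSize);
  }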


================
Comment at: lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp:455
@@ +454,3 @@
+    PairedOffset =
+        PairedIsUnscaled ? PairedOffset / MemSize : PairedOffset * MemSize;
+  }
----------------
mzolotukhin wrote:
> This looks strange. Is it expected that scaled and unscaled offset differ by `MemSize^2`? (in one case you multiply by `MemSize`, in the other - divide)
My assumption is that the scaled offset is always the unscaled offset divided by MemSize and, conversely, the unscaled offset is always the scaled offset multiplied by MemSize; I believe this is valid.

I.e., 
Unscaled Offset = Scaled Offset * MemSize;
Scaled Offset = Unscaled Offset / MemSize;

What did I miss?  I'm trying to simplify the logic so that both offsets are either scaled or unscaled, but not a mix of the two.  I'm guessing your concern is that my conversion is incorrect.
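
To make the conversion concrete, here is a minimal standalone sketch (not the patch code itself; the helper name and the assert are mine) of the relationship I'm relying on:

  #include <cassert>

  // Hypothetical helper mirroring the relationship above: an unscaled (byte)
  // offset is the scaled offset times the access size, and a scaled offset is
  // the unscaled offset divided by the access size.  The division is only
  // meaningful when the byte offset is a multiple of MemSize.
  static int convertOffset(int Offset, int MemSize, bool IsUnscaled) {
    if (IsUnscaled) {
      assert(Offset % MemSize == 0 &&
             "unscaled offset is not a multiple of the access size");
      return Offset / MemSize; // bytes -> scaled immediate
    }
    return Offset * MemSize;   // scaled immediate -> bytes
  }

For example, with MemSize == 8, an unscaled (LDUR-style) byte offset of 32 corresponds to a scaled (LDR-style) immediate of 4, and vice versa.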




Repository:
  rL LLVM

http://reviews.llvm.org/D12116




