[PATCH] D45199: AArch64: Allow offsets to be folded into addresses with ELF.

Sun Apr 8 21:41:50 PDT 2018

pcc added inline comments.

================
Comment at: llvm/test/CodeGen/AArch64/arm64-atomic-128.ll:32-33

-; CHECK-DAG: stp    [[DEST_REGLO]], [[DEST_REGHI]]
+; CHECK-DAG: str    [[DEST_REGLO]], [{{.*}}, :lo12:var]
+; CHECK-DAG: str    [[DEST_REGHI]], [{{.*}}, :lo12:var+8]
   %val = atomicrmw nand i128* %p, i128 %bits release
----------------
t.p.northover wrote:
> It's a bit of a shame to lose this optimization. Did you look into enhancing AArch64LoadStoreOptimizer.cpp to account for the new instruction shapes?
I hacked on the load/store optimizer to teach it to pair-merge and zero-store-merge instructions with reg+globaladdr operands. It made the pass significantly more complicated (about 100 additional lines) and reduced the size of .text in Chromium for Android by only 1KB (i.e. it fired only about 250 times). So despite the impression one might get from looking at the test cases, such an optimization doesn't seem to pull its weight in real-world code.
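
For reference, the "about 250 times" figure follows from the size delta. A rough sketch of the arithmetic, assuming each successful merge replaces two instructions with one and so removes exactly one fixed-width 4-byte AArch64 instruction from .text (the names kInsnBytes and estimatedMerges are made up for illustration and are not part of the patch):

```cpp
#include <cstdint>

// AArch64 instructions have a fixed 4-byte encoding.
constexpr std::uint64_t kInsnBytes = 4;

// Assumption: every merge (two str -> one stp, or two zero stores -> one
// wider zero store) removes exactly one instruction from .text.
constexpr std::uint64_t estimatedMerges(std::uint64_t bytesSaved) {
  return bytesSaved / kInsnBytes;
}

// 1KB of savings corresponds to 256 merges, i.e. "about 250".
static_assert(estimatedMerges(1024) == 256, "1KB / 4 bytes per instruction");
```

This ignores any secondary effects on code size, such as alignment padding, so it is only an order-of-magnitude check.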

I imagine that in the vast majority of cases, the potentially pair-mergeable instructions use as their base not the address of a global but a parameter or a value loaded from memory. That base will of course be in a register, so those instructions are still pair-mergeable.

I can share the patch if you'd like, but my initial conclusion is that it doesn't seem worth the added complexity to me.

https://reviews.llvm.org/D45199