[llvm] [AArch64] merge index address with large offset into base address (PR #72187)

Thu Nov 30 05:43:55 PST 2023

================
@@ -47,17 +46,16 @@ define void @test2(ptr %struct, i32 %n) {
 ; CHECK:       // %bb.0: // %entry
 ; CHECK-NEXT:    cbz x0, .LBB1_3
 ; CHECK-NEXT:  // %bb.1: // %while_cond.preheader
-; CHECK-NEXT:    mov w8, #40000 // =0x9c40
-; CHECK-NEXT:    mov w9, wzr
-; CHECK-NEXT:    add x8, x0, x8
-; CHECK-NEXT:    cmp w9, w1
+; CHECK-NEXT:    mov w8, wzr
+; CHECK-NEXT:    cmp w8, w1
 ; CHECK-NEXT:    b.ge .LBB1_3
 ; CHECK-NEXT:  .LBB1_2: // %while_body
 ; CHECK-NEXT:    // =>This Inner Loop Header: Depth=1
-; CHECK-NEXT:    str w9, [x8, #4]
-; CHECK-NEXT:    add w9, w9, #1
-; CHECK-NEXT:    str w9, [x8]
-; CHECK-NEXT:    cmp w9, w1
+; CHECK-NEXT:    add x9, x0, #9, lsl #12 // =36864
----------------
davemgreen wrote:

Do you know why this add is not being hoisted out of the loop? It could lead to performance regressions if it stays inside. I think the same thing is happening in the TSVC benchmarks from the llvm-test-suite.

In this case the add is loop-invariant. Could there be other cases where we split a non-loop-invariant add inside a loop, leaving more instructions in the loop? In the original version, the immediate could be moved out of the loop and kept in a register.
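For reference, the arithmetic behind the split in the hunk above can be sketched as follows. This is only an illustration of the numbers in the CHECK lines (`#40000` and `#9, lsl #12 // =36864`), not the PR's implementation: the helper names are hypothetical, and the low part (3136) is derived here rather than shown in the quoted hunk.

```python
# Hypothetical helpers illustrating the offset split; not code from the PR.

def split_offset(offset):
    """Split a byte offset into (hi, lo): hi is a multiple of 4096, lo is below 4096."""
    hi = offset & ~0xFFF   # part that an `add ..., #imm, lsl #12` can supply
    lo = offset & 0xFFF    # remainder, left for the load/store immediate
    return hi, lo

def fits_add_lsl12(hi):
    """ADD (immediate) takes a 12-bit unsigned immediate, optionally shifted left by 12."""
    return hi % 4096 == 0 and 0 <= (hi >> 12) <= 0xFFF

def fits_str_w_uimm(lo):
    """The unsigned-offset form of a 32-bit STR takes a multiple of 4 in 0..16380."""
    return lo % 4 == 0 and 0 <= lo <= 4095 * 4

hi, lo = split_offset(40000)
print(hi, lo)                                        # 36864 3136
print(fits_add_lsl12(hi))                            # True
print(fits_str_w_uimm(lo), fits_str_w_uimm(lo + 4))  # True True
```

When the base is loop-invariant, as `x0` is here, the `base + hi` add is loop-invariant too, which is why it would ideally end up in the preheader rather than in `%while_body`.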

https://github.com/llvm/llvm-project/pull/72187