[PATCH] D130481: [RISCV] Add the GlobalMerge pass (disabled by default)

Mon Aug 1 08:53:47 PDT 2022

asb added a comment.

Re EnableGlobalMerge - I'm happy to make it just a bool option for this patch if that's preferable, but the follow-on patch that enables it should really change it back to the tri-value boolOrDefault in my opinion in order to match the equivalent option on AArch64 and Arm (and because the logic takes advantage of all three values). Let me know what you think.

================
Comment at: llvm/test/CodeGen/RISCV/global-merge-offset.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: sed 's/ArrSize/100/g' %s | llc -mtriple=riscv32 -riscv-enable-global-merge \
+; RUN:   -verify-machineinstrs | FileCheck %s
----------------
reames wrote:
> The use of sed here is simply overly complicating things.
> 
> Please add a second test to this file which uses a separate global with an alternate size.  
The use of sed is because it's not very easy to construct a test (in a way that remains easily readable at least) that demonstrates the global merging behaviour.

The file I started with did:

```
@ga1 = dso_local global [510 x i32] zeroinitializer, align 4
@gi1 = dso_local global i32 0, align 4

; TODO: It would be better for codesize if the final store below was
; `sw a0, 0(a2)`.

define void @f_mergeable(i32 %a) nounwind {
; CHECK-LABEL: f_mergeable:
; CHECK:       # %bb.0:
; CHECK-NEXT:    lui a1, %hi(ga1+2040)
; CHECK-NEXT:    sw a0, %lo(ga1+2040)(a1)
; CHECK-NEXT:    lui a1, %hi(.L_MergedGlobals)
; CHECK-NEXT:    sw a0, %lo(.L_MergedGlobals)(a1)
; CHECK-NEXT:    ret
  %ga1_end = getelementptr inbounds [510 x i32], ptr @ga1, i32 0, i64 510
  store i32 %a, ptr %ga1_end, align 4
  store i32 %a, ptr @gi1, align 4
  ret void
}

@ga2 = dso_local global [511 x i32] zeroinitializer, align 4
@gi2 = dso_local global i32 0, align 4

define void @f_unmergeable(i32 %a) nounwind {
; CHECK-LABEL: f_unmergeable:
; CHECK:       # %bb.0:
; CHECK-NEXT:    lui a1, %hi(ga2+2040)
; CHECK-NEXT:    sw a0, %lo(ga2+2040)(a1)
; CHECK-NEXT:    lui a1, %hi(.L_MergedGlobals+4)
; CHECK-NEXT:    sw a0, %lo(.L_MergedGlobals+4)(a1)
; CHECK-NEXT:    ret
  %ga2_end = getelementptr inbounds [511 x i32], ptr @ga2, i32 0, i64 510
  store i32 %a, ptr %ga2_end, align 4
  store i32 %a, ptr @gi2, align 4
  ret void
}
```

But you can see the globals merging pass merged gi1 and gi2, making the test not so useful (room for improved heuristics in the pass too perhaps...).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130481/new/

https://reviews.llvm.org/D130481