[llvm] [RISCV] Enable merging of external globals by default (PR #117880)

Alex Bradbury via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 27 05:34:34 PST 2024


https://github.com/asb created https://github.com/llvm/llvm-project/pull/117880

This follows up #115495 by enabling merging of external globals by default, which had been left as a next step in order to make the previous change more incremental and so we can more easily narrow down on any identified regressions.

Enabling merging of external globals matches what Arm does (for non mach-o targets), though AArch64 doesn't as there were [some concerns](https://reviews.llvm.org/D61947) it might cause regressions in some cases.

Here are the results I have from specrate benchmarks:
|                 | oldmerge mean | oldmerge stdev | newmerge mean | newmerge stdev | abs old vs new change | old vs new change as % |
| --------------- | ------------- | -------------- | ------------- | -------------- | --------------------- | ---------------------- |
| 500.perlbench_r | 159.6         | 0.4            | 159.7         | 0.6            | 0.1                   | 0.06%                  |
| 502.gcc_r       | 408.8         | 2.3            | 409.7         | 2.4            | 0.9                   | 0.22%                  |
| 505.mcf_r       | 619.4         | 3.6            | 620.6         | 4.1            | 1.2                   | 0.19%                  |
| 520.omnetpp_r   | 573.7         | 5              | 569.1         | 2.4            | \-4.6                 | \-0.80%                |
| 525.x264_r      | 183.9         | 0.2            | 184           | 0.4            | 0.1                   | 0.05%                  |
| 531.deepsjeng_r | 280.6         | 0.9            | 280           | 0.7            | \-0.6                 | \-0.21%                |
| 541.leela_r     | 284.6         | 1.1            | 284.5         | 0.9            | \-0.1                 | \-0.04%                |
| 557.xz_r        | 208.8         | 3              | 209.5         | 3.2            | 0.7                   | 0.34%                  |

This is on the Spacemit X60, with essentially rva22+v. omnetpp is the main one that sees a potentially positive change, though there was an annoyingly high standard deviation on that benchmark.

Looking at dhyrstone we see a positive codegen change in Proc0 which translates to a reduced instruction count, though not decreased runtime:

```
$ perf stat -r100 ./dry.oldmerge

 Performance counter stats for './dry.oldmerge' (100 runs):

            815.02 msec task-clock:u                     #    0.999 CPUs utilized            ( +-  0.02% )
                 3      context-switches:u               #    3.681 /sec                     ( +-  2.75% )
                 0      cpu-migrations:u                 #    0.000 /sec
                46      page-faults:u                    #   56.443 /sec
     1,304,052,614      cycles:u                         #    1.600 GHz                      ( +-  0.02% )
     1,601,705,197      instructions:u                   #    1.23  insn per cycle           ( +-  0.00% )
       100,021,180      branches:u                       #  122.728 M/sec                    ( +-  0.00% )
             3,427      branch-misses:u                  #    0.00% of all branches          ( +-  0.68% )

          0.815862 +- 0.000137 seconds time elapsed  ( +-  0.02% )

$ perf stat -r100 ./dry.newmerge

 Performance counter stats for './dry.newmerge' (100 runs):

            815.29 msec task-clock:u                     #    0.999 CPUs utilized            ( +-  0.00% )
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                46      page-faults:u                    #   56.429 /sec
     1,304,461,619      cycles:u                         #    1.600 GHz                      ( +-  0.00% )
     1,501,837,638      instructions:u                   #    1.15  insn per cycle           ( +-  0.00% )
       100,021,173      branches:u                       #  122.697 M/sec                    ( +-  0.00% )
             2,813      branch-misses:u                  #    0.00% of all branches          ( +-  0.82% )

         0.8161623 +- 0.0000465 seconds time elapsed  ( +-  0.01% )
```

>From 0496e16aabede7370a3872fac38e39c69014fe5a Mon Sep 17 00:00:00 2001
From: Alex Bradbury <asb at igalia.com>
Date: Wed, 27 Nov 2024 13:41:15 +0100
Subject: [PATCH] [RISCV] Enable merging of external globals by default

---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp         | 11 ++++-------
 .../RISCV/global-merge-minsize-smalldata-nonzero.ll  |  7 +++----
 .../RISCV/global-merge-minsize-smalldata-zero.ll     |  7 +++----
 llvm/test/CodeGen/RISCV/global-merge-minsize.ll      |  4 ++--
 llvm/test/CodeGen/RISCV/global-merge-offset.ll       |  8 ++++----
 llvm/test/CodeGen/RISCV/global-merge.ll              | 12 ++++++------
 6 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fa507653264ccd..7a2e2a90699553 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -51,12 +51,9 @@ static cl::opt<cl::boolOrDefault>
     EnableGlobalMerge("riscv-enable-global-merge", cl::Hidden,
                       cl::desc("Enable the global merge pass"));
 
-static cl::opt<bool> ForceEnableGlobalMergeExternalGlobals(
-    "riscv-force-enable-global-merge-external-globals", cl::Hidden,
-    cl::init(false),
-    cl::desc(
-        "If the global merge pass is enabled, force enable global merging of "
-        "external globals (overriding any logic that might disable it)"));
+static cl::opt<bool> DisableGlobalMergeExternalGlobals(
+    "riscv-disable-global-merge-external-globals", cl::Hidden, cl::init(false),
+    cl::desc("Disable global merging of external globals"));
 
 static cl::opt<bool>
     EnableMachineCombiner("riscv-enable-machine-combiner",
@@ -484,7 +481,7 @@ bool RISCVPassConfig::addPreISel() {
     addPass(createGlobalMergePass(TM, /* MaxOffset */ 2047,
                                   /* OnlyOptimizeForSize */ false,
                                   /* MergeExternalByDefault */
-                                  ForceEnableGlobalMergeExternalGlobals));
+                                  !DisableGlobalMergeExternalGlobals));
   }
 
   return false;
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
index 156a34db827456..77fe1783bb5d5e 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
@@ -1,8 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv64 -riscv-force-enable-global-merge-external-globals \
-; RUN:   -verify-machineinstrs < %s | FileCheck %s -check-prefix=SMALL-DATA
-; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=0 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefix=SMALL-DATA
+; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=0 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefix=MINSIZE
 
 @ig1 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
index e41f14d394b7ca..c29749c17a5b5c 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
@@ -1,8 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -riscv-force-enable-global-merge-external-globals \
-; RUN:   -verify-machineinstrs < %s | FileCheck %s -check-prefix=SMALL-DATA
-; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=5 \
-; RUN:   -riscv-force-enable-global-merge-external-globals  -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefix=SMALL-DATA
+; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=5 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck %s -check-prefix=MINSIZE
 
 @ig1 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
index 1d65d9d1732ba3..915dde388cffdc 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
@@ -1,8 +1,8 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -riscv-force-enable-global-merge-external-globals \
+; RUN: llc -mtriple=riscv32 \
 ; RUN:   -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32
 ; RUN: llc -mtriple=riscv32 -global-merge-min-data-size=5 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32-MINSIZE
+; RUN:   -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32-MINSIZE
 
 @ig1 = internal global i32 0, align 4
 @ig2 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-offset.ll b/llvm/test/CodeGen/RISCV/global-merge-offset.ll
index bb8264ee438545..c1074bc8ca97ed 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-offset.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-offset.ll
@@ -1,12 +1,12 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: sed 's/ArrSize/100/g' %s | llc -mtriple=riscv32 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s
+; RUN:   -verify-machineinstrs | FileCheck %s
 ; RUN: sed 's/ArrSize/100/g' %s | llc -mtriple=riscv64 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s
+; RUN:   -verify-machineinstrs | FileCheck %s
 ; RUN: sed 's/ArrSize/101/g' %s | llc -mtriple=riscv32 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
+; RUN:   -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
 ; RUN: sed 's/ArrSize/101/g' %s | llc -mtriple=riscv64 \
-; RUN:   -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
+; RUN:   -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
 
 ; This test demonstrates that the MaxOffset is set correctly for RISC-V by
 ; constructing an input that is at the limit and comparing.
diff --git a/llvm/test/CodeGen/RISCV/global-merge.ll b/llvm/test/CodeGen/RISCV/global-merge.ll
index 327a6b54f4be25..fa0dda3736906b 100644
--- a/llvm/test/CodeGen/RISCV/global-merge.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge.ll
@@ -1,11 +1,11 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv32 \
-; RUN:     -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv32 -riscv-disable-global-merge-external-globals \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv64 -riscv-disable-global-merge-external-globals \
+; RUN:     -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=CHECK-WEXTERN %s
-; RUN: llc -mtriple=riscv64 \
-; RUN:     -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=CHECK-WEXTERN %s
 
 @ig1 = internal global i32 0, align 4



More information about the llvm-commits mailing list