[llvm] [WebAssembly] Use DebugValueManager only when subprogram exists (PR #77978)

Heejin Ahn via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 12 12:44:45 PST 2024


https://github.com/aheejin created https://github.com/llvm/llvm-project/pull/77978

We previously scanned the whole BB for `DBG_VALUE` instruction even when the program doesn't have debug info, i.e., the function doesn't have a subprogram associated with it, which can make compilation unnecessarily slow. This disables `DebugValueManager` when a `DISubprogram` doesn't exist for a function.

This only reduces unnecessary work in non-debug mode and does not change output, so it's hard to add a test to test this behavior.

`cfg-stackify-dbg-skip.ll` test change is necessary because its `DISubprogram` was not correctly linked with the function `foo`, so with this PR the compiler incorrectly assumed function `foo` didn't have a subprogram and the test started to fail.

Fixes https://github.com/emscripten-core/emscripten/issues/21048.

---

- Further thoughts: This PR, which disables `DebugValueManager` in non-debug mode solves the immediate problem at hand
(https://github.com/emscripten-core/emscripten/issues/21048), but this means the example in that issue will still take a long time to compile when `-g` is given.

The reason why
https://github.com/emscripten-core/emscripten/blob/main/test/printf/test_printf.c (the example in
https://github.com/emscripten-core/emscripten/issues/21048) took really long time was that file contained a single BB with more than 10,000 instructions, and `WebAssemblyDebugValueManager`'s constructor contains a loop that's O(n^2):
https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp#L32-L40

It was written this way to salvage as many `DBG_VALUE`s as possible. For example, `MachineInstr::collectDebugValues` only collects `DBG_VALUE`s that immediately follow a def:
https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/CodeGen/MachineInstr.cpp#L2315-L2329 So that routine cannot detect something like
```ll
%def = ...
... some other instructions ...
DBG_VALUE %def ...
```
which occurs rather frequently. Our `WebAssemblyDebugValueManager` constructor tried to capture those non-consecutive `DBG_VALUE`s as well, but as a result, we scan the whole BB. This did not cause a significant slowdown on programs with normal BB sizes, but can be a problem with this kind of program that contains giant BBs with thousands of instructions. Actually `DBG_VALUE`s that refer to `%def` can even be in other BBs, which even us cannot capture. `DBG_INSTR_REF` (https://llvm.org/docs/InstrRefDebugInfo.html) can be a better solution for this but we don't use it (yet).

We can probably consider imposing some limits on the number of instructions we can after a def. All this applies to the debug mode compilation, so this is not directly related to what this PR does though.

>From a3bbe28d9fe67f2764edbe02bbcea511f81d1472 Mon Sep 17 00:00:00 2001
From: Heejin Ahn <aheejin at gmail.com>
Date: Fri, 12 Jan 2024 00:41:16 -0800
Subject: [PATCH] [WebAssembly] Use DebugValueManager only when subprogram
 exists

We previously scanned the whole BB for `DBG_VALUE` instruction even when
the program doesn't have debug info, i.e., the function doesn't have a
subprogram associated with it, which can make compilation unnecessarily
slow. This disables `DebugValueManager` when a `DISubprogram` doesn't
exist for a function.

This only reduces unnecessary work in non-debug mode and does not change
output, so it's hard to add a test to test this behavior.

`cfg-stackify-dbg-skip.ll` test change is necessary because its
`DISubprogram` was not correctly linked with the function `foo`, so with
this PR the compiler incorrectly assumed function `foo` didn't have a
subprogram and the test started to fail.

Fixes https://github.com/emscripten-core/emscripten/issues/21048.

---

- Further thoughts:
This PR, which disables `DebugValueManager` in non-debug mode solves the
immediate problem at hand
(https://github.com/emscripten-core/emscripten/issues/21048), but this
means the example in that issue will still take a long time to compile
when `-g` is given.

The reason why
https://github.com/emscripten-core/emscripten/blob/main/test/printf/test_printf.c
(the example in
https://github.com/emscripten-core/emscripten/issues/21048) took really
long time was that file contained a single BB with more than 10,000
instructions, and `WebAssemblyDebugValueManager`'s constructor contains
a loop that's O(n^2):
https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp#L32-L40

It was written this way to salvage as many `DBG_VALUE`s as possible.
For example, `MachineInstr::collectDebugValues` only collects
`DBG_VALUE`s that immediately follow a def:
https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/CodeGen/MachineInstr.cpp#L2315-L2329
So that routine cannot detect something like
```ll
%def = ...
... some other instructions ...
DBG_VALUE %def ...
```
which occurs rather frequently. Our `WebAssemblyDebugValueManager`
constructor tried to capture those non-consecutive `DBG_VALUE`s as well,
but as a result, we scan the whole BB. This did not cause a significant
slowdown on programs with normal BB sizes, but can be a problem with
this kind of program that contains giant BBs with thousands of
instructions. Actually `DBG_VALUE`s that refer to `%def` can even be in
other BBs, which even us cannot capture. `DBG_INSTR_REF`
(https://llvm.org/docs/InstrRefDebugInfo.html) can be a better solution
for this but we don't use it (yet).

We can probably consider imposing some limits on the number of
instructions we can after a def. All this applies to the debug mode
compilation, so this is not directly related to what this PR does
though.
---
 llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp | 4 ++++
 llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll       | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
index fd510f85a8a37a..a2a054127d5f65 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
@@ -17,11 +17,15 @@
 #include "WebAssemblyMachineFunctionInfo.h"
 #include "llvm/CodeGen/MachineInstr.h"
 #include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Function.h"
 
 using namespace llvm;
 
 WebAssemblyDebugValueManager::WebAssemblyDebugValueManager(MachineInstr *Def)
     : Def(Def) {
+  if (!Def->getMF()->getFunction().getSubprogram())
+    return;
+
   // This code differs from MachineInstr::collectDebugValues in that it scans
   // the whole BB, not just contiguous DBG_VALUEs, until another definition to
   // the same register is encountered.
diff --git a/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll b/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
index a10b9bfdc71af3..5a951d9043df29 100644
--- a/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
+++ b/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
@@ -15,7 +15,7 @@
 
 target triple = "wasm32-unknown-unknown"
 
-define void @foo(i64 %arg) {
+define void @foo(i64 %arg) !dbg !37 {
 start:
   %val = trunc i64 %arg to i32
   %cmp = icmp eq i32 %val, 0
@@ -39,7 +39,7 @@ declare void @llvm.dbg.value(metadata, metadata, metadata)
 !22 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "&str", file: !6, size: 64, align: 32, elements: !{}, identifier: "111094d970b097647de579f9c509ef08")
 !33 = !{i32 2, !"Debug Info Version", i32 3}
 !35 = distinct !DILexicalBlock(scope: !37, file: !6, line: 357, column: 8)
-!37 = distinct !DISubprogram(name: "foobar", linkageName: "_fooba", scope: !38, file: !6, line: 353, type: !39, isLocal: true, isDefinition: true, scopeLine: 353, flags: DIFlagPrototyped, isOptimized: true, unit: !0, templateParams: !2, retainedNodes: !42)
+!37 = distinct !DISubprogram(name: "foo", scope: !6, file: !6, line: 353, type: !39, isLocal: true, isDefinition: true, scopeLine: 353, flags: DIFlagPrototyped, isOptimized: true, unit: !0, templateParams: !2, retainedNodes: !42)
 !38 = !DINamespace(name: "ptr", scope: null)
 !39 = !DISubroutineType(types: !2)
 !42 = !{!46}



More information about the llvm-commits mailing list