[llvm] b7c14b6 - [Debugify] Add 'acceptance-test' mode for the debugify report script (#147574)

Thu Jul 17 03:40:47 PDT 2025

Author: Stephen Tozer
Date: 2025-07-17T11:40:43+01:00
New Revision: b7c14b6ded300b9190fe0b65881d04c54b2a9fbd

URL: https://github.com/llvm/llvm-project/commit/b7c14b6ded300b9190fe0b65881d04c54b2a9fbd
DIFF: https://github.com/llvm/llvm-project/commit/b7c14b6ded300b9190fe0b65881d04c54b2a9fbd.diff

LOG: [Debugify] Add 'acceptance-test' mode for the debugify report script (#147574)

For the purposes of setting up CI that makes use of debugify, this patch
adds an alternative mode for the llvm-original-di-preservation.py
script, which produces terminal-friendly(-ish) YAML output instead of an
HTML report, and sets the return code to 1 if the input file contains
errors, or 0 if the input file contains no errors or does not exist,
making it simple to use it in CI.
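
As a rough sketch of how a CI step might consume this mode (not part of
the patch): the helper names and the script path below are assumptions
made for illustration, while `--acceptance-test` and `--reduce` are the
flags defined in the diff.

```python
import subprocess
import sys

# Assumed location of the script relative to an llvm-project checkout.
SCRIPT = "llvm/utils/llvm-original-di-preservation.py"


def build_command(json_path, reduce=False):
    """Build the argv for an acceptance-test run of the report script."""
    cmd = [sys.executable, SCRIPT, json_path, "--acceptance-test"]
    if reduce:
        cmd.append("--reduce")
    return cmd


def debugify_passed(json_path, reduce=False):
    """Run the script, echo its YAML-style output, and report pass/fail.

    Only an exit code of 0 counts as a pass; any non-zero code, including
    a crash of the script itself, is treated as a failure.
    """
    result = subprocess.run(
        build_command(json_path, reduce), capture_output=True, text=True
    )
    print(result.stdout, end="")
    return result.returncode == 0
```

A CI job could then call `debugify_passed("debugify-errors.json")` (a
hypothetical file name) and fail the build when it returns False.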

This introduces a small change in existing usage, in that the path for
the HTML report file is now passed with `--report-html-file <path>`
rather than as a positional argument; I could make the argparse logic
work without this change, but I believe that it is simpler to understand
this way, and to my knowledge debugify isn't currently being used in
automated environments where changing this might cause issues. As a
small change while passing by, I also changed `-compress` to
`--reduce`, for consistency.
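
For illustration, the resulting command-line surface can be sketched as
a stand-alone parser. This mirrors the argparse setup in the diff (flag
spellings are taken from there); `make_parser` is a hypothetical helper
name and the help strings are omitted.

```python
import argparse


def make_parser():
    """Stand-alone copy of the option layout, for experimentation only."""
    parser = argparse.ArgumentParser()
    parser.add_argument("file_name", type=str)
    parser.add_argument("--reduce", action="store_true")

    # Exactly one output mode must be chosen: an HTML report or the
    # terminal-oriented acceptance test.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--report-html-file", type=str)
    group.add_argument("--acceptance-test", action="store_true")
    return parser


# New-style invocation: the HTML path is passed via a named option.
html_opts = make_parser().parse_args(
    ["sample.json", "--report-html-file", "sample.html"]
)

# Acceptance-test invocation, with the reduced report enabled.
ci_opts = make_parser().parse_args(["--reduce", "sample.json", "--acceptance-test"])
```

With this layout, the old positional form (`sample.json sample.html`) is
rejected by argparse, because neither member of the required group is
present.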

As a note for reviewers, the reason that we treat a non-existent input
file as a pass is that this is actually the expected state: we use clang
to compile numerous files, passing a filepath for debugify errors. Any
errors found by debugify will be written to this file; if none are
found, the file is untouched. This is also mentioned in a code comment,
but I think it useful to state upfront.
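
A minimal sketch of that early check, assuming it is factored into a
helper (`precheck_input` is a hypothetical name; in the patch the logic
lives inline in `Main()`):

```python
import os


def precheck_input(file_name):
    """Decide the exit code before parsing, if the path alone settles it.

    Returns 1 for a directory (invalid input), 0 for a missing file
    (debugify wrote nothing, so no errors were found), and None when the
    file exists and still has to be parsed.
    """
    if os.path.isdir(file_name):
        print(f"error: Directory passed as input file: '{file_name}'")
        return 1
    if not os.path.exists(file_name):
        print(f"No errors detected for: {file_name}")
        return 0
    return None
```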

Finally, the justification for adding a new mode to this script instead
of adding a separate script for the separate functionality is that this
script understands debugify's output, and performs some deduplication
that is useful for clarifying the resulting output. Writing a new script
would require duplicating logic unnecessarily, and risks the scripts
falling out-of-sync if changes are made to debugify's output.

Added: 
    llvm/test/tools/llvm-original-di-preservation/acceptance-test.test

Modified: 
    llvm/docs/HowToUpdateDebugInfo.rst
    llvm/test/tools/llvm-original-di-preservation/basic.test
    llvm/utils/llvm-original-di-preservation.py

Removed: 
    


################################################################################
diff --git a/llvm/docs/HowToUpdateDebugInfo.rst b/llvm/docs/HowToUpdateDebugInfo.rst
index abe21c6794a8a..915e2896023c5 100644
--- a/llvm/docs/HowToUpdateDebugInfo.rst
+++ b/llvm/docs/HowToUpdateDebugInfo.rst
@@ -504,7 +504,7 @@ as follows:
 
 .. code-block:: bash
 
-  $ llvm-original-di-preservation.py sample.json sample.html
+  $ llvm-original-di-preservation.py sample.json --report-html-file sample.html
 
 Testing of original debug info preservation can be invoked from front-end level
 as follows:

diff --git a/llvm/test/tools/llvm-original-di-preservation/acceptance-test.test b/llvm/test/tools/llvm-original-di-preservation/acceptance-test.test
new file mode 100644
index 0000000000000..0b8c33d24396a
--- /dev/null
+++ b/llvm/test/tools/llvm-original-di-preservation/acceptance-test.test
@@ -0,0 +1,70 @@
+RUN: not %llvm-original-di-preservation %p/Inputs/sample.json --acceptance-test | FileCheck %s
+CHECK:      DILocation Bugs:
+CHECK-NEXT:   test.ll:
+CHECK-NEXT:     no-name:
+CHECK-NEXT:     - action: not-generate
+CHECK-NEXT:       bb_name: no-name
+CHECK-NEXT:       fn_name: fn
+CHECK-NEXT:       instr: extractvalue
+CHECK-NEXT:     - action: not-generate
+CHECK-NEXT:       bb_name: no-name
+CHECK-NEXT:       fn_name: fn
+CHECK-NEXT:       instr: insertvalue
+CHECK-NEXT:     - action: not-generate
+CHECK-NEXT:       bb_name: no-name
+CHECK-NEXT:       fn_name: fn1
+CHECK-NEXT:       instr: insertvalue
+CHECK-NEXT:     - action: not-generate
+CHECK-NEXT:       bb_name: no-name
+CHECK-NEXT:       fn_name: fn1
+CHECK-NEXT:       instr: extractvalue
+CHECK:      Errors detected for:
+
+RUN: not %llvm-original-di-preservation %p/Inputs/sample.json --acceptance-test --reduce | FileCheck %s --check-prefix=COMPRESS
+COMPRESS:      DILocation Bugs:
+COMPRESS-NEXT:   test.ll:
+COMPRESS-NEXT:     no-name:
+COMPRESS-NEXT:     - action: not-generate
+COMPRESS-NEXT:       bb_name: no-name
+COMPRESS-NEXT:       fn_name: fn
+COMPRESS-NEXT:       instr: extractvalue
+COMPRESS-NEXT:     - action: not-generate
+COMPRESS-NEXT:       bb_name: no-name
+COMPRESS-NEXT:       fn_name: fn
+COMPRESS-NEXT:       instr: insertvalue
+COMPRESS:      Errors detected for:
+
+RUN: not %llvm-original-di-preservation %p/Inputs/origin.json --acceptance-test --reduce | FileCheck %s --check-prefix=ORIGIN
+ORIGIN:      DILocation Bugs:
+ORIGIN-NEXT:   test.ll:
+ORIGIN-NEXT:     LoopVectorizePass:
+ORIGIN-NEXT:     - action: not-generate
+ORIGIN-NEXT:       bb_name: no-name
+ORIGIN-NEXT:       fn_name: fn
+ORIGIN-NEXT:       instr: add
+ORIGIN-NEXT:       origin: |
+ORIGIN-NEXT:         Stack Trace 0:
+ORIGIN-NEXT:          #0 0x00005895d035c935 llvm::DbgLocOrigin::DbgLocOrigin(bool) /tmp/llvm-project/llvm/lib/IR/DebugLoc.cpp:22:9
+ORIGIN-NEXT:          #1 0x00005895d03af013 llvm::DILocAndCoverageTracking::DILocAndCoverageTracking() /tmp/llvm-project/llvm/include/llvm/IR/DebugLoc.h:90:11
+ORIGIN-NEXT:          #2 0x00005895d03af013 llvm::DebugLoc::DebugLoc() /tmp/llvm-project/llvm/include/llvm/IR/DebugLoc.h:133:5
+ORIGIN-NEXT:          #3 0x00005895d03af013 llvm::Instruction::Instruction(llvm::Type*, unsigned int, llvm::User::AllocInfo, llvm::InsertPosition) /tmp/llvm-project/llvm/lib/IR/Instruction.cpp:37:14
+ORIGIN-NEXT:          #4 0x00005895d06862b5 llvm::PHINode::PHINode(llvm::Type*, unsigned int, llvm::Twine const&, llvm::InsertPosition) /tmp/llvm-project/llvm/include/llvm/IR/Instructions.h:0:9
+ORIGIN-NEXT:          #5 0x00005895d06862b5 llvm::PHINode::Create(llvm::Type*, unsigned int, llvm::Twine const&, llvm::InsertPosition) /tmp/llvm-project/llvm/include/llvm/IR/Instructions.h:2651:9
+ORIGIN-NEXT:          #6 0x00005895d06862b5 llvm::InstCombinerImpl::foldPHIArgGEPIntoPHI(llvm::PHINode&) /tmp/llvm-project/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp:617:9
+ORIGIN-NEXT:          #7 0x00005895d0688fe0 llvm::InstCombinerImpl::visitPHINode(llvm::PHINode&) /tmp/llvm-project/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp:1456:22
+ORIGIN-NEXT:          #8 0x00005895d05cd21f llvm::InstCombinerImpl::run() /tmp/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5327:22
+ORIGIN-NEXT:          #9 0x00005895d05d067e combineInstructionsOverFunction(llvm::Function&, llvm::InstructionWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::BranchProbabilityInfo*, llvm::ProfileSummaryInfo*, llvm::InstCombineOptions const&) /tmp/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5643:31
+ORIGIN-NEXT:         #10 0x00005895d05cf9a9 llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager&) /tmp/llvm-project/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp:5706:8
+ORIGIN-NEXT:         #11 0x00005895d107d07d llvm::detail::PassModel>::run(llvm::Function&, llvm::AnalysisManager&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:5
+ORIGIN-NEXT:         #12 0x00005895d04204a7 llvm::PassManager>::run(llvm::Function&, llvm::AnalysisManager&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerImpl.h:85:8
+ORIGIN-NEXT:         #13 0x00005895ce4cb09d llvm::detail::PassModel>, llvm::AnalysisManager>::run(llvm::Function&, llvm::AnalysisManager&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:5
+ORIGIN-NEXT:         #14 0x00005895cfae2865 llvm::CGSCCToFunctionPassAdaptor::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:0:38
+ORIGIN-NEXT:         #15 0x00005895ce4cad5d llvm::detail::PassModel, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:5
+ORIGIN-NEXT:         #16 0x00005895cfade813 llvm::PassManager, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:93:12
+ORIGIN-NEXT:         #17 0x00005895d1e3968d llvm::detail::PassModel, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:5
+ORIGIN-NEXT:         #18 0x00005895cfae1224 llvm::DevirtSCCRepeatedPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:0:38
+ORIGIN-NEXT:         #19 0x00005895d1e5067d llvm::detail::PassModel, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /tmp/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:5
+ORIGIN:      Errors detected for:
+
+RUN: %llvm-original-di-preservation %p/Inputs/non-existent.json --acceptance-test | FileCheck %s --check-prefix=EMPTY
+EMPTY: No errors detected for:

diff --git a/llvm/test/tools/llvm-original-di-preservation/basic.test b/llvm/test/tools/llvm-original-di-preservation/basic.test
index 5ef670b42c667..df43fbb3b5b9f 100644
--- a/llvm/test/tools/llvm-original-di-preservation/basic.test
+++ b/llvm/test/tools/llvm-original-di-preservation/basic.test
@@ -1,17 +1,17 @@
-RUN: %llvm-original-di-preservation %p/Inputs/sample.json %t.html | FileCheck %s
+RUN: %llvm-original-di-preservation %p/Inputs/sample.json --report-html-file %t.html | FileCheck %s
 RUN: diff -w %p/Inputs/expected-sample.html %t.html
 CHECK: The {{.+}}.html generated.
 CHECK-NOT: Skipped lines:
 
-RUN: %llvm-original-di-preservation %p/Inputs/corrupted.json %t2.html | FileCheck %s -check-prefix=CORRUPTED
+RUN: %llvm-original-di-preservation %p/Inputs/corrupted.json --report-html-file %t2.html | FileCheck %s -check-prefix=CORRUPTED
 RUN: diff -w %p/Inputs/expected-skipped.html %t2.html
 CORRUPTED: Skipped lines: 3
 CORRUPTED: Skipped bugs: 1
 
-RUN: %llvm-original-di-preservation -compress %p/Inputs/sample.json %t3.html | FileCheck %s -check-prefix=COMPRESSED
+RUN: %llvm-original-di-preservation --reduce %p/Inputs/sample.json --report-html-file %t3.html | FileCheck %s -check-prefix=REDUCE
 RUN: diff -w %p/Inputs/expected-compressed.html %t3.html
-COMPRESSED: The {{.+}}.html generated.
-COMPRESSED-NOT: Skipped lines:
+REDUCE: The {{.+}}.html generated.
+REDUCE-NOT: Skipped lines:
 
-RUN: %llvm-original-di-preservation %p/Inputs/origin.json %t4.html | FileCheck %s
+RUN: %llvm-original-di-preservation %p/Inputs/origin.json --report-html-file %t4.html | FileCheck %s
 RUN: diff -w %p/Inputs/expected-origin.html %t4.html

diff --git a/llvm/utils/llvm-original-di-preservation.py b/llvm/utils/llvm-original-di-preservation.py
index 03793b1136f8d..b5ccd7a3224f8 100755
--- a/llvm/utils/llvm-original-di-preservation.py
+++ b/llvm/utils/llvm-original-di-preservation.py
@@ -11,7 +11,6 @@
 from collections import defaultdict
 from collections import OrderedDict
 
-
 class DILocBug:
     def __init__(self, origin, action, bb_name, fn_name, instr):
         self.origin = origin
@@ -20,18 +19,35 @@ def __init__(self, origin, action, bb_name, fn_name, instr):
         self.fn_name = fn_name
         self.instr = instr
 
-    def __str__(self):
+    def key(self):
         return self.action + self.bb_name + self.fn_name + self.instr
 
+    def to_dict(self):
+        result = {
+            "instr": self.instr,
+            "fn_name": self.fn_name,
+            "bb_name": self.bb_name,
+            "action": self.action,
+        }
+        if self.origin:
+            result["origin"] = self.origin
+        return result
+
 
 class DISPBug:
     def __init__(self, action, fn_name):
         self.action = action
         self.fn_name = fn_name
 
-    def __str__(self):
+    def key(self):
         return self.action + self.fn_name
 
+    def to_dict(self):
+        return {
+            "fn_name": self.fn_name,
+            "action": self.action,
+        }
+
 
 class DIVarBug:
     def __init__(self, action, name, fn_name):
@@ -39,9 +55,41 @@ def __init__(self, action, name, fn_name):
         self.name = name
         self.fn_name = fn_name
 
-    def __str__(self):
+    def key(self):
         return self.action + self.name + self.fn_name
 
+    def to_dict(self):
+        return {
+            "fn_name": self.fn_name,
+            "name": self.name,
+            "action": self.action,
+        }
+
+
+def print_bugs_yaml(name, bugs_dict, indent=2):
+    def get_bug_line(indent_level: int, text: str, margin_mark: bool = False):
+        if margin_mark:
+            return "- ".rjust(indent_level * indent) + text
+        return " " * indent * indent_level + text
+
+    print(f"{name}:")
+    for bugs_file, bugs_pass_dict in sorted(iter(bugs_dict.items())):
+        print(get_bug_line(1, f"{bugs_file}:"))
+        for bugs_pass, bugs_list in sorted(iter(bugs_pass_dict.items())):
+            print(get_bug_line(2, f"{bugs_pass}:"))
+            for bug in bugs_list:
+                bug_dict = bug.to_dict()
+                first_line = True
+                # First item needs a '-' in the margin.
+                for key, val in sorted(iter(bug_dict.items())):
+                    if "\n" in val:
+                        # Output block text for any multiline string.
+                        print(get_bug_line(3, f"{key}: |", first_line))
+                        for line in val.splitlines():
+                            print(get_bug_line(4, line))
+                    else:
+                        print(get_bug_line(3, f"{key}: {val}", first_line))
+                    first_line = False
 
 # Report the bugs in form of html.
 def generate_html_report(
@@ -430,9 +478,16 @@ def get_json_chunk(file, start, size):
 # Parse the program arguments.
 def parse_program_args(parser):
     parser.add_argument("file_name", type=str, help="json file to process")
-    parser.add_argument("html_file", type=str, help="html file to output data")
-    parser.add_argument(
-        "-compress", action="store_true", help="create reduced html report"
+    parser.add_argument("--reduce", action="store_true", help="create reduced report")
+
+    report_type_group = parser.add_mutually_exclusive_group(required=True)
+    report_type_group.add_argument(
+        "--report-html-file", type=str, help="output HTML file for the generated report"
+    )
+    report_type_group.add_argument(
+        "--acceptance-test",
+        action="store_true",
+        help="if set, produce terminal-friendly output and return 0 iff the input file is empty or does not exist",
     )
 
     return parser.parse_args()
@@ -442,10 +497,22 @@ def Main():
     parser = argparse.ArgumentParser()
     opts = parse_program_args(parser)
 
-    if not opts.html_file.endswith(".html"):
+    if opts.report_html_file is not None and not opts.report_html_file.endswith(
+        ".html"
+    ):
         print("error: The output file must be '.html'.")
         sys.exit(1)
 
+    if opts.acceptance_test:
+        if os.path.isdir(opts.file_name):
+            print(f"error: Directory passed as input file: '{opts.file_name}'")
+            sys.exit(1)
+        if not os.path.exists(opts.file_name):
+            # We treat a missing input file as a success, as debugify will generate an output file iff any errors are
+            # found, meaning we expect 0 errors to mean that the expected file does not exist.
+            print(f"No errors detected for: {opts.file_name}")
+            sys.exit(0)
+
     # Use the defaultdict in order to make multidim dicts.
     di_location_bugs = defaultdict(lambda: defaultdict(list))
     di_subprogram_bugs = defaultdict(lambda: defaultdict(list))
@@ -489,9 +556,9 @@ def Main():
                 skipped_lines += 1
                 continue
 
-            di_loc_bugs = di_location_bugs[bugs_file][bugs_pass]
-            di_sp_bugs = di_subprogram_bugs[bugs_file][bugs_pass]
-            di_var_bugs = di_variable_bugs[bugs_file][bugs_pass]
+            di_loc_bugs = di_location_bugs.get(bugs_file, {}).get(bugs_pass, [])
+            di_sp_bugs = di_subprogram_bugs.get(bugs_file, {}).get(bugs_pass, [])
+            di_var_bugs = di_variable_bugs.get(bugs_file, {}).get(bugs_pass, [])
 
             # Omit duplicated bugs.
             di_loc_set = set()
@@ -515,9 +582,9 @@ def Main():
                         skipped_bugs += 1
                         continue
                     di_loc_bug = DILocBug(origin, action, bb_name, fn_name, instr)
-                    if not str(di_loc_bug) in di_loc_set:
-                        di_loc_set.add(str(di_loc_bug))
-                        if opts.compress:
+                    if not di_loc_bug.key() in di_loc_set:
+                        di_loc_set.add(di_loc_bug.key())
+                        if opts.reduce:
                             pass_instr = bugs_pass + instr
                             if not pass_instr in di_loc_pass_instr_set:
                                 di_loc_pass_instr_set.add(pass_instr)
@@ -538,9 +605,9 @@ def Main():
                         skipped_bugs += 1
                         continue
                     di_sp_bug = DISPBug(action, name)
-                    if not str(di_sp_bug) in di_sp_set:
-                        di_sp_set.add(str(di_sp_bug))
-                        if opts.compress:
+                    if not di_sp_bug.key() in di_sp_set:
+                        di_sp_set.add(di_sp_bug.key())
+                        if opts.reduce:
                             pass_fn = bugs_pass + name
                             if not pass_fn in di_sp_pass_fn_set:
                                 di_sp_pass_fn_set.add(pass_fn)
@@ -562,9 +629,9 @@ def Main():
                         skipped_bugs += 1
                         continue
                     di_var_bug = DIVarBug(action, name, fn_name)
-                    if not str(di_var_bug) in di_var_set:
-                        di_var_set.add(str(di_var_bug))
-                        if opts.compress:
+                    if not di_var_bug.key() in di_var_set:
+                        di_var_set.add(di_var_bug.key())
+                        if opts.reduce:
                             pass_var = bugs_pass + name
                             if not pass_var in di_var_pass_var_set:
                                 di_var_pass_var_set.add(pass_var)
@@ -582,19 +649,40 @@ def Main():
                     skipped_bugs += 1
                     continue
 
-            di_location_bugs[bugs_file][bugs_pass] = di_loc_bugs
-            di_subprogram_bugs[bugs_file][bugs_pass] = di_sp_bugs
-            di_variable_bugs[bugs_file][bugs_pass] = di_var_bugs
-
-    generate_html_report(
-        di_location_bugs,
-        di_subprogram_bugs,
-        di_variable_bugs,
-        di_location_bugs_summary,
-        di_sp_bugs_summary,
-        di_var_bugs_summary,
-        opts.html_file,
-    )
+            if di_loc_bugs:
+                di_location_bugs[bugs_file][bugs_pass] = di_loc_bugs
+            if di_sp_bugs:
+                di_subprogram_bugs[bugs_file][bugs_pass] = di_sp_bugs
+            if di_var_bugs:
+                di_variable_bugs[bugs_file][bugs_pass] = di_var_bugs
+
+    if opts.report_html_file is not None:
+        generate_html_report(
+            di_location_bugs,
+            di_subprogram_bugs,
+            di_variable_bugs,
+            di_location_bugs_summary,
+            di_sp_bugs_summary,
+            di_var_bugs_summary,
+            opts.report_html_file,
+        )
+    else:
+        # Pretty(ish) print the detected bugs, but check if any exist first so that we don't print an empty dict.
+        if di_location_bugs:
+            print_bugs_yaml("DILocation Bugs", di_location_bugs)
+        if di_subprogram_bugs:
+            print_bugs_yaml("DISubprogram Bugs", di_subprogram_bugs)
+        if di_variable_bugs:
+            print_bugs_yaml("DIVariable Bugs", di_variable_bugs)
+
+    if opts.acceptance_test:
+        if any((di_location_bugs, di_subprogram_bugs, di_variable_bugs)):
+            # Add a newline gap after printing at least one error.
+            print()
+            print(f"Errors detected for: {opts.file_name}")
+            sys.exit(1)
+        else:
+            print(f"No errors detected for: {opts.file_name}")
 
     if skipped_lines > 0:
         print("Skipped lines: " + str(skipped_lines))