[clang] [DebugInfo] Fix duplicate DIFile when main file is preprocessed (PR #75022)

Fangrui Song via cfe-commits cfe-commits at lists.llvm.org
Sun Dec 10 19:55:15 PST 2023


https://github.com/MaskRay created https://github.com/llvm/llvm-project/pull/75022

When the main file is preprocessed and we change `MainFileName` to the
original source file name (e.g. `a.i => a.c`), the source manager does
not contain `a.c`, but we incorrectly associate the DIFile(a.c) with
md5(a.i). This causes CGDebugInfo::emitFunctionStart to create a
duplicate DIFile and leads to a spurious "inconsistent use of MD5
checksums" warning.

```
% cat a.c
void f() {}
% clang -c -g a.c  # no warning
% clang -E a.c -o a.i && clang -g -S a.i && clang -g -c a.s
a.s:9:2: warning: inconsistent use of MD5 checksums
        .file   1 "a.c"
        ^
% grep DIFile a.ll
!1 = !DIFile(filename: "a.c", directory: "/tmp/c", checksumkind: CSK_MD5, checksum: "c5b2e246df7d5f53e176b097a0641c3d")
!11 = !DIFile(filename: "a.c", directory: "/tmp/c")
% grep 'file.*a.c' a.s
        .file   "a.c"
        .file   0 "/tmp/c" "a.c" md5 0x2d14ea70fee15102033eb8d899914cce
        .file   1 "a.c"
```

Fix #56378 by disassociating md5(a.i) with a.c.


>From ace1e56b3db30466748dc7d8c5ba10ebf4001bd5 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Sun, 10 Dec 2023 19:18:36 -0800
Subject: [PATCH] [DebugInfo] Fix duplicate DIFile when main file is
 preprocessed

When the main file is preprocessed and we change `MainFileName` to the
original source file name (e.g. `a.i => a.c`), the source manager does
not contain `a.c`, but we incorrectly associate the DIFile(a.c) with
md5(a.i). This causes CGDebugInfo::emitFunctionStart to create a
duplicate DIFile and leads to a spurious "inconsistent use of MD5
checksums" warning.

```
% cat a.c
void f() {}
% clang -c -g a.c  # no warning
% clang -E a.c -o a.i && clang -g -S a.i && clang -g -c a.s
a.s:9:2: warning: inconsistent use of MD5 checksums
        .file   1 "a.c"
        ^
% grep DIFile a.ll
!1 = !DIFile(filename: "a.c", directory: "/tmp/c", checksumkind: CSK_MD5, checksum: "c5b2e246df7d5f53e176b097a0641c3d")
!11 = !DIFile(filename: "a.c", directory: "/tmp/c")
% grep 'file.*a.c' a.s
        .file   "a.c"
        .file   0 "/tmp/c" "a.c" md5 0x2d14ea70fee15102033eb8d899914cce
        .file   1 "a.c"
```

Fix #56378 by disassociating md5(a.i) with a.c.
---
 clang/lib/CodeGen/CGDebugInfo.cpp                 | 10 ++++++----
 clang/test/CodeGen/debug-info-preprocessed-file.i |  4 ++++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 7cf661994a29c..37d7a6755d390 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -554,14 +554,16 @@ void CGDebugInfo::CreateCompileUnit() {
     // If the main file name provided is identical to the input file name, and
     // if the input file is a preprocessed source, use the module name for
     // debug info. The module name comes from the name specified in the first
-    // linemarker if the input is a preprocessed source.
+    // linemarker if the input is a preprocessed source. In this case we don't
+    // know the content to compute a checksum.
     if (MainFile->getName() == MainFileName &&
         FrontendOptions::getInputKindForExtension(
             MainFile->getName().rsplit('.').second)
-            .isPreprocessed())
+            .isPreprocessed()) {
       MainFileName = CGM.getModule().getName().str();
-
-    CSKind = computeChecksum(SM.getMainFileID(), Checksum);
+    } else {
+      CSKind = computeChecksum(SM.getMainFileID(), Checksum);
+    }
   }
 
   llvm::dwarf::SourceLanguage LangTag;
diff --git a/clang/test/CodeGen/debug-info-preprocessed-file.i b/clang/test/CodeGen/debug-info-preprocessed-file.i
index 23fd26525e3bf..c8a2307d46c31 100644
--- a/clang/test/CodeGen/debug-info-preprocessed-file.i
+++ b/clang/test/CodeGen/debug-info-preprocessed-file.i
@@ -6,6 +6,10 @@
 # 1 "<built-in>" 2
 # 1 "preprocessed-input.c" 2
 
+/// The main file is preprocessed. We change it to preprocessed-input.c. Since
+/// the content is not available, we don't compute a checksum.
 // RUN: %clang -g -c -S -emit-llvm -o - %s | FileCheck %s
 // CHECK: !DICompileUnit(language: DW_LANG_C{{.*}}, file: ![[FILE:[0-9]+]]
 // CHECK: ![[FILE]] = !DIFile(filename: "/foo/bar/preprocessed-input.c"
+// CHECK-NOT: checksumkind:
+// CHECK-NOT: !DIFile(



More information about the cfe-commits mailing list