<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/119020>119020</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [MC] Compiler produces incorrect DWARF file/line table on some assembly files
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          noxwell
      </td>
    </tr>
</table>

<pre>
#### Repro:

```
echo -e ".section .data\n.asciz \"Hello, World\"" > foo.S
clang -gdwarf-5 -c -o foo.o foo.S
readelf -wl foo.o
```

#### Expected output:
```
The File Name Table (offset 0x2c, lines 1, columns 2):
  Entry Dir     Name
 0      0       (indirect line string, offset: 0x34): foo.S
```

#### Actual output:
```
The File Name Table (offset 0x2e, lines 1, columns 3):
 Entry  Dir     MD5                             Name
  0     0 0x83694a528d2e6a4a6d56c2a58846bf03    (indirect line string, offset: 0x34): /tmp/foo.S
```
Note that the MD5 does not match the checksum of `foo.S`, and that the file name carries a `/tmp/` prefix.

#### What is happening:

Clang compiles `.S` files in two stages. First, `cc1` runs in `-E` mode to preprocess the `.S` file into a `.s` file in the `/tmp` folder. Then `cc1as` assembles the `.s` file into an object file. The two stages run independently and `cc1as` has access only to the temporary file, so it initially fills the DWARF line table with `/tmp/foo.S` and hashes that temporary file. When the file is parsed, `AsmParser` extracts the original file and line information from the preprocessor output; see `AsmParser::parseCppHashLineFilenameComment` and `FirstCppHashFilename`. Finally, before any DWARF information is emitted, `FirstCppHashFilename` is passed to `setMCLineTableRootFile`; see `AsmParser::enabledGenDwarfForAssembly`.

This logic looks good so far, but what if the assembly contains nothing that would emit DWARF information? Sure, `.debug_info` will be empty, but `.debug_line` is still emitted! And because `AsmParser::enabledGenDwarfForAssembly` is never called, `FirstCppHashFilename` never reaches `MCLineTable.RootFile`.
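
To make the role of the line markers concrete, here is a rough shell approximation of picking the file name out of the first `# LINE "FILE"` marker in a preprocessed `.s` file. It is only a sketch of the idea behind `FirstCppHashFilename`, not the LLVM code: the helper name `first_cpp_hash_filename` is hypothetical, and the sample input is a simplified stand-in for real preprocessor output, which contains more markers.

```shell
# Hypothetical helper: print the file name from the first cpp line marker
# (a line of the form: # LINE "FILE") found in the file given as $1.
first_cpp_hash_filename() {
    sed -n 's/^# *[0-9][0-9]* *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Simplified stand-in for a preprocessed temporary file (an assumption,
# not captured clang output).
sample=$(mktemp)
printf '%s\n' '# 1 "foo.S"' '.section .data' '.asciz "Hello, World"' > "$sample"

# Prints the original source name recorded in the first marker.
first_cpp_hash_filename "$sample"
rm -f "$sample"
```

The original name is therefore available in the temporary file itself; the issue is only that it is not applied to the line table root file on this code path.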

#### Why this is a problem:

First of all, a temporary file in the DWARF file table is useless and disorienting.

But there is a bigger problem for reproducible builds: if such assembly files are compiled in different directories, they produce different object files:

```
mkdir test1 test2
: > test1/foo.S
: > test2/foo.S

prebuilts/clang/host/linux-x86/clang-r536225/bin/clang -ffile-prefix-map=$(pwd)= -gdwarf-5 -c -o test1/foo.o test1/foo.S
prebuilts/clang/host/linux-x86/clang-r536225/bin/clang -ffile-prefix-map=$(pwd)= -gdwarf-5 -c -o test2/foo.o test2/foo.S

readelf -wl test*/foo.o
```

This produces the following result:
```
File: test1/foo.o
  0     0 0x66103f858ee572019beb8bddfb603a0b    (indirect line string, offset: 0x1): /tmp/foo.S
File: test2/foo.o
  0     0 0xe51d4e292176a919dc9f8c483f46c442    (indirect line string, offset: 0x1): /tmp/foo.S
```

This is not the case for C files, whose checksums are stable:
```
echo "void foo(void) {}" > test1/foo.c
echo "void foo(void) {}" > test2/foo.c

clang -ffile-prefix-map=$(pwd)= -gdwarf-5 -c -o test1/foo.o test1/foo.c
clang -ffile-prefix-map=$(pwd)= -gdwarf-5 -c -o test2/foo.o test2/foo.c
```
```
readelf -wl test*/foo.o
  0     0 0x92f1d07bfb2675bf4d40f7c48b8e8295    (indirect line string, offset: 0x1): test1/foo.c
  0     0 0x92f1d07bfb2675bf4d40f7c48b8e8295    (indirect line string, offset: 0x1): test2/foo.c
```

We expect the same behavior when compiling assembly files.
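
For checking this in a script, one possible approach is to pull the MD5 fields out of the `readelf -wl` output and count the distinct values. The sketch below assumes the MD5 is printed as `0x` followed by 32 lowercase hex digits, as in the output quoted above; the helper name `md5_fields` is hypothetical, and the input lines are the two assembly entries from this report.

```shell
# Hypothetical helper: print every MD5 field (0x + 32 hex digits) from stdin.
md5_fields() {
    grep -o '0x[0-9a-f]\{32\}'
}

# Feed in the two file-table entries quoted above and count distinct
# checksums; a count above 1 means the entries differ.
printf '%s\n' \
    '  0     0 0x66103f858ee572019beb8bddfb603a0b    (indirect line string, offset: 0x1): /tmp/foo.S' \
    '  0     0 0xe51d4e292176a919dc9f8c483f46c442    (indirect line string, offset: 0x1): /tmp/foo.S' \
    | md5_fields | sort -u | wc -l
```

On real objects the same helper can be fed directly, for example `readelf -wl test*/foo.o | md5_fields | sort -u`.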
</pre>