[PATCH] D155991: [DWARF] Make sure file entry for artificial functions has an MD5 checksum

Fri Jul 28 10:13:55 PDT 2023

probinson added a comment.

> Any memory usage measurements to check this doesn't have a significant adverse impact by copying all the strings?

No actual measurements. Intuitively, the copied strings should take not much more space than the filename entries in the .debug_line section for the CU, which should be on the order of KB, not MB, plus the StringMap entry overhead, which is constant per node. So I didn't take the time to measure. But see below for a possible alternative approach that would avoid even that cost.

> Could/should we do the lookup on the CU filename before it goes into the DI metadata, and store that FileID somewhere for later use?

There's a note in issue 63955 to the effect that I can't find an API to turn a filename into a FileID. If there is one that I didn't find, we could use FileIDs instead of pointers to name strings.

> FYI this is a 0.5% compile-time regression on `O0` builds (https://llvm-compile-time-tracker.com/compare.php?from=69593aa5c054cec6be6b822c073ccdc63748a68d&to=7abb5fc618cec66841a8280d2a099a4c9c8cb91b&stat=instructions:u). Is that expected?

That's higher than I expected.

It might be feasible to do this a different way: when an invalid loc comes in, search the DIFileCache for a matching string, and use that instead of the CU's copy of the filename. Then we can go back to using pointers as the keys. That might eliminate the duplicate DIFile as well. I'll look into that. The time cost would then be incurred only when invalid locs come in, which should be a minority of lookups (how many depends on the number of artificial functions generated).
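
A rough, self-contained sketch of that idea, to make the cost trade-off concrete. This is not the actual patch and does not use LLVM types: `FileEntry`, `FileCache`, `getOrCreate`, and `getForInvalidLoc` are hypothetical names, and a `std::unordered_map` stands in for the real cache. It assumes the filename strings outlive the cache, so their addresses are stable enough to use as keys.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a DIFile; not the LLVM class.
struct FileEntry {
  std::string Name;
  bool HasChecksum;
};

// Hypothetical cache keyed by the address of the filename string, so the
// common path is a pointer lookup with no string hashing or comparison.
class FileCache {
  std::unordered_map<const char *, FileEntry> ByPointer;

public:
  // Fast path (valid loc): look up by pointer identity, create on a miss.
  FileEntry &getOrCreate(const char *Name, bool HasChecksum) {
    auto It = ByPointer.find(Name);
    if (It != ByPointer.end())
      return It->second;
    return ByPointer.emplace(Name, FileEntry{Name, HasChecksum}).first->second;
  }

  // Slow path (invalid loc): the CU's copy of the filename lives at a
  // different address, so scan the cache comparing string contents and
  // reuse a matching entry. Only this path pays for string comparisons.
  FileEntry &getForInvalidLoc(const char *CUName) {
    for (auto &KV : ByPointer)
      if (std::strcmp(KV.first, CUName) == 0)
        return KV.second;
    // No match: fall back to an entry for the CU's own copy (no checksum).
    return getOrCreate(CUName, /*HasChecksum=*/false);
  }

  std::size_t size() const { return ByPointer.size(); }
};
```

In this sketch the linear scan is proportional to the number of cached files, but it runs only for invalid locs. A matching entry is reused rather than duplicated, which is the effect hoped for with the duplicate DIFile.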

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155991/new/

https://reviews.llvm.org/D155991