[llvm] [dsymutil] Add option to filter debug map objects by allowlist (PR #182083)

Greg Clayton via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 18 12:09:08 PST 2026


clayborg wrote:


> I wanted to look at the test file to understand the format of this file. I would have assumed it's a YAML file, like the debug map, but this seems to suggest otherwise. It's also not clear to me, based on this, if this is about filtering out objects or symbols. The description says symbols, but this and the name of the option suggest objects?
> 
> > * `--oso-prepend-path` does not apply to the paths in the allowlist.

It is meant to include full N_OSO entries based on a given path. Only N_OSO entries whose path matches one of the paths in the list would be included in the output dSYM file. This way we don't need to mention any symbols, just which N_OSO entries we want to include. 

> Why?

The issue is that for very large projects (like WhatsApp, Instagram and Facebook), if we tell llvm-dsymutil to produce a YAML debug map file, it outputs a very large file with many GBs of text, since it includes the .o file paths and each symbol and its name. The binaries are huge to begin with, and if we produce debug info for the entire app, debugging can slow down. Meta has ways to focus on certain targets where we want to debug only some parts of the application, and not others. These parts belong to a large binary, and breaking them up into shared libraries introduces runtime performance issues due to shared library resolution and the dynamic loader. We also have a caching build system, so everything is cached with full debug info: for any file that hasn't been modified, we have a cached .o file and don't need to compile it, we just get it from our build system cache. We are looking for quick ways to produce a minimal dSYM file.

Previous approaches we tried:
- Scribble over the paths to the .o files in the debug map of the main executable so they are invalid, which avoids linking any info from those .o files. dsymutil skips any .o file that it can't open, so we used that as a way to reduce the debug info in the dSYM. Running dsymutil on this binary produces a minimal dSYM, as any .o file whose path has been changed will not be found and processed. The issue here is that the binary gets modified and then needs to be re-codesigned after such a modification.
- Use the YAML debug map and edit the YAML. This works, but it is quite slow, as creating and modifying a very large YAML file is time consuming. The YAML is also very large due to all of the .o file and symbol names. We spent so much time modifying the YAML and trying to use it that it wasn't a win compared to just scribbling over the .o file paths as in the first approach and re-codesigning.

For our focus, we can think about this from a cmake + ninja perspective where we could focus on only certain cmake targets, and then only generate debug info for the targets we want. Many cmake targets in llvm/lldb are static libraries, and many binaries are created from many different static libraries. We can turn debug info on and off for each target, even if that target is not an actual shared library or executable. We can determine the .o files that contribute to each cmake target, and then try to make a dSYM that contains only what we are focused on.
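As a rough illustration of the target-to-.o-file step, here is a minimal Python sketch. The `TARGET_OBJECTS` mapping, the target names, the paths and the `objects_for_focus` helper are all hypothetical; a real build system would supply this mapping, and nothing here is taken from the PR itself.

```python
# Hypothetical mapping from build targets to the .o files they contribute.
# In practice this would come from the build system, not be hard-coded.
TARGET_OBJECTS = {
    "lldbCore": ["/build/lldb/Core/Module.o", "/build/lldb/Core/Debugger.o"],
    "LLVMSupport": ["/build/llvm/Support/raw_ostream.o"],
    "clangAST": ["/build/clang/AST/Decl.o"],
}


def objects_for_focus(target_objects, focused_targets):
    """Return the sorted, de-duplicated .o paths for the focused targets."""
    selected = set()
    for target in focused_targets:
        selected.update(target_objects[target])
    return sorted(selected)


# Focus on two targets; clangAST's object files are left out.
focus = objects_for_focus(TARGET_OBJECTS, ["lldbCore", "LLVMSupport"])
for path in focus:
    print(path)
```

Changing focus then only means calling the helper with a different list of targets; the cached .o files themselves are untouched.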

One idea was to change the YAML format so that it does not require all of the N_FUN, N_GSYM and N_STSYM entries, and to fall back to the debug map if none are specified. But that seemed to defeat the reason for the YAML to exist, which is that you can modify individual entries. So we came up with the idea that says "just process these .o files, but use the actual debug map's N_FUN, N_GSYM and N_STSYM entries". Then we just need to produce a file with a single .o file on each line and add the new option to dsymutil. The benefits are:
- We don't have to produce a huge YAML file and then trim it down, which is costly both to produce and to reduce.
- It maps well to our approach where we produce full debug info in a cached build system for each .o file, but can quickly change focus by creating a new .o file list and running dsymutil again to include or exclude debug information. Previous focus changes required us to either modify the binary by scribbling out .o file paths, or produce a full-sized YAML file again, reduce it appropriately and then run dsymutil. Just making a simple list of .o files is the fastest approach we came up with after some testing. This allows us to quickly change our debug info focus and produce a new dSYM file that contains only what we are focused on.
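To make the filtering idea concrete, here is a small Python model of it; it is not the dsymutil implementation. The list-of-dicts debug map, the `parse_allowlist` and `filter_debug_map` helpers, exact string comparison of paths, and skipping of blank lines are all assumptions made for illustration.

```python
def parse_allowlist(text):
    """Parse an allowlist with one .o path per line.

    Skipping blank lines and trimming whitespace are assumptions of this sketch.
    """
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_debug_map(oso_entries, allowlist):
    """Keep only entries whose N_OSO path is in the allowlist.

    Symbols of the kept entries are passed through unchanged, so the
    allowlist never needs to mention N_FUN, N_GSYM or N_STSYM entries.
    Exact string matching is an assumption of this sketch.
    """
    return [entry for entry in oso_entries if entry["path"] in allowlist]


# Hypothetical in-memory stand-in for a debug map: one entry per N_OSO.
debug_map = [
    {"path": "/build/a.o", "symbols": ["_main", "_helper"]},
    {"path": "/build/b.o", "symbols": ["_unused"]},
    {"path": "/build/c.o", "symbols": ["_global_counter"]},
]

allowlist = parse_allowlist("/build/a.o\n/build/c.o\n")
kept = filter_debug_map(debug_map, allowlist)
for entry in kept:
    print(entry["path"], entry["symbols"])
```

Because the paths in the list are compared as written, they need to match the N_OSO paths recorded in the binary; as noted above, `--oso-prepend-path` does not apply to them.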




https://github.com/llvm/llvm-project/pull/182083

