[Lldb-commits] [lldb] [lldb] Add more ways to find the .dwp file. (PR #81067)

Wed Feb 14 12:18:34 PST 2024

dwblaikie wrote:

> > > If the client strips the debug info first into "a.out.debug" and then runs llvm-dwp, they will end up with a "a.out.debug.dwp". We have clients that are doing this already and we want to support them.
> > 
> > 
> > OK, could we fix llvm-dwp to match the behavior, then? If the file has a .debug extension, strip that and add the .dwp extension.
> 
> Here people are not using ".debug", but are using ".debuginfo"... 

They could only use that for symbolizing, yeah? They wouldn't be able to debug their binary, because a debugger, given the stripped binary, wouldn't know that it needs to append `.debuginfo` to find the debug info, right? (I think debuggers do currently have the ability to add ".debug" to find the debug info, or maybe they don't? Reading https://sourceware.org/gdb/current/onlinedocs/gdb.html/Separate-Debug-Files.html, I guess it's never a simple mapping from `binary` to `binary.debug`: either it uses debuglink, in which case the filename is encoded in the debuglink and could be anything, or it's buildID, in which case the file is named from the build ID itself (`nn/nnnnnnnn.debug` under a `.build-id` directory) rather than from the binary.)
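To make those two lookup methods concrete, here is a rough sketch of the candidate paths as I read them from the GDB manual page linked above. The function name and parameters are made up for illustration, and the default global debug directory of `/usr/lib/debug` is an assumption taken from the manual's example; this is not GDB's or LLDB's actual code.

```python
import posixpath


def gdb_debug_file_candidates(exe_path, debuglink=None, build_id=None,
                              global_debug_dir="/usr/lib/debug"):
    """Return candidate separate-debug-file paths, build ID method first.

    Hypothetical helper: it mirrors the search order described in the GDB
    manual, but it is only an illustration, not code from any debugger.
    """
    candidates = []
    if build_id:
        # Build ID method: <global>/.build-id/nn/nnnnnnnn.debug, where nn is
        # the first two hex characters of the build ID.
        candidates.append(posixpath.join(
            global_debug_dir, ".build-id", build_id[:2], build_id[2:] + ".debug"))
    if debuglink:
        exe_dir = posixpath.dirname(exe_path)
        # Debug link method: the file name comes from the .gnu_debuglink
        # section, so it can be anything, not just "<binary>.debug".
        candidates.append(posixpath.join(exe_dir, debuglink))
        candidates.append(posixpath.join(exe_dir, ".debug", debuglink))
        candidates.append(posixpath.join(
            global_debug_dir, exe_dir.lstrip("/"), debuglink))
    return candidates
```

Note that neither method ever derives the debug file name by appending a suffix to the stripped binary's own name.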

> Again, nothing is enforced and people are left to use llvm-objcopy + llvm-dwp however they want. Getting a solution that does everything might be nice. Any thoughts on modifying llvm-dwp to be able to do all of this and provide some path for people where it can either just create a .dwp file for a given executable _or_ it can create a `.debug` file, strip the original file, and create a `.dwp` file?

I don't know that llvm-dwp needs to do everything, that's a bit against the grain of *nix tool design. But certainly llvm-objcopy could be more ergonomic, like a one-shot that both strips the debug info from the binary and produces the keep-only-debug file in `binary.debug`. An even smaller feature would be: just as `llvm-dwp` has a default `-o` of `binary.dwp`, `llvm-objcopy` could have a default output file of `binary.debug` when using `--only-keep-debug`, and `llvm-dwp` could grow a special case for "if the file ends in `.debug`, strip that before adding `.dwp`" (maybe more generally, strip any `.word` file extension before adding `.dwp`, though that might be overly aggressive on *nix systems where users don't expect `.suffix` to be special cased at all).

So I guess that'd be the smallest two features I'd suggest starting with:
1) llvm-objcopy (& ideally, if someone wanted to write the patch, binutils objcopy) defaulting to `binary.debug` as the output file when using `--only-keep-debug`
2) llvm-dwp stripping the `.debug` suffix, if it appears, before adding `.dwp` (this one's a bit tricky and I could imagine some disagreement, because someone might've just named their real binary "blah.debug", but I'm not sure there's much else to do in that case: if we're going to support doing things with debug info with the `--only-keep-debug` file, with the original unstripped binary, with the stripped binary plus the `--only-keep-debug` file, etc., we have to have some basename to derive everything from)
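As a sketch of what item 2 would mean for the default output name (written in Python just to pin down the rule; `default_dwp_name` and its `strip_suffixes` parameter are hypothetical and are not anything llvm-dwp has today):

```python
import posixpath


def default_dwp_name(input_path, strip_suffixes=(".debug",)):
    """Derive a default .dwp output name from the input file name.

    Hypothetical rule: drop one known debug-file suffix, if present,
    and then append ".dwp".
    """
    head, base = posixpath.split(input_path)
    for suffix in strip_suffixes:
        # Do not strip when the whole file name is just the suffix.
        if base.endswith(suffix) and base != suffix:
            base = base[:-len(suffix)]
            break
    return posixpath.join(head, base + ".dwp")
```

With the default argument, `a.out` and `a.out.debug` both map to `a.out.dwp`, while `a.out.debuginfo` still maps to `a.out.debuginfo.dwp` unless `.debuginfo` is added to `strip_suffixes`. It also shows the ambiguity mentioned above: a real binary named `blah.debug` would get `blah.dwp`.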

A wrapper script or program (maybe it could be built into llvm-objcopy, but I'm a little hesitant there, though I would be open to other folks' opinions on it for sure) that ingests the binary and produces the 3 products (stripped binary, binary.debug, binary.dwp) seems easy enough to provide.
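Something like the following is roughly what I have in mind for such a wrapper. It is a minimal sketch with a few assumptions: the tools are on `PATH`, the stripped output goes to a made-up `binary.stripped` name rather than overwriting the input, and a debuglink is added (which is my choice here, not something anyone has agreed on). The function names are hypothetical.

```python
import subprocess


def split_debug_commands(binary):
    """Build the command lines that produce the three products.

    Hypothetical wrapper: the output names (<binary>.debug, <binary>.dwp,
    <binary>.stripped) are assumptions chosen so that everything derives
    from one basename.
    """
    debug_file = binary + ".debug"
    dwp_file = binary + ".dwp"
    stripped = binary + ".stripped"
    return [
        # 1. Keep only the debug info in <binary>.debug.
        ["llvm-objcopy", "--only-keep-debug", binary, debug_file],
        # 2. Package the .dwo files referenced by the unstripped binary.
        ["llvm-dwp", "-e", binary, "-o", dwp_file],
        # 3. Strip the debug info and record a debuglink to <binary>.debug.
        #    The .debug file has to exist already, hence step 1 comes first.
        ["llvm-objcopy", "--strip-debug",
         "--add-gnu-debuglink=" + debug_file, binary, stripped],
    ]


def run_all(binary):
    """Run each command in order, stopping at the first failure."""
    for cmd in split_debug_commands(binary):
        subprocess.run(cmd, check=True)
```

Calling `run_all("a.out")` would leave `a.out.debug`, `a.out.dwp`, and `a.out.stripped` next to the original, which is untouched.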

> > > The compiler and linker drivers are staying out of this and we expect people to do this on their own, so this is what we end up with when there is no enforcement.
> > 
> > 
> > They aren't doing it on their own though - they're using llvm-dwp and its defaults (they're passing it a .debug file and getting a .debug.dwp file - it's the defaults you/we are worried about, and how to make other tools work well with those defaults). We can change those defaults if they don't work well/don't create a consistent environment.
> > > I am not sure why this is such a sticking point. Let's make the debugger work for people.
> > 
> > 
> > As I explained above - my concern is that supporting a wider variety of ways these files can be named/arranged means more variants that need to be supported across a variety of tooling (symbolizers and debuggers - not just LLVM's but binutils, etc too).
> > But that's my 2c - if LLDB owners prefer this direction, so be it. Wouldn't mind hearing some other people's perspectives on the issues around limiting variation here.
> 
> I am happy to hear any other opinions as well. I tend to want to make my life easier and ease the support burden I run into every day, where people who know nothing about split DWARF are trying to use it, failing, and requiring tech support to make it work for them. I am happy to suggest a path to follow; in fact I am going to write up the best practices on a DWARF group here at work that I can point people to.

Yeah, I do want people to not have problems here/for things to work - I worry that making too many different things work is good short term (yay, many users are happy without having to change anything), but harmful long term when there's a wider variety of ways people do things and then new tools have to learn all those ways (& probably don't learn them all in one go - so we end up with different tools implementing subsets of all possible lookup rules, which is more confusing/problematic for users).

https://github.com/llvm/llvm-project/pull/81067