[Lldb-commits] [PATCH] D129307: Add a new breakpoint partial match settings

Mon Jul 11 18:40:08 PDT 2022

jingham added a comment.

In D129307#3643426 <https://reviews.llvm.org/D129307#3643426>, @yinghuitan wrote:

> @jingham, thanks for sharing the thoughts.

First off, I am supportive of your project, this is a pain point for some folks for sure.  I am in favor of trying to automate "I built my binaries with sources in one location but locally they are in another" scenario.

Having breakpoint misses be the point where this gets checked is not a great final design, since there are other places where lldb needs to find source files.  But if it does the 90% job, it's okay to start there.

OTOH, framing the feature as "you pass me a path, and I'm only going to check the last N path components regardless of how many you entered" is just weird.  I'd rather we not expose that.  That choice, after all, is trivial in the IDE, just send the parts of the path you care about.  I think that one is fine to leave on the IDE side if you want it explicitly, since it would be trivial for an IDE to implement.

I also don't like the original framing as it mixes the problem you are actually trying to solve: "match the full paths from IDE project info with those in debug info" with the general problem of breakpoint setting.  For instance, while I might very well want you to try to match up full paths from a project this way, if I type:

  (lldb) break set -f foo/bar.c -l 10

I never want you to set a breakpoint on `baz/bar.c`.  That doesn't make any sense.  So I don't think we should make that possible.

What the user cares about is that you are going to auto-deduce out-of-place source file mappings for them.  That's the piece that really matters; having users have to turn this on by specifying the parameters for finding inexact matches seems backwards.  So the patch would make a lot more sense to me if you were adding `target.auto-deduce-source-maps`, and doing your work behind that setting however you need to.

For that feature, it would be clear that this is only to be done when the input path is a full path, since you can't set up a path mapping if the "real file location" isn't fully specified.  That will limit the scope for accidental matchings and keep it out of all the CLI entered breakpoint setting (except for people that enter full paths, presumably to trigger this auto-deduction since I've actually never seen somebody do that IRL...)  And you can use whatever algorithm to search for potential matching patterns makes sense for the job you are actually trying to do see if there's a rigid translation between a source tree the IDE knows about, and one from Debug Information.

>> Setting the breakpoint search to only check base names will of course make all your file and line breakpoints work, but they will also pick up extra hits. In command line lldb that doesn't look nearly as weird as clicking on one source file window & seeing a breakpoint marker show up in a completely different window. If we didn't care about that problem and expected people to manage these extra locations by hand, then indeed just saying "only set using base-names" is fine. But it still seems weird to me that we would add an lldb setting and code to support that rather than have IDE's just only pass in the base name if that's their strategy.
>
> One thought to mitigate this is, for all the symbol context(s) matched with base file name, we could iterate and find if there is any exact match with request breakpoint path. We would prefer this exact match symbol context and throw away other partial matches. However, if no exact match exists, we have to keep all of them to not miss breakpoints because there is no way to know which one users want. This should make 90% (my guess) breakpoint cases (with exact matching) the same behavior as before.
>
>> I really don't want changing a setting to add or remove locations from an extant breakpoint. That's not how breakpoints work. If I go to the trouble of inputting a command on a location, and the I change a setting and the location goes away, I would rightly be ticked off. So I really think the locations should record whatever search they are going to do when they are created, and stick to it. However, I have no problem with having the breakpoint store "original path/matched path" pairs if that makes it easier to figure out what is going on.
>
> I do not not have much preference on this one. In 99% of the use cases, client/IDE/users would set this setting during startup without changing during the lifetime of debug sessions.
>
>> The behind your back creation of source maps gives me the willies. Particularly if you have more than one module with debug information, all built originally in the phony locations so they all look like the original directories have the same root, it seems like you could start getting source maps that map everything to everything and after a while this will get hard to reason about. Maybe I'm being pessimistic, however, and if you are excited to try, more power to you.
>
> Ideally, if compiler/linker can emit checksum for each source file into debug info we can verify each matched source file to filter noise. I know Microsoft toolchain does so but seems like llvm does not?
>
>> But I don't think you can drive this from breakpoints alone. Suppose I set no breakpoints, and the program crashes at /OriginalDirectory/SourceRoot/MyProject/subdir/foo.c, lldb is still going to have to figure out what to show the user, so you're still going to have to come up with the real source file. And you can't use your breakpoint tricks to help you because it's lldb providing the information and at the point where you are doing this it only knows the original location. The IDE is the only agent that knows where to look for other possible locations.
>
> I agree that breakpoint auto source mapping only helps if users/IDE set file line breakpoint. And I have spent some thoughts on the non-breakpoint cases. Ideally, we want to have a `target.source-paths` setting which users/IDE/CLI can tell lldb where to look for source files. IDE can even pop-up a dialog ask user to browse the target source file for selected frame without valid source files. I know both windbg and visual studio debugger are providing this option. I would like to improve this part but currently unverified breakpoint due to incorrect/missing source map settings are #1 pain points from our users in Meta.
>
>> It seems like it would be cleaner to have some kind of callback when a binary with debug information gets loaded where the UI could have a look at the CU's in the module that got loaded, and see if it knows about any of those files locally (because it has them in in directory-equivalent places in its project). The UI can then construct a source map from there. That way this would happen predictably and for all consumers, rather than relying on the user's path through breakpoint setting to set the source mapping world back to rights.
>
> We can look into this to further improve the auto source mapping feature. My concern is that it requires relative complex interaction from IDE/client which increases the barrier for wide adoption.
>
> From high level, I agree that only users/IDE know the true source physical locations. Actually, the design of "breakpoint guided auto source map" is using this truth information from IDE - breakpoint file path to guide auto source mapping. I agree that it is not completely fixing all source mapping situations like the non-breakpoint cases as you said. It seems natural to improve incrementally. Based on our user study, breakpoint guided auto source mapping is one of the most important first step.
>
> Regarding implementing in IDE vs lldb, we could implement the logic in lldb-vscode but:
>
> - That would not benefit other IDE(s).
> - Many IDE(s) are hard to customize. They are providing a general debugging protocol shared by many language engines.
> - It would not help command line cases (I bet some command line lldb users would try to use full path to set breakpoint. I wish there is telemetry for it).
>
> My goal with this and follow-up patches is trying to make lldb breakpoint/source-mapping working by default (command line or IDE) without caring about source map settings as much as possible. That would make lldb easier to use/adopt. Happy to work out a solution to balance all the concerns.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129307/new/

https://reviews.llvm.org/D129307