[Lldb-commits] [lldb] [lldb][Formatters] Do not recursively dereference pointer type when creating formatter candicates list. (PR #124048)

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Fri Jan 24 02:44:58 PST 2025


labath wrote:

> @labath What do you think? The solution above will still use the formatters for `T` when `T**` is printed.

Before I answer that, I want to go back to the question of dereferencing. While I don't think that skipping an arbitrary number of pointers is *wrong* (it's consistent and easy to explain), I also don't find it *useful*. One level -- sure. Even though I don't think it should be the default, I can definitely see reasons why one would want to see the contents of `T` when one has a variable of type `T*`. Reason: many codebases -- lldb included -- use `T*` for function "return" types or for optional arguments, so if you have a `T*`, it's quite likely you're interested in seeing the `T` object rather than the pointer itself.

I don't think this argument holds for more than one level of pointers. If I have a `T**` or `T***` (I've seen those in the wild), what are the chances I want to see the `T` object at the end of the chain? I think they're very low. I think it's more likely that I'm looking at either:
- some generic code which works with `U*`, where `U` happens to be a `T*`. In this case I'm more likely to want to see the pointer values than the final object.
- or some C-like representation of an array (argc, argv), in which case the formatter will just tell me something about the first element of the array (but maybe I want to see some other element, or information about the array itself)

Even if the user really does want to see the final object, they can still do that even if we don't dereference all the levels for them. It just means they need to do the dereferencing themselves.
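To make the two policies concrete, here is a toy model in Python. It is not lldb's implementation: the function name `formatter_candidates` and the `max_pointer_depth` parameter are made up for illustration, and types are treated as plain strings.

```python
def formatter_candidates(type_name, max_pointer_depth=None):
    """Toy model: list the type names whose formatters would be tried.

    max_pointer_depth=None strips every pointer level (the "arbitrary
    number of pointers" behaviour); 1 strips a single level only.
    Hypothetical helper, not an lldb API.
    """
    candidates = [type_name]
    current = type_name
    stripped = 0
    while current.endswith("*"):
        if max_pointer_depth is not None and stripped >= max_pointer_depth:
            break
        # Drop one trailing '*' and any space left in front of it.
        current = current[:-1].rstrip()
        stripped += 1
        candidates.append(current)
    return candidates


print(formatter_candidates("T **"))                       # ['T **', 'T *', 'T']
print(formatter_candidates("T **", max_pointer_depth=1))  # ['T **', 'T *']
print(formatter_candidates("T *", max_pointer_depth=1))   # ['T *', 'T']
```

In this model, a formatter registered for `T` is still found for a `T *` variable when only one level is stripped, but no longer for `T **`.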

As for how this fits in with the documented behavior, I don't see much of a problem with that. It's true that you could interpret `Don't use this format for pointers-to-type objects.` as skipping any level of pointers, but I think you could also interpret it as skipping just one. It says "pointers to **type**", not "arbitrary level of pointers" or anything like that. For comparison, the `cascade` option (which skips arbitrary levels of typedefs) says this (emphasis mine) "If true, cascade through typedef ***chains***.", which is a lot more explicit. It also makes more sense because typedefs are often chained and the language (the C family at least) treats typedefs (but not pointers) transparently.

In addition to that, the `skip-references` option uses the exact same language as for pointers, but it skips only one level of references (although that's because there are no reference chains at the language level, not because we chose to implement it that way). So, even though you can rightfully accuse me of retconning this, I think the result would be fine.

Another factor, which I think supports the view that skipping all levels isn't the right default, is that ~all of our pretty printers are currently broken for multiple pointer levels:
```
(std::map<int, int> *) pm = 0x00007fffffffd798 size=1
(std::optional<int> *) po =  Has Value=true 
(std::tuple<int, int> *) pt = 0x00007fffffffd780 size=2
(std::vector<int> *) pv = 0x00007fffffffd760 size=2
(std::map<int, int> **) ppm = 0x00007fffffffd758 size=0   # points to pm
(std::optional<int> **) ppo =  Has Value=false    # points to po
(std::tuple<int, int> **) ppt = 0x00007fffffffd748 size=0  # points to pt
(std::vector<int> **) ppv = 0x00007fffffffd740 size=1   # points to pv
```

If this was one incident, you could dismiss it as a buggy formatter, but since it affects all of them, I think this points to a deeper problem. We can fix all of our formatters to dereference all pointers, but that won't change all the pretty printers out there, and I'm sure that most of them have the same bug. We could try to fix all of them by dereferencing the value before handing it off to the formatter, but I don't think that's useful. The fact that we didn't notice this until now tells me users don't have that many double pointers floating around, and so they likely will not complain if we stop printing this output (which was incorrect anyway).
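For reference, "fixing a formatter to dereference all pointers" could look roughly like the sketch below. `GetType()`, `IsPointerType()` and `Dereference()` are the real `SBValue`/`SBType` method names, but `strip_pointers`, its `max_depth` cap and the `FakeValue` stand-in are made up so that the snippet runs outside lldb. It ignores null and invalid pointers.

```python
class FakeValue:
    """Stand-in for lldb.SBValue so the sketch runs outside lldb (hypothetical)."""

    def __init__(self, pointer_depth):
        self.pointer_depth = pointer_depth

    def GetType(self):
        # The fake doubles as its own type object.
        return self

    def IsPointerType(self):
        return self.pointer_depth > 0

    def Dereference(self):
        return FakeValue(self.pointer_depth - 1)


def strip_pointers(valobj, max_depth=8):
    """Follow pointer levels until a non-pointer value is reached.

    max_depth is an arbitrary safety cap chosen for this sketch.
    """
    for _ in range(max_depth):
        if not valobj.GetType().IsPointerType():
            break
        valobj = valobj.Dereference()
    return valobj


# A T** value (depth 2) ends up at the T object (depth 0).
print(strip_pointers(FakeValue(2)).pointer_depth)  # 0
```

Every third-party formatter would need something like this at its entry point, which is why I don't expect the ones out there to get fixed.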

TL;DR: I think it would be better to skip just one level of pointers. If someone really wants to format an arbitrary number of pointers, they can always use a regex to register a pretty printer for `T (\*)*`.
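As a quick sanity check of what that pattern covers, here it is run through Python's `re` module. This only approximates lldb's own regex matching (lldb does not use Python's engine, and its anchoring rules are not modelled here); `matches_pointer_chain` is a made-up helper, and `fullmatch` is used to require that the whole type name matches.

```python
import re

# The pattern from the text: "T", a space, then any number of '*'.
POINTER_CHAIN = re.compile(r"T (\*)*")


def matches_pointer_chain(type_name):
    """Return True if the whole type name matches the pattern (hypothetical helper)."""
    return POINTER_CHAIN.fullmatch(type_name) is not None


for name in ["T", "T *", "T **", "T ***", "U *"]:
    print(name, matches_pointer_chain(name))
# T False
# T * True
# T ** True
# T *** True
# U * False
```

Plain `T` does not match because the pattern requires the space, so the non-pointer case would still be handled by the ordinary formatter for `T`.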

https://github.com/llvm/llvm-project/pull/124048

