[lldb-dev] Help with C++ API

Wed Aug 28 14:48:58 PDT 2013

On Aug 28, 2013, at 2:31 PM, Jakob Leben <jakob.leben at gmail.com> wrote:

> On Wed, Aug 28, 2013 at 2:15 PM, Enrico Granata <egranata at apple.com> wrote:
>> I think such a visualization would be highly inconvenient for most users
>> most of the time. Much as you want to see a char* as a logical “string”
>> rather than as an array of chars, the same is true of std::string.
>> Also, if you know the layout of the string class, you can directly access
>> the data buffer and read the individual bytes out of memory, which is
>> another argument against going down the synthetic children route: the
>> added value compared to the actual type layout is quite low.
>> This is why LLDB vends a summary instead of synthetic children.
> 
> I see your point.
> 
> I am approaching the issue from another perspective though: I am
> building an application with the concept of "programming-model-aware
> debugging", specifically the data-flow programming model. I found the LLDB
> C++ API valuable, because it allows me to easily build an application
> that attaches to another process which performs processing of a
> data-flow graph (think multimedia processing) and displays the
> processing behavior in a graphical way. So I consider the LLDB C++ API
> as a convenient utility and abstraction on top of
> operating-system-provided debugging facilities, but for a purpose other
> than implementing a classical command-line debugger. Instead, it is
> used as a utility to examine another program's state and data in a
> convenient way.
> 
> Therefore, in my use case I need to access the *actual* data to
> display it to the end-user in any arbitrary way that the LLDB API should
> never need to assume in itself. Providing a summary/value of a data
> structure's contents as a string (returned by GetSummary() or
> GetValue()) is an example of such an assumption.

This is what data formatters are for.
There is an underlying ground truth, which is what the DWARF type information vends.
On top of that, the data formatters enable you to vend a different reality of your own making. Whether that reality is that a vector is a container of items, or a string is a container of bytes, that is up to you.
The debugger vends two things in this area:
a) the data formatters subsystem (SBType*.h at the API level and the DataFormatters/ folder in the internals)
b) data formatters for types of interest

The second item is obviously tailored to common expectations and driven by what a user debugging an app expects to see.
These formatters are pretty much constrained to live within this space of convenience and usability, or people would complain fairly loudly about it.

This is where the first item comes into the picture. We cannot (or do not want to) vend every possible data view through the builtin formatters, but we vend a formatters model through which you can vend whatever formatting suits your needs.
Yes, unfortunately that means that you need to know a little more about your types, but somebody needs to “bake the knowledge” into LLDB [1], whether it’s me writing a builtin formatter or you writing your own.

[1] DWARF is still the source of ground truth; anything else you see is custom knowledge that someone implemented on top of the ground truth.
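
As a rough illustration of "writing your own", here is a minimal sketch of a Python synthetic children provider. It is not one of the builtin formatters. The `Buffer` type with a `data` pointer and a `len` field is purely hypothetical, chosen so the layout is trivial; the method names follow the synthetic provider interface that LLDB expects from such a class.

```python
class ByteBufferSyntheticProvider:
    """Expose the elements of a hypothetical `Buffer { T *data; size_t len; }`
    as synthetic children named [0], [1], ...

    `Buffer`, `data` and `len` are assumptions made for this sketch only.
    """

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.update()

    def update(self):
        # Re-read the fields, since the debuggee's state may have changed.
        self.data = self.valobj.GetChildMemberWithName("data")
        self.length = self.valobj.GetChildMemberWithName("len").GetValueAsUnsigned(0)
        self.elem_type = self.data.GetType().GetPointeeType()
        return False  # do not treat cached children as still valid

    def has_children(self):
        return True

    def num_children(self):
        return self.length

    def get_child_index(self, name):
        # Map a child name such as "[2]" back to its index.
        try:
            return int(name.strip("[]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if index < 0 or index >= self.length:
            return None
        offset = index * self.elem_type.GetByteSize()
        # Build a child value located `offset` bytes past the data pointer.
        return self.data.CreateChildAtOffset("[%d]" % index, offset, self.elem_type)
```

A class like this would typically be attached to a type from the LLDB command line with `type synthetic add`, or through the SBTypeSynthetic class at the API level. A real provider for std::string would need to encode the library-specific layout instead of the two simple fields assumed here.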

What you are trying to access is really not “raw data”. Raw data for a string looks like this:
(std::__1::string) X = {
  __r_ = {
    std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, std::__1::allocator<char> > = {
      __first_ = {
         = {
          __l = {
            __cap_ = 139639644119302
            __size_ = 4294967296
            __data_ = 0x00007fff5fbffb70
          }
          __s = {
             = {
              __size_ = '\x06'
              __lx = '\x06'
            }
            __data_ = {
              [0] = 'a'
              [1] = 'b'
              [2] = 'c'
              [3] = '\0'
              [4] = '\x7f'
              [5] = '\0'
              [6] = '\0'
              [7] = '\0'
              [8] = '\0'
              [9] = '\0'
              [10] = '\0'
              [11] = '\x01'
              [12] = '\0'
              [13] = '\0'
              [14] = '\0'
              [15] = 'p'
              [16] = '?'
              [17] = '?'
              [18] = '_'
              [19] = '?'
              [20] = '\x7f'
              [21] = '\0'
              [22] = '\0'
            }
          }
          __r = {
            __words = {
              [0] = 139639644119302
              [1] = 4294967296
              [2] = 140734799805296
            }
          }
        }
      }
    }
  }
}

Your desired view of the world is:
(std::__1::string) X = {
  [0] = 'a'
  [1] = 'b'
  [2] = 'c'
  [3] = '\0'
}

This is far from raw data. And to get anything other than raw data, you need a formatter.
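
To make the gap between the two views concrete, here is a small sketch that recovers the characters from the three `__words` values in the raw dump above. It assumes a little-endian target and the short-string layout that the dump suggests: byte 0 holds the size shifted left by one (0x06 for three characters), its low bit flags long mode, and the characters follow inline. That layout is library- and version-specific, which is exactly the kind of knowledge a formatter has to carry. The function name is made up for this example.

```python
import struct

# The three 64-bit words shown for __r.__words in the raw dump.
WORDS = (139639644119302, 4294967296, 140734799805296)


def decode_short_string(words):
    """Decode a short-mode string from its three raw 64-bit words.

    Assumption: little-endian target, byte 0 is (size << 1) with the low
    bit set only for long (heap-allocated) strings, characters start at
    byte 1.
    """
    raw = struct.pack("<3Q", *words)  # the 24 bytes as laid out in memory
    if raw[0] & 1:
        raise ValueError("long-mode string: characters are not stored inline")
    size = raw[0] >> 1
    return raw[1:1 + size]


print(decode_short_string(WORDS))  # b'abc'
```

None of this is visible in the DWARF type information itself; the decoding rule is custom knowledge layered on top of it.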

Enrico Granata
📩 egranata@.com
☎️ 27683
