[lldb-dev] Help with C++ API

Thu Aug 29 11:07:02 PDT 2013

On Aug 28, 2013, at 7:15 PM, Jakob Leben <jakob.leben at gmail.com> wrote:

> On Wed, Aug 28, 2013 at 5:08 PM, <jingham at apple.com> wrote:
>> 
>> So it would be narrowing to view the "synthetic children" as only a framework for viewing container types.
> 
> Perhaps what I've missed is that there can be several different
> synthetic child providers for any parent data type. Even in that case,
> it seems there is one such provider that's installed on an SBValue
> provided via C++ API by default. So then my proposal applies to these
> default providers.
> 

There can only be one synthetic provider per type that is active.
If you end up installing multiple, through regexp or whatnot, the order of categories determines which one wins.
Technically, a category can only ever contain one provider per type. With regular expressions it is fairly easy to violate this requirement. What happens then is undefined (i.e. it depends on the order in which they were added, or the order in which the iterators we use provide them to us for inspection, …). Long story short: don’t rely on any such tricks.

> Besides, it's not that SBValue for std::string provides synthetic
> children in a different way than I would like. The issue is that it
> doesn't provide synthetic children at all! And so far I simply haven't
> heard any good reason why it shouldn't by default provide characters
> as children.

It is a largely uninteresting view for most people. The majority of people using LLDB have never expressed the desire to twist their strings open and see an array of characters. Actually, there has been an opposite drive: even a char[] should be displayed as a string.
This is really the only reason why that was not implemented.

On Aug 28, 2013, at 7:56 PM, Jakob Leben <jakob.leben at gmail.com> wrote:

> Enrico, I'd be grateful if you could help me with this!

Assuming you work with libstdc++, your string data is located at <strVariable>._M_dataplus._M_p

You are going to have to implement a Python class, as described at: http://lldb.llvm.org/varformats.html

class SyntheticChildrenProvider:
    def __init__(self, valobj, internal_dict):
        """This call should initialize the Python object using valobj as the variable to provide synthetic children for."""
    def num_children(self):
        """This call should return the number of children that you want your object to have."""
    def get_child_index(self, name):
        """This call should return the index of the synthetic child whose name is given as argument."""
    def get_child_at_index(self, index):
        """This call should return a new LLDB SBValue object representing the child at the index given as argument."""
    def update(self):
        """This call should be used to update the internal state of this Python object whenever the state of the variables in LLDB changes.[1]"""
    def has_children(self):
        """This call should return True if this object might have children, and False if this object can be guaranteed not to have children.[2]"""

What you are probably going to do is:
- save the valobj as an ivar of the children provider in __init__
- in update, you should grab the value of _M_p (a pointer) and save it somewhere. That is going to be your real data source. Return None from update. Really. It’s much safer :)

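    def __init__(self, valobj, internal_dict):
        # keep the SBValue for the std::string so update() can re-read it
        self.value = valobj
    def update(self):
        # address held by _M_p: the real data source for the characters
        self.ptr_value = self.value.GetChildMemberWithName("_M_dataplus").GetChildMemberWithName("_M_p").GetValueAsUnsigned(0)
        return None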

To actually compute your number of children (len of string), you can do one of two things:
 - call strlen() as an expression; it is unsafe
 - read chunks of data until a \0 is encountered, or you realize you are reading way too much and you should bail; this latter case can be fairly common with uninitialized data, and anything that deals with potentially bogus data needs to be hardened against such events, or suffer great pain down the road
What you technically want is a strlen-with-bounds that will fail if the length is waaaaay beyond reasonable. We can work out the details here.
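
A minimal sketch of such a strlen-with-bounds, under stated assumptions. The names bounded_strlen and make_lldb_reader, the 4096-byte limit and the 64-byte chunk size are all made up for illustration. The scanning function takes a plain "read_memory(addr, size)" callable, so the logic can be tried outside the debugger. Inside LLDB, that callable could wrap SBProcess.ReadMemory, as the second function suggests.

```python
def bounded_strlen(read_memory, address, max_len=4096, chunk_size=64):
    """Return the number of bytes before the first NUL, or None on failure.

    read_memory(addr, size) is a hypothetical callable returning up to
    `size` bytes read at `addr`, or None if the read failed.
    """
    length = 0
    while length < max_len:
        size = min(chunk_size, max_len - length)
        data = read_memory(address + length, size)
        if not data:
            return None  # unreadable memory: treat the string as bogus
        nul = data.find(b"\x00")
        if nul != -1:
            return length + nul
        length += len(data)
    return None  # no terminator within max_len: bail out


def make_lldb_reader(process):
    """Wrap an lldb.SBProcess as a read_memory callable (assumed usage)."""
    import lldb  # only importable inside LLDB's Python environment

    def read_memory(addr, size):
        error = lldb.SBError()
        data = process.ReadMemory(addr, size, error)
        return data if error.Success() else None

    return read_memory
```

num_children could then return the result of bounded_strlen, or 0 when it comes back as None. One caveat: a chunked read can fail when the chunk runs past mapped memory even though the string ends earlier, so a smaller chunk_size trades speed for robustness.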

As for get_child_index, I am assuming you will want to call your children [0], [1], …
It is a simple matter of rejecting any name other than those formed as [number], and, for those that are well formed, returning the number token.
Internally, LLDB likes to deal with children by index. But for us humans, it is highly convenient to name children, since we are better at memorizing names than indexes. This call allows LLDB to answer the question “when I am asked for child foo, where do I find it?”. Natural ordering of the DWARF debug information would answer this in the “raw data” case.
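
The name-to-index mapping described above could look roughly like this. The helper name child_index_from_name is invented for the example, and returning -1 for a rejected name is an assumption about what LLDB expects rather than something stated in this mail.

```python
import re

# Accept only names of the exact form "[<decimal digits>]".
_INDEX_NAME = re.compile(r"\[([0-9]+)\]\Z")


def child_index_from_name(name):
    """Return the index encoded in a name like "[3]", or -1 if malformed."""
    match = _INDEX_NAME.match(name)
    if match is None:
        return -1  # assumed convention for "no such child"
    return int(match.group(1))
```

get_child_index in the provider would then simply return child_index_from_name(name).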

Now, the real deal. In get_child_at_index you are going to extract the individual byte.
You can fairly simply go to the process, read a byte at ptr_value+index, and use an expression to make that into a char.
There are more interesting/efficient ways to get the same result, which would involve retrieving the pointee type for _M_p and using that to build an SBValue. Feel free to ask if you want to delve deeper.
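
As a rough sketch of the pointee-type route: make_char_child is a hypothetical helper, and m_p stands for the SBValue obtained for _M_dataplus._M_p. It relies on SBValue.GetType, SBType.GetPointeeType, SBType.GetByteSize and SBValue.CreateValueFromAddress. The length argument is assumed to come from whatever bounded length computation you settle on.

```python
def make_char_child(m_p, index, length):
    """Build a child value for the character at `index` behind pointer m_p.

    m_p    -- SBValue for _M_dataplus._M_p (assumed to be a valid pointer)
    length -- number of children, used to reject out-of-range indexes
    """
    if index < 0 or index >= length:
        return None
    char_type = m_p.GetType().GetPointeeType()  # char for std::string
    base = m_p.GetValueAsUnsigned(0)
    address = base + index * char_type.GetByteSize()
    # Name the child "[index]" so it matches what get_child_index parses.
    return m_p.CreateValueFromAddress("[%d]" % index, address, char_type)
```

With this approach the provider would keep the _M_p SBValue itself around in update, not only its numeric value, and get_child_at_index would delegate to the helper.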

has_children is fairly easy: return True and be done with it.

Hope these pointers help you get started!

Enrico Granata
📩 egranata@.com
☎️ 27683
