[lldb-dev] ValueObjectChild and SetData
Greg Clayton via lldb-dev
lldb-dev at lists.llvm.org
Thu Aug 6 15:58:27 PDT 2020
Thanks for the info. Comments inlined below!
> On Aug 6, 2020, at 1:56 PM, Gabor Greif <ggreif at gmail.com> wrote:
>
> On 8/6/20, Greg Clayton <clayborg at gmail.com> wrote:
>>
>>
>>> On Aug 5, 2020, at 1:50 PM, Gabor Greif via lldb-dev
>>> <lldb-dev at lists.llvm.org> wrote:
>>>
>>> Hi LLDB devs,
>>>
>>> short question. Since the method
>>>
>>> bool ValueObjectChild::SetData(DataExtractor &data, Status &error)
>>>
>>> doesn't exist, what is the preferred way to update the contents of
>>> scalar bitfields?
>>>
>>> Is there any code in the repo demonstrating the technique?
>>>
>>> I am interested, because for the language I am writing a plugin for
>>> certain datatypes are MSBit-aligned, e.g. a Nat16 occupies the upper
>>> portion of the bits in a (32-bit) word.
>>>
>>> Viewing and setting of such variables thus involves shifting bits, and
>>> I'd expect that ValueObjectChild (in bitfield mode) would do that for
>>> me.
>>>
>>> Thanks in advance for any clues,
>>>
>>> cheers,
>>>
>>> Gabor
>>
>> What is the debug information format being used for these? If it is DWARF,
>> the location expression for the variable should take care of extracting the
>> value correctly.
>
> Hi Greg,
>
>
> thanks for the very elaborate answer! Please find my replies inline, below.
>
> In my case `lldb` is reading DWARF. Things are complicated a bit by
> the fact that I am targeting a Wasm platform (WASI), and thus the
> formal arguments are located in locals (but this is comparable to
> registers on common architectures).
>
> Are you suggesting that the location expression should massage the
> formal parameter?
>
> Currently I emit
> ```
> 0x000000db: DW_TAG_formal_parameter
> DW_AT_name ("n")
> DW_AT_decl_line (5)
> DW_AT_decl_column (0x09)
> DW_AT_type (0x0000002b "Nat16")
> DW_AT_location (DW_OP_WASM_location 0x0 +1, DW_OP_stack_value)
> ```
DW_OP_stack_value implies that after running this expression the value of this variable exists on the DWARF stack. This should mean that the "Value" would have a ValueType of eValueTypeScalar. I am guessing that when you view these variables you always get the entire integer value, including all of the bits that share this integer. Is that correct?
> and `Nat16` is defined as:
> ```
> 0x0000002b: DW_TAG_base_type
> DW_AT_name ("Nat16")
> DW_AT_bit_size (0x20)
> DW_AT_data_bit_offset (0x10)
> DW_AT_encoding (DW_ATE_unsigned)
> ```
Interesting, from reading the DWARF specification, it is legal for a DW_TAG_base_type to have a DW_AT_bit_size and DW_AT_data_bit_offset, but LLDB is currently not handling this situation. We handle these for bitfields, which are currently attached to DW_TAG_member of a struct.
So we have two options to fix these kinds of variables:
- fix LLDB to handle the DW_AT_data_bit_offset and DW_AT_bit_size on DW_TAG_base_type types (requires LLDB fix)
- fix the DW_AT_location expression to shift and mask the integer with extra DW_OP opcodes (no fix required in LLDB)
The expression could be modified to apply the data bit offset and the bit size mask:
DW_AT_location (DW_OP_WASM_location 0x0 +1, DW_OP_stack_value, DW_OP_const1u(0x10), DW_OP_shr, DW_OP_const4u(0xffffffff), DW_OP_and)
To break this down:
This gets the integer value and places it on the stack:
DW_OP_WASM_location 0x0 +1, DW_OP_stack_value
stack[0] = full_value
This pushes the data bit offset onto the stack:
DW_OP_const1u(0x10)
stack[0] = full_value
stack[1] = 0x10
This shifts the full_value to the right by 0x10:
DW_OP_shr
stack[0] = full_value >> 0x10
Now we need to make up a mask to mask off the first DW_AT_bit_size bits:
DW_OP_const4u(0xffffffff)
stack[0] = full_value >> 0x10
stack[1] = (1 << 0x20) - 1 (which makes the mask of 0xffffffff)
Now we mask off the high bits using the mask we just created:
DW_OP_and
stack[0] = (full_value >> 0x10) & 0xffffffff
Now the value of the variable is correct.
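For illustration only, the same shift-and-mask steps can be written as ordinary integer arithmetic. This is a minimal sketch, not code from LLDB: the helper names MakeMask, ExtractBits and InsertBits are made up for this example, and InsertBits is an assumption about what the inverse operation (writing a new value back into the containing word) would look like.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of LLDB. They mirror the DWARF stack
// operations from the breakdown above in plain integer arithmetic.

// Builds a mask with the low bit_size bits set. The mask is computed in
// 64 bits so that bit_size == 0x20 does not overflow; bit_size >= 64 is
// special-cased because shifting a 64-bit value by 64 is undefined.
constexpr uint64_t MakeMask(unsigned bit_size) {
  return bit_size >= 64 ? ~uint64_t{0} : (uint64_t{1} << bit_size) - 1;
}

// Equivalent of: DW_OP_const1u(bit_offset), DW_OP_shr,
//                DW_OP_const4u(mask), DW_OP_and
// Assumes bit_offset < 64.
constexpr uint64_t ExtractBits(uint64_t full_value, unsigned bit_offset,
                               unsigned bit_size) {
  return (full_value >> bit_offset) & MakeMask(bit_size);
}

// The inverse direction (an assumption for this sketch): clear the field
// inside full_value, then place the new field bits at bit_offset.
constexpr uint64_t InsertBits(uint64_t full_value, uint64_t field,
                              unsigned bit_offset, unsigned bit_size) {
  return (full_value & ~(MakeMask(bit_size) << bit_offset)) |
         ((field & MakeMask(bit_size)) << bit_offset);
}

// A 32-bit word whose upper 16 bits hold 0x1234, using the offset (0x10)
// and size (0x20) from the DWARF shown earlier.
static_assert(ExtractBits(0x12340000u, 0x10, 0x20) == 0x1234u,
              "shift right by 0x10, then mask with 0xffffffff");
```

With these helpers, writing 0xBEEF into the upper half of the word 0x1234ABCD would be InsertBits(0x1234ABCD, 0xBEEF, 16, 16), which keeps the low 16 bits and replaces the rest.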
>>
>> A bit more info: Each ValueObject has a "Value m_value;" member variable
>> that contains the value of the variable. This "Value" object has many member
>> variables:
>>
>> class Value {
>> Scalar m_value;
>> Vector m_vector;
>> CompilerType m_compiler_type;
>> void *m_context;
>> ValueType m_value_type;
>> ContextType m_context_type;
>> DataBufferHeap m_data_buffer;
>> };
>>
>> The "m_value_type" helps us to know how to interpret the value itself which
>> is either contained in "Scalar m_value;" or "DataBufferHeap m_data_buffer;".
>> ValueType is one of:
>>
>> enum ValueType {
>> // m_value contains...
>> // ============================
>> eValueTypeScalar, // raw scalar value
>> eValueTypeVector, // byte array of m_vector.length with endianness of
>> // m_vector.byte_order
>> eValueTypeFileAddress, // file address value
>> eValueTypeLoadAddress, // load address value
>> eValueTypeHostAddress // host address value (for memory in the process
>> // that is using liblldb)
>> };
>>
>> eValueTypeScalar means that the value itself is actually in "Scalar
>> m_value;". This is the typical way that built in types (ints, floats, chars,
>> etc) get resolved. For bitfields, this value should already be shifted
>> around as necessary when the location information from the debug info was
>> parsed and used to create the value at a specific location in the code.
>
> I am not sure I can observe this. But I'll go hunting.
>
>>
>> eValueTypeFileAddress means that "m_value" contains a "file address" which
>> points to the location of the variable in memory. This value will need to be
>> converted to a "load address" when we extract the value and then we will
>> read the value from process memory each time we get the location.
>>
>> eValueTypeLoadAddress means that "m_value" contains a "load address" which
>> points to the memory address in the process where we will read the value
>> from.
>
> While stepping around I have seen `eValueTypeLoadAddress`.
>
>>
>> eValueTypeHostAddress means that "m_value" contains an address in the LLDB
>> process itself. This is typically used for variables that are constructed
>> with complex location expressions that might say "2 bytes of my value are at
>> XXX, 4 bytes of my value are in this register and 2 bytes are constant". So
>> when we evaluate the location expression, it will hand us a buffer that
>> contains the variable value.
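As a rough summary of the value types described above, the distinction comes down to where the bytes of the value have to be fetched from. The sketch below is a simplified model under that assumption, not LLDB's actual implementation: the enum is a stand-in copy of the one quoted above, and ValueSource and SourceOf are hypothetical names invented for this example.

```cpp
// Simplified stand-in for the enum quoted above; not LLDB's real header.
enum ValueType {
  eValueTypeScalar,
  eValueTypeVector,
  eValueTypeFileAddress,
  eValueTypeLoadAddress,
  eValueTypeHostAddress
};

// Hypothetical classification of where the value bytes live.
enum class ValueSource {
  Inline,        // bytes are held in the Value object itself
  ProcessMemory, // bytes must be read from the debugged process
  HostMemory     // bytes are in a buffer inside the debugger process
};

// Hypothetical helper that maps each value type to the place the
// description above says its bytes come from.
inline ValueSource SourceOf(ValueType type) {
  switch (type) {
  case eValueTypeScalar:
  case eValueTypeVector:
    return ValueSource::Inline;
  case eValueTypeFileAddress: // converted to a load address first
  case eValueTypeLoadAddress:
    return ValueSource::ProcessMemory;
  case eValueTypeHostAddress:
    return ValueSource::HostMemory;
  }
  return ValueSource::Inline; // unreachable for valid enum values
}
```

In this model a variable described with DW_OP_stack_value would fall under ValueSource::Inline, since its value type is expected to be eValueTypeScalar.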
>>
>> So your case seems like a standard bitfield case where the debug info should
>> be adequately describing the bitfield and everything should just work. Are
>> there any reasons why you think this might not be happening?
>
> I am new to these matters, and I'll find out. It would be awesome not
> to need to patch LLDB.
>
> Thanks again, I'll come back as soon as I have better facts,
> gathered by poking around.
>
> Cheers,
>
> Gabor
>
>>
>> Greg
>>