[lldb-dev] ValueObjectChild and SetData

Thu Aug 6 15:58:27 PDT 2020

Thanks for the info. Comments inlined below!

> On Aug 6, 2020, at 1:56 PM, Gabor Greif <ggreif at gmail.com> wrote:
> 
> On 8/6/20, Greg Clayton <clayborg at gmail.com> wrote:
>> 
>> 
>>> On Aug 5, 2020, at 1:50 PM, Gabor Greif via lldb-dev
>>> <lldb-dev at lists.llvm.org> wrote:
>>> 
>>> Hi LLDB devs,
>>> 
>>> short question. Since the method
>>> 
>>> bool ValueObjectChild::SetData(DataExtractor &data, Status &error)
>>> 
>>> doesn't exist, what is the preferred way to update the contents of
>>> scalar bitfields?
>>> 
>>> Is there any code in the repo demonstrating the technique?
>>> 
>>> I am interested, because for the language I am writing a plugin for
>>> certain datatypes are MSBit-aligned, e.g. a Nat16 occupies the upper
>>> portion of the bits in a (32-bit) word.
>>> 
>>> Viewing and setting of such variables thus involves shifting bits, and
>>> I'd expect that ValueObjectChild (in bitfield mode) would do that for
>>> me.
>>> 
>>> Thanks in advance for any clues,
>>> 
>>> cheers,
>>> 
>>>   Gabor
>> 
>> What is the debug information format being used for these? If it is DWARF,
>> the location expression for the variable should take care of extracting the
>> value correctly.
> 
> Hi Greg,
> 
> 
> thanks for the very elaborate answer! Please find my replies inline, below.
> 
> In my case `lldb` is reading DWARF. Things are complicated a bit by the
> fact that I am targeting a Wasm platform (WASI), and thus the formal
> arguments are located in locals (comparable to registers on common
> architectures).
> 
> Are you suggesting that the location expression should massage the
> formal parameter?
> 
> Currently I emit
> ```
> 0x000000db:     DW_TAG_formal_parameter
>                  DW_AT_name	("n")
>                  DW_AT_decl_line	(5)
>                  DW_AT_decl_column	(0x09)
>                  DW_AT_type	(0x0000002b "Nat16")
>                  DW_AT_location	(DW_OP_WASM_location 0x0 +1, DW_OP_stack_value)
> ```

DW_OP_stack_value implies that after running this expression the value of this variable exists on the DWARF stack. This should mean that the "Value" would have a ValueType of eValueTypeScalar. I am guessing when you see these variables you always get the entire integer value of all bitfields that shared this integer. Is that correct?

> and `Nat16` is defined as:
> ```
> 0x0000002b:   DW_TAG_base_type
>                DW_AT_name	("Nat16")
>                DW_AT_bit_size	(0x20)
>                DW_AT_data_bit_offset	(0x10)
>                DW_AT_encoding	(DW_ATE_unsigned)
> ```

Interesting, from reading the DWARF specification, it is legal for a DW_TAG_base_type to have a DW_AT_bit_size and DW_AT_data_bit_offset, but LLDB is currently not handling this situation. We handle these for bitfields, which are currently attached to DW_TAG_member of a struct. 
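
As a rough illustration of what honoring these two attributes amounts to, here is a minimal sketch of reading and writing a field inside a 32-bit word. The helper names are hypothetical and this is not LLDB code; it assumes little-endian bit numbering, where the bit offset counts up from the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not LLDB API. Assumes a 32-bit storage unit,
// bit_offset < 32, and bit_offset counted from the least significant bit.
using u32 = std::uint32_t;

// Mask with the low `bit_size` bits set (bit_size of 32 or more keeps all bits).
constexpr u32 low_mask(unsigned bit_size) {
  return bit_size >= 32 ? 0xffffffffu : ((1u << bit_size) - 1u);
}

// Read: shift the field down to bit 0, then drop anything above it.
constexpr u32 extract_field(u32 word, unsigned bit_offset, unsigned bit_size) {
  return (word >> bit_offset) & low_mask(bit_size);
}

// Write: clear the field's bits in the word, then OR in the new value.
constexpr u32 insert_field(u32 word, unsigned bit_offset, unsigned bit_size,
                           u32 value) {
  return (word & ~(low_mask(bit_size) << bit_offset)) |
         ((value << bit_offset) & (low_mask(bit_size) << bit_offset));
}

// Compile-time checks for a 16-bit field stored in the upper half of a word.
static_assert(extract_field(0xABCD1234u, 16, 16) == 0xABCDu, "read upper half");
static_assert(insert_field(0xABCD1234u, 16, 16, 0x00FFu) == 0x00FF1234u,
              "write upper half");

// Runtime round trip: what is written to the field is what is read back.
inline void check_round_trip() {
  const u32 word = insert_field(0u, 16, 16, 0xBEEFu);
  assert(extract_field(word, 16, 16) == 0xBEEFu);
}
```

The write helper is the part a SetData-style update would need: the neighboring bits in the word have to be preserved while only the field is replaced.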

So we have two options to fix these kinds of variables:
- fix LLDB to handle the DW_AT_data_bit_offset and DW_AT_bit_size on DW_TAG_base_type types (requires LLDB fix)
- fix the DW_AT_location expression to shift and mask the integer with extra DW_OP opcodes (no fix required in LLDB)

The expression could be modified to apply the data bit offset itself. Note that DW_OP_stack_value terminates the expression, so it has to come last:

DW_AT_location	(DW_OP_WASM_location 0x0 +1, DW_OP_const1u(0x10), DW_OP_shr, DW_OP_const4u(0xffffffff), DW_OP_and, DW_OP_stack_value)

To break this down:

This gets the integer value and places it on the stack:
  DW_OP_WASM_location 0x0 +1

  stack[0] = full_value

This pushes the data bit offset onto the stack:	

  DW_OP_const1u(0x10)

  stack[0] = full_value
  stack[1] = 0x10

This shifts the full_value to the right by 0x10:

  DW_OP_shr

  stack[0] = full_value >> 0x10

Now we need to make a mask that keeps only the low DW_AT_bit_size bits:

  DW_OP_const4u(0xffffffff)

  stack[0] = full_value >> 0x10
  stack[1] = (1 << 0x20) - 1 (which makes the mask of 0xffffffff)

Now we mask off the high bits using the mask we just created:

  DW_OP_and

  stack[0] = (full_value >> 0x10) & 0xffffffff

Now the value of the variable is correct.
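
To make the stack walk-through concrete, here is a toy model of just the three operations involved (push a constant, shift right, bitwise and). It is a sketch for checking the arithmetic, not LLDB's DWARF expression evaluator; the names `Op`, `Step`, `evaluate` and `nat16_from_word` are made up for this example, and the initial stack entry stands in for the value that DW_OP_WASM_location produces.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of the operations used in the breakdown; hypothetical names.
enum class Op { Const, Shr, And };

struct Step {
  Op op;
  std::uint64_t operand; // only meaningful for Op::Const
};

// Runs `steps` on a stack seeded with `initial` and returns the top entry.
// Assumes shift counts are below 64.
inline std::uint64_t evaluate(std::uint64_t initial,
                              const std::vector<Step> &steps) {
  std::vector<std::uint64_t> stack{initial};
  for (const Step &s : steps) {
    if (s.op == Op::Const) {
      stack.push_back(s.operand);
      continue;
    }
    // Binary operations pop two entries; `rhs` is the one pushed last.
    assert(stack.size() >= 2);
    const std::uint64_t rhs = stack.back();
    stack.pop_back();
    const std::uint64_t lhs = stack.back();
    stack.pop_back();
    stack.push_back(s.op == Op::Shr ? (lhs >> rhs) : (lhs & rhs));
  }
  return stack.back();
}

// The sequence from the breakdown: const 0x10, shr, const 0xffffffff, and.
inline std::uint64_t nat16_from_word(std::uint32_t full_value) {
  const std::vector<Step> steps = {
      {Op::Const, 0x10},
      {Op::Shr, 0},
      {Op::Const, 0xffffffffu},
      {Op::And, 0},
  };
  return evaluate(full_value, steps);
}
```

With the DW_AT_bit_size of 0x20 given in the DIE, the mask keeps every bit that survives the shift, so the shift alone determines the result here. A 16-bit field would use a mask of 0xffff instead.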

>> 
>> A bit more info: Each ValueObject has a "Value m_value;" member variable
>> that contains the value of the variable. This "Value" object has many member
>> variables:
>> 
>> class Value {
>>  Scalar m_value;
>>  Vector m_vector;
>>  CompilerType m_compiler_type;
>>  void *m_context;
>>  ValueType m_value_type;
>>  ContextType m_context_type;
>>  DataBufferHeap m_data_buffer;
>> };
>> 
>> The "m_value_type" helps us to know how to interpret the value itself which
>> is either contained in "Scalar m_value;" or "DataBufferHeap m_data_buffer;".
>> ValueType is one of:
>> 
>>  enum ValueType {
>>    // m_value contains...
>>    // ============================
>>    eValueTypeScalar,      // raw scalar value
>>    eValueTypeVector,      // byte array of m_vector.length with endianness of
>>                           // m_vector.byte_order
>>    eValueTypeFileAddress, // file address value
>>    eValueTypeLoadAddress, // load address value
>>    eValueTypeHostAddress  // host address value (for memory in the process that
>>                           // is using liblldb)
>>  };
>> 
>> eValueTypeScalar means that the value itself is actually in "Scalar
>> m_value;". This is the typical way that built in types (ints, floats, chars,
>> etc) get resolved. For bitfields, this value should already be shifted
>> around as necessary when the location information from the debug info was
>> parsed and used to create the value at a specific location in the code.
> 
> I am not sure I can observe this. But I'll go hunting.
> 
>> 
>> eValueTypeFileAddress means that "m_value" contains a "file address" which
>> points to the location of the variable in memory. This value will need to be
>> converted to a "load address" when we extract the value and then we will
>> read the value from process memory each time we get the location.
>> 
>> eValueTypeLoadAddress means that "m_value" contains a "load address" which
>> points to the memory address in the process where we will read the value
>> from.
> 
> While stepping around I have seen `eValueTypeLoadAddress`.
> 
>> 
>> eValueTypeHostAddress means that "m_value" contains an address in the LLDB
>> process itself. This is typically used for variables that are constructed
>> with complex location expressions that might say "2 bytes of my value are at
>> XXX, 4 bytes of my value are in this register and 2 bytes are constant". So
>> when we evaluate the location expression, it will hand us a buffer that
>> contains the variable value.
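
The four cases described above can be summarized in a small stand-alone sketch. The enum below only mirrors the quoted one so that the example compiles on its own, and `value_source` and `needs_process_memory` are hypothetical helpers written for illustration; they are not functions that exist in LLDB.

```cpp
#include <cassert>

// Stand-alone mirror of the quoted enum, for illustration only.
enum ValueType {
  eValueTypeScalar,
  eValueTypeVector,
  eValueTypeFileAddress,
  eValueTypeLoadAddress,
  eValueTypeHostAddress
};

// Hypothetical helper: where do the variable's bytes come from?
inline const char *value_source(ValueType type) {
  switch (type) {
  case eValueTypeScalar:
    return "the scalar stored in m_value";
  case eValueTypeVector:
    return "the byte array stored in m_vector";
  case eValueTypeFileAddress:
    return "process memory, after converting the file address to a load address";
  case eValueTypeLoadAddress:
    return "process memory at the load address";
  case eValueTypeHostAddress:
    return "a buffer inside the LLDB process itself";
  }
  assert(false && "unhandled ValueType");
  return "unknown";
}

// Hypothetical helper: true when the value is read from the memory of the
// process being debugged rather than held by the debugger.
inline bool needs_process_memory(ValueType type) {
  return type == eValueTypeFileAddress || type == eValueTypeLoadAddress;
}
```

For the bitfield question this matters because a scalar value has no address to write back to, whereas the two address kinds point at memory in the debugged process.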
>> 
>> So your case seems like a standard bitfield case where the debug info should
>> be adequately describing the bitfield and everything should just work. Are
>> there any reasons why you think this might not be happening?
> 
> I am new to these matters, and I'll find out. It would be awesome not to
> need to patch LLDB.
> 
> Thanks again, I'll come back as soon as I have better facts,
> gathered by poking around.
> 
> Cheers,
> 
>     Gabor
> 
>> 
>> Greg
>> 
>>> _______________________________________________
>>> lldb-dev mailing list
>>> lldb-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev