[PATCH] D84234: [DWARFYAML] Implement the .debug_loclists section.

Thu Jul 23 02:34:54 PDT 2020

jhenderson added a comment.

The DWARF v5 standard explicitly allows for a value of 0 for the offset_entry_count, meaning there are no offset entries. My reading is that this is allowed even if there are location lists, but I don't see any testing for that situation, I think? I believe the same applies for .debug_rnglists, but I haven't looked to see if you implemented that there.

================
Comment at: llvm/lib/ObjectYAML/DWARFEmitter.cpp:446-448
+writeDWARFExpression(raw_ostream &OS,
+                     const DWARFYAML::DWARFOperation &Operation,
+                     uint8_t AddrSize, bool IsLittleEndian) {
----------------
Higuoxing wrote:
> labath wrote:
> > Higuoxing wrote:
> > > jhenderson wrote:
> > > > labath wrote:
> > > > > jhenderson wrote:
> > > > > > One of the things yaml2obj is useful for is for creating broken ELFs. I see you're adding further checks here to make sure the `Values` array is the right size for the type. But what if you want to have an incorrect number of values for a given operation?
> > > > > > 
> > > > > > There are similar issues elsewhere, I'm sure, so I'm not saying this specific patch needs changing at this point, but I do think you need to consider how to solve this case. For example, it might mean being more specific with the type of value by using distinct fields for different types. For example, instead of `Values: [0x1234, 0x4321]` you might want something like:
> > > > > > ```
> > > > > > Values:
> > > > > >   - Type: SLEB128
> > > > > >     Value: 0x1234
> > > > > >   - Type: Address
> > > > > >     Value: 0x4321
> > > > > > ```
> > > > > > or even:
> > > > > > ```
> > > > > > Values:
> > > > > >   - SLEB128: 0x1234
> > > > > >   - Address: 0x4321
> > > > > > ```
> > > > > > What do you think?
> > > > > I've been wondering about that too. I personally don't find the ability to add more/less "values" than the appropriate DW_LLE kind expects useful, because the yaml structure will have no relation to how the result ends up being parsed. And the extra info needed to make this operation well-defined makes the yaml more verbose for the normal cases.
> > > > > 
> > > > > What I would find useful is if, at certain strategic levels, it would be possible to insert an arbitrary hex blob instead of fully formed yaml. One such place could be the dwarf expression -- instead of all the DW_OP thingies we could just add a hex blob. That would also give an escape hatch for the obj2yaml path so that it can serialize any DWARF expression with unknown/unsupported opcodes as hex.
> > > > > 
> > > > > Another such level could be the level of an individual location list -- if there's anything funny about the location list, one could put the entire list as hex, even though some parts of it could be valid and representable normally. I think that's fine because this is not likely to be needed often. Or maybe this could apply to an entire debug_loclists contribution. Or both...
> > > > > 
> > > > > At the last level, one could specify the entire section contents as hex, which is sort of what we have now.
> > > > That makes sense to me. There are some cases in the ELF code, where we fall back to using hex section contents for known section types, when the content is unparseable for whatever reason.
> > > > One of the things yaml2obj is useful for is for creating broken ELFs. I see you're adding further checks here to make sure the Values array is the right size for the type. But what if you want to have an incorrect number of values for a given operation?
> > > 
> > > Sounds good! What about having an additional field called `Types`. If the `Types` field is specified, yaml2obj emits values according to the `Types` array. If it doesn't exist, yaml2obj emits values according to the spec.
> > > 
> > > ```
> > > - Operator: DW_LLE_startx_endx
> > >   Types:    [ ULEB128, Hex64 ]
> > >   Values:   [ 0x1234, 0x4321 ]
> > > ```
> > > 
> > > The benefit is that we don't have to specify the `Value` field every time. The number of operands can also be adjusted via the `Types` array.
> > > 
> > > 
> > > ```
> > > - Operator: DW_LLE_startx_endx
> > >   Types:    [ ULEB128, Hex64, Hex16 ]
> > >   Values:   [ 0x1234, 0x4321, 0x12 ]
> > > ```
> > I sort of like that because it leaves the common case succinct, but it still seems like more flexibility than needed. I am not really opposed to that -- I'm sure it can be useful at times -- but I'm not sure the use will be frequent enough to justify the added complexity (you'd still need to handle all the situations when the types array is not specified _and_ also the case where the type array is there). Since yaml tests are typically done by taking obj2yaml output and modifying/reducing it, and obj2yaml will never produce output like that, I'm not sure how many people will even discover it.
> > 
> > Nit: I don't think HexNN is a good name because it bears no relation to how the input is written (one can just as easily write the value array in decimal), and the fact that the output is going to be hex (binary) is kinda obvious. Maybe just IntNN, or SignedNN/UnsignedNN is singedness matters (I don't think it does).
> > Nit: I don't think HexNN is a good name because it bears no relation to how the input is written (one can just as easily write the value array in decimal), and the fact that the output is going to be hex (binary) is kinda obvious. Maybe just IntNN, or SignedNN/UnsignedNN is singedness matters (I don't think it does).
> 
> Oh, thanks a lot! I've upload a diff revision D84386 that illustrates my idea. Any thoughts?
I like the approach personally. I've commented more on D84386.

================
Comment at: llvm/lib/ObjectYAML/DWARFEmitter.cpp:475
+                     uint8_t AddrSize, bool IsLittleEndian) {
+  auto CheckOperands = [&](uint64_t ExpectedOperands) -> Error {
+    if (Operation.Values.size() != ExpectedOperands) {
----------------
It looks like this lambda could be using `checkListEntryOperands` with a bit of a tweak to that function's name.

================
Comment at: llvm/test/tools/yaml2obj/ELF/DWARF/debug-loclists.yaml:351
+
+## f) Test that location descriptions can be omitted in the YAML description.
+
----------------
omitted in -> omitted from

================
Comment at: llvm/test/tools/yaml2obj/ELF/DWARF/debug-loclists.yaml:426
+##                                                             ^- DW_LLE_start_length
+# ADDRSIZE32-NEXT: 0x00000020 34120000 a1860100                   4.......
+##                            ^-------                            operands[0] (4-byte)
----------------
Missing a comment for the end of list markers in both sequences?

================
Comment at: llvm/test/tools/yaml2obj/ELF/DWARF/debug-loclists.yaml:500
+
+## j) Test that an invalid address_size can be used when there are no address-using operators.
+
----------------
I think this case is tested by case e)?

================
Comment at: llvm/test/tools/yaml2obj/ELF/DWARF/debug-loclists.yaml:525-526
+
+## k) Test that yaml2obj emits an error message if we specify invalid numbers of operands
+## for a location list encoding.
+
----------------
Looking at this, I think it might be best to test the majority of this in a gtest unit test. It seems like there's a lot of detail that would better fit there. I might be wrong though. What do you think?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84234/new/

https://reviews.llvm.org/D84234