[PATCH] D80168: For --relativenames, handle dwarf absolute include directories similarly to compilation directories.

Fri May 29 01:35:55 PDT 2020

labath added a comment.

In D80168#2061590 <https://reviews.llvm.org/D80168#2061590>, @echristo wrote:

> In D80168#2061323 <https://reviews.llvm.org/D80168#2061323>, @labath wrote:
>
> > In D80168#2060851 <https://reviews.llvm.org/D80168#2060851>, @saugustine wrote:
> >
> > > In D80168#2060753 <https://reviews.llvm.org/D80168#2060753>, @aprantl wrote:
> > >
> > > > Why is it necessary to check in the binary? Could we just assemble the input with llvm-mc on the fly?
> > >
> > >
> > > Because when you assemble with llvm-mc in the usual test suite, it sets the compilation directory, include directory, and input file like so:
> > >
> > > comp_dir: location where llvm-mc is run.
> > >  include_dir: absolute path to relativenames.s in the source tree
> > >  filename: "relativenames.s"
> > >
> > > So, after the comments above that we *don't* want the remove the include dir, even if it is an absolute path, the difference between the output with relative names and without is nothing: We still get the absolute path.
> >
> >
> > You don't actually have to use the `.loc`/`.file`/etc. directives to create the line table. You can go lower level and spell out the debug_lines content directly (`.byte` et al.). That way, you should be able to create any line table you want. There are a bunch of existing tests for llvm-dwarfdump line table functionality that do this so you can get some inspiration from those.
>
>
> I think if we're worried about the behavior changing when creating and want to make sure we can still handle it then we should check in binary input (and a note to say what it came from), if not then we can assemble it up using .loc just fine? Pavel: What kinds of tests were the dwarfdump .byte ones coming from?

Well, let me start of with saying that I haven't been paying close attention to this patch (I think I have a rough idea of what it does, but I haven't looked more closely) -- my comment was going off of the "this can't be produced with llvm-mc" claim. And I was writing it in a hurry since it was getting late.

Anyway, I believe there is general consensus that tests for debug info parsing/dumping code should be written close to the form in which the parser will read them. This is to ensure we have known static data to test against, and avoid having matching bugs in the producer and consumer "cancel each other out" (produce incorrect/nonconforming output, which can still be read my a matching version of the parsing tool). I also believe there is (maybe slighly less general) consensus that we should avoid checking in binary data. The main reason I don't like those is that they are not reviewable, and often not reduced.

For most debug info sections that makes the choice quite clear -- use llvm-mc with the appropriate `.byte` directives (most of the time obtained by tweaking/reducing the output "clang -S"). I think that strikes a good balance between "ensuring a static output" and "reviewability/modifiability/etc.", and I believe most of the existing tests follow that. In this sense the debug_line table is very unique, because there are multiple ways that one can generate line tables with llvm-mc, and the way that is "normally" done is still very far from the actual on-disk representation -- the `.loc/.line` directives leave a fair amount of freedom in how they end up being implemented.

So, my point was that if we cannot get llvm-mc to reliably produce the required line tables with `.line` directives (that's what I thought it uses -- it looks like it goes even higher-level than that with `llvm-mc -g`), then a level which would offer a similar kind of compromise/experience to other debug info sections would be spelling out the line table with `.byte`s.

Now, to answer your question, the main .debug_line `.byte` tests that I've worked with are those that test error handling on invalid inputs (e.g. `tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s`). That's a pretty clear use case for this, as tables generated with `.line` should always be valid. However, they don't seem to be the only ones: e.g. `tools/llvm-dwarfdump/X86/debug-line-dw-lne-end-sequence.s`, and `tools/llvm-dwarfdump/X86/debug-line-dw-lns-copy.s` generate valid line tables but they test a very specific line table feature so they use .byte directives to ensure that it is generated. One test I wrote needed more than 256 file name entries. That could have been produced in a number of ways but a combination of `.byte` directives with assembler metaprogramming meant that the result was very short but still explicit.

That said, when I look at this test more closely, it's not clear to me whether that is really needed -- the steps in the comments (`mkdir`, `cd`) could be executed as a part of the test too, except that it would `cd` to `%T/something` instead of `/tmp`. Then the test could use the presence of `something` in the output path to check whether it is printing an absolute path or not. However I don't think that going the `.byte` route would be bad either. There's also a unit-test framework (see `unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp`) that could also be used to generate interesting line tables, but I am personally not very fond of it. However, using a binary file does not seem like a good choice to me, because of its opaqueness.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80168/new/

https://reviews.llvm.org/D80168