[PATCH] D115973: [llvm-profgen] Support symbol loading for debug fission

Mon Feb 14 18:01:30 PST 2022

dblaikie added a comment.

In D115973#3321423 <https://reviews.llvm.org/D115973#3321423>, @wlei wrote:

> In D115973#3321118 <https://reviews.llvm.org/D115973#3321118>, @dblaikie wrote:
>
>> Bit of history - way back in the day, we used checked in binaries (objects or executables) to test binary utilities like `llvm-symbolizer` (the thinking being that that was the input to the tool, we didn't want the test to cover more/unrelated tools (like assemblers) so that they were isolated and wouldn't fail for unrelated reasons (like the assembler being broken)) This made tests hard to read, etc.
>>
>> Eventually (I think with lld testing, maybe before) that softened to "let's check in assembly and depend on the LLVM integrated assembler to assemble those into binaries and test the resulting binaries" - the thinking being that the assembler is simple/robust enough, and the readability/maintainability improvement for the tests was worth the small chance of unrelated failures.
>>
>> But it's not easy to hand-craft DWARF assembly (or some other object file features) - so they're still kind of read-only tests, usually generated from some source code provided in a comment or in a side file (like one in Inputs, that isn't actually read by any test, but exists there for documentation and if the test assembly needs to be modified/regenerated).
>>
>> There's also yaml2obj - which I think lets you make somewhat hand-writable DWARF for some DWARF features at least and more object/non-DWARF features - which, if possible, would be preferred and it'd be great if it were improved to the point where lots of these object utility tests could be written in terms of hand-crafted/directly editable yaml2obj inputs - but I don't think it has enough feature coverage for that to entirely be the case yet. Probably more "you could generate an object file, obj2yaml it, then maybe it'd be useful for minor edits/easier to read than raw assembly, but you'd probably still want to include the source/compilation command in a comment or side file for posterity/documentation/etc".
>>
>>> You see, in llvm-symbolizer test https://github.com/llvm/llvm-project/tree/main/llvm/test/tools/llvm-symbolizer/Inputs, there're still many exe for split dwarf. This is exactly the same to this patch, I'm curious is there any concerns not moving llvm-symbolizer test into cross-project-tests? cc  @jhenderson @dblaikie
>>
>> Yeah, cross-project-tests in my fairly stern opinion, should not be a substitute for in-tree tests. All functionality should include in-subproject testing where the functionality resides (if you touch clang source in a way that's observable (ie: not NFC) - then there should be a clang test that goes with it). cross-project-tests can be used for integration. (eg: if you test that clang generates this IR, and you test that LLVM optimizes that IR to something important - you could have a cross-project-test that verifies that compiling some source code with clang generates that important assembly, so that the handshake between clang and LLVM doesn't get out of sync/lost).
>
> @dblaikie This's very informative, I have learned more contexts, thanks!
>
> I forgot to provide the special context about this testing: the purpose of this testing is to test whether llvm-profgen can use split dwarf binary as input to generate a non-empty profile, it doesn't really care about the source code and it doesn't intend to test any debug info features or the integration(clang or lld), we should have other dedicated testings for them. Hence, to narrow the scope, it just used a very simple source code(with building instructions in comments), it's very unlikely to break any debug info features or integration, I have never got any related failures in other tests of llvm-profgen before. So I lean toward keeping it into the in-tree tests.
>
> Also as shown in this version, it seems it doesn't improve the readability of the test using yaml2obj or assembly. While testing llvm-profgen, it's actually designed for read-only. People reproducing this need to call the clang cmd offline to generate the obj/exe, call obj2yaml, then spend some time to reduce the file, which seems inconvenient.

I've missed a step here - it'd still be desirable, if possible, to write the test with something like yaml2obj to make the input human readable so it would be easier to understand the test - the actual inputs and expected behavior. Though the current yaml isn't a great representation of that (I think the current yaml in this patch is probably worse than assembly - but there are maybe better options): you can see how to write DWARF in yaml in this test file: `llvm/test/tools/yaml2obj/ELF/DWARF/debug-info.yaml` - I'm not sure it's still /ideal/ for writing test cases, so it'd be interesting to see how it looks for an issue like this. (it might not/probably doesn't support things like dwp indexes... and having those as raw hex strings doesn't sound ideal, but I guess we don't have anything that generates annotated/legible assembly for them either... )

But, yeah, if all the tools we have at the moment don't make for a moderately readable textual form, I wouldn't fundamentally object to checked in object files and source code/repro steps. Possible that the missing element right now is textual/readable dwp support, so maybe that's the limit/justification for using checked in binaries for this issue.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115973/new/

https://reviews.llvm.org/D115973