[cfe-dev] [RFC] New ClangDebuggerSupport Library

Tue Dec 13 11:18:24 PST 2016

> On Dec 13, 2016, at 10:50 AM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Tue, Dec 13, 2016 at 10:42 AM Chris Bieneman <cbieneman at apple.com> wrote:
>> On Dec 13, 2016, at 8:59 AM, David Blaikie via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>> 
>> 
>> 
>> On Mon, Dec 12, 2016 at 4:59 PM Chris Bieneman <cbieneman at apple.com> wrote:
>>> On Dec 12, 2016, at 4:40 PM, David Blaikie via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Dec 12, 2016 at 4:23 PM Chris Bieneman <cbieneman at apple.com> wrote:
>>>> On Dec 12, 2016, at 4:13 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Mon, Dec 12, 2016 at 4:09 PM Chris Bieneman <cbieneman at apple.com> wrote:
>>>> David, the two approaches address very different problems.
>>>> 
>>>> The YAML tools are focused on a bit-for-bit identical round trip path for DWARF into and out of YAML. The goal with that work is to be able to generate a test suite from the output of many different versions of many different compilers. This is specifically with the goal of creating LIT-style tests that read DWARF and operate on it.
>>>> 
>>>> Ah, thanks for explaining.
>>>> 
>>>> These tests wouldn't appear in LLVM/Clang's test suite, then, right? So normal regression tests for the ClangDebuggerSupport library would be written as unit tests using Greg's DWARF-generation library?
>>> 
>>> My goal is actually to have reduced test cases based on the YAML tools in the clang test suite. LLDB's use of clang APIs with DWARF generated by mismatched compilers is the source of many issues for the debugger, so having basic testing of DWARF generated by alternate compilers in Clang is highly desirable.
>>> 
>>> Well, having DWARF that's representative of that generated by alternate compilers is important - and it seems like Greg's work on the unit test API for creating DWARF should still allow that. Seems reasonable to continue to enhance that to produce any DWARF we care about (since we'll need to generate it to test the DWARF parsing APIs - so that's a prerequisite before we worry about whether the ClangDebuggerSupport library can do something sensible with it, right?)
>> 
>> I haven't dug too deep into Greg's work (although I certainly will). Where it makes sense I may even try and leverage his APIs in the YAML tools (as I have been leveraging the existing DWARF parser).
>> 
>> In my (limited) discussions with Greg, it didn't seem like creating bit-for-bit identical DWARF was something his APIs were suited to.
>> 
>> It seems strange to me that this would not be a need for his work, yet be a need for yours. Your work would presumably layer on top of his (got to parse the DWARF first, before you can build Clang ASTs from it).
> 
> The dwarfgen APIs are designed around being able to create a DIE and have the APIs create the accompanying abbreviations. That means you don't have direct control over the bit layouts.
> 
> One of the complications here is that in DWARF there is no "one true way" to encode to binary. Since my tests need to exercise the outputs of other compilers, I really need bit-for-bit identical translations.
> 
>> 
>> Any irregular DWARF is going to be /more/ interesting (weird bit twiddling, etc) for the DWARF parsing API testing (because most of it would just error out, and that which doesn't would need to have positive test coverage as well) than for the Clang AST generation part that happens after that.
> 
> Bit twiddling of individual fields should be possible in the dwarfgen APIs, but there is no API for explicit specification of DWARF bytes other than just encoding a byte array (like we do in the disassembler tests).
> 
>> 
>> It seems backwards to need more fidelity/control when creating inputs for testing the Clang AST generation, than for the API below it for parsing DWARF.
> 
> I don't disagree with your assertion here. I need bit-for-bit identical data encodings, so I'm writing a solution that provides that.
> 
> OK - so you disagree with Greg's approach to testing the underlying APIs your tests rely on?

That's not entirely what I said. I think Greg's approach is appropriate for a specific subset of tests: those that validate the DWARF parser with known-good DWARF data in a reasonable format. I'm testing much more complicated inputs.

> 
> Perhaps you two could hash that out, since you're working in the same space/same project/company/etc. This seems inconsistent and I think it's reasonable to ask for the strategy here to become consistent in some way.
> 
> I think that way would probably be to make the dwarfgen APIs sufficiently descriptive to be able to generate the inputs you require (& to convince Greg that he needs to test those inputs too/instead - many of them should just be error cases in LLVM's DWARF parsing APIs, before it even reaches your ClangDebuggerSupport code - and those that aren't, should probably still have full fidelity testing in the DWARF parsing APIs before reaching the Clang side) and just use that.
> 
> We don't generally generate uneditable test cases for LLVM projects (yeah, I've done some of that - we have binaries checked in to test llvm-dwarfdump, but that's being addressed by this whole discussion)

That is really not true for the tests that have object-file inputs, and the YAML-based tests aren't completely uneditable. I've found that the object file YAML tests can be reduced and simplified, and the fields of the entries can be edited.

> 
> Here's what I'd suggest/have in mind: starting with dwarfgen, and seeing how it works for creating the sort of test cases you have in mind. When there are particular difficulties, then it might be good to look at the concrete examples and decide how best to approach them. I'm sure there's lots of improvements, higher level APIs, etc, we could make around the dwarfgen code (creating DWARF is verbose - in any format, I think - helpers and utilities could be really handy, for sure).

Dwarfgen cannot create the types of tests I need, and having to write the tests as API calls would be wildly unwieldy. For example, one of the expression parser tests in LLDB (issue_11588) generates over 6k lines of dwarfdump output. Regenerating that DWARF through API calls would make writing the test case a significant burden.

Having the ability to convert it to YAML and back to identical DWARF makes the test case generation easy.

Having test cases as YAML instead of binary makes them at least human readable.

I'm not opposed to using the dwarfgen APIs when generating the binary from YAML if the APIs make sense, but the YAML is useful.
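
As a rough illustration of the "no one true way" point quoted above: DWARF stores many values as ULEB128, and that encoding is not unique, because a value may carry redundant continuation bytes. The Python sketch below is not taken from obj2yaml, yaml2obj, or dwarfgen; the function names and the pad_to parameter are made up for illustration. It only shows that two different byte sequences decode to the same value, which is why a generator that chooses its own encoding cannot promise bit-for-bit identical output.

```python
def encode_uleb128(value, pad_to=0):
    """Encode a non-negative integer as ULEB128.

    pad_to is a hypothetical knob: when larger than the minimal length,
    redundant continuation bytes are emitted so the result is pad_to bytes.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value != 0 or len(out) + 1 < pad_to:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            break
    return bytes(out)


def decode_uleb128(data, offset=0):
    """Decode one ULEB128 value; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            break
    return value, pos - offset


minimal = encode_uleb128(2)           # one byte
padded = encode_uleb128(2, pad_to=2)  # same value, two bytes

# Both decode to 2, but the byte sequences differ, so only a
# representation that records the original bytes can round-trip exactly.
same_value = decode_uleb128(minimal)[0] == decode_uleb128(padded)[0]
same_bytes = minimal == padded
```

Here same_value is true while same_bytes is false. A text format that mirrors the binary layout can keep that distinction; an API that only takes "the value 2" has to pick one encoding.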

-Chris

> 
> - Dave
>  
> 
> -Chris
> 
>>  
>> 
>> In YAML I've made the textual representation mirror the binary representation to a degree that the translation from YAML to binary has very little logic to it. As a point of context the YAML->DWARF implementation for dumping debug_abbrev, debug_str, and debug_aranges is under 100 lines of code.
>> 
>>>  
>>> 
>>>> 
>>>> Large tests generated from other compilers on raw source I would expect to appear in something like the test-suite, rather than in an LLVM project's regression or unit test suite.
>>> 
>>> Large tests will certainly not be included in the clang test suite. YAML representations of DWARF should enable us to make reduced test cases in many situations, and where we cannot we will put the test in an external suite.
>>> 
>>>> 
>>>> Why the need for round tripping, then? Would it be sufficient for the test-suite to have binaries checked in next to info about what compiler generated them?
>>> 
>>> The benefit of supporting round tripping in and out of a text-based format is that we may be able to reduce the test cases to things that we can include in the Clang test suite.
>>> 
>>>> (& why not just have the source checked in & run a variety of buildbot configurations (or one meta-configuration that could enumerate a variety of compilers) with different host compilers to test the behavior? That's how GDB's test suite works (for better and worse, don't get me wrong - there are things that could be improved from that position))
>>> 
>>> This is actually basically how the LLDB test suite works. There is one huge drawback to this. Not everyone has access to every compiler we want to support, and certainly most people don't have them all installed. As a result having source-based tests means that many people may not be able to reproduce test failures locally. Using YAML encodings to generate the binary DWARF removes the compiler from the picture, and allows everyone to test every compiler's output.
>>> 
>>> Fair - so why YAML rather than something more like the unit tests Greg's working on in LLVM?
>> 
>> I mostly gravitated to YAML because I have experience using YAML-based tests for libObject code, and have found it very useful to be able to translate binaries in and out of YAML for testing.
>> 
>>> 
>>> (this is clearly my preference - to use the unit test type API, since in both Greg and your case, you're testing an API, not a tool, so it seems cool/fine/reasonable to have an API for generating the input.
>> 
>> I actually expect in my use case that I'll be testing both APIs and one or more tools. My intention is to write a tool that reads dwarf and dumps Clang ASTs. For that purpose having a YAML->DWARF generator is ideal.
>> 
>> Also for my use case YAML has an added advantage that when a user reports an issue I can either take a binary or YAML file from the user, and textually reduce that down to a test case which could live in-tree.
>> 
>>> 
>>> But the alternative question would be: Why not test the LLVM DWARF parsing API Greg's testing, with this yaml input instead of the unit test API?)
>> 
>> Personally, I think having both types of tests is valuable. Unit tests of APIs are particularly valuable for writing small-grained tests, with limited input sizes. When I start running down the path of constructing Clang ASTs from complex C++ programs, the code required to generate that DWARF in a unit test could be substantial, and that would make it a lot harder to write tests.
>> 
>> Converting a binary to a YAML file is easy; hand-crafting DWARF from APIs might not be.
>> 
>> -Chris
>> 
>>>  
>>> 
>>> -Chris
>>> 
>>>> 
>>>> - Dave
>>>>  
>>>> 
>>>> -Chris
>>>> 
>>>>> On Dec 12, 2016, at 3:57 PM, David Blaikie via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>>> 
>>>>> I realize work is already underway/being committed here, but figured discussing the following in this thread rather than on some random commit email might be better.
>>>>> 
>>>>> We now have two ways of generating DWARF, both committed in relation to a similar effort to integrate LLDB better with the rest of the LLVM project.
>>>>> 
>>>>> There's this YAML effort, to help test the library that will allow the generation of Clang ASTs from DWARF. (currently such code resides in LLDB, and it's proposing to be rolled up into Clang here)
>>>>> 
>>>>> Then there's Greg's effort to provide a unit test API for generating DWARF for unit testing LLVM's DWARF parsing APIs for use in LLDB (currently what LLVM has was a fork of LLDB's, and Greg's working on reconciling that, rolling in LLDB's post-fork features, then migrating LLDB to use the fully featured LLVM version)
>>>>> 
>>>>> Why are these done in two different ways? They seem like really similar use cases - generating DWARF for the purpose of testing some (LLVM or Clang) API that consumes DWARF bytes.
>>>>> 
>>>>> Could we resolve this in favor of one approach or the other - I'm somewhat partial to the API approach & writing unit tests against the ClangDebuggerSupport library, myself.
>>>>> 
>>>>> - David
>>>>> 
>>>>> On Wed, Nov 9, 2016 at 2:26 PM Chris Bieneman via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>>>> Hello cfe-dev,
>>>>> 
>>>>> I would like to propose a new Clang library for implementing functionality that is used by LLDB. I see this as the first step in a long process of refactoring the language interfaces for LLDB.
>>>>> 
>>>>> The short-term goal is for this library to be a place for us to rebuild functionality that exists in LLDB today and relies heavily on the implementation of Clang. As we rebuild the functionality we will build a suite of testing tools in Clang that exercise this library and more general Clang functionality in the same ways that LLDB will.
>>>>> 
>>>>> As bits of functionality become fully implemented and tested, we will migrate LLDB to using the Clang implementations, allowing LLDB to remove its own copies. This will provide the Clang community with a higher confidence that changes in Clang do not break LLDB, and it will provide LLDB with better test coverage of the Clang functionality.
>>>>> 
>>>>> The long-term goal of this library is to provide the implementation for what could some day become a defined debugger<->frontend interface for providing modularized (maybe even plugin-based) language debugging support in LLDB. In the distant future I could see us being able to tell people building new frontends that we have a defined interface they need to implement for the debugger, and once implemented the debugger should “Just Work”.
>>>>> 
>>>>> The first bit of functionality that I would like to build up into the ClangDebuggerSupport library is materialization of Clang AST types from DWARF. To support this development I intend to add a new tool in Clang that reads DWARF types, generates a Clang AST, and prints the AST. I will also add DWARF support to obj2yaml and yaml2obj, so we will be able to write YAML LIT tests for the functionality.
>>>>> 
>>>>> If people are in favor of this general approach I’ll begin working in this direction, and I’ll probably add the new library sometime next month.
>>>>> 
>>>>> Thoughts?
>>>>> -Chris
>>>>> _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>> 