[cfe-dev] [RFC] New ClangDebuggerSupport Library

Tue Dec 13 13:38:53 PST 2016

On Tue, Dec 13, 2016 at 11:52 AM Chris Bieneman <cbieneman at apple.com> wrote:

> On Dec 13, 2016, at 11:29 AM, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>
>
> On Tue, Dec 13, 2016 at 11:18 AM Chris Bieneman <cbieneman at apple.com>
> wrote:
>
> On Dec 13, 2016, at 10:50 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Tue, Dec 13, 2016 at 10:42 AM Chris Bieneman <cbieneman at apple.com>
> wrote:
>
> On Dec 13, 2016, at 8:59 AM, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>
>
> On Mon, Dec 12, 2016 at 4:59 PM Chris Bieneman <cbieneman at apple.com>
> wrote:
>
> On Dec 12, 2016, at 4:40 PM, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>
>
> On Mon, Dec 12, 2016 at 4:23 PM Chris Bieneman <cbieneman at apple.com>
> wrote:
>
> On Dec 12, 2016, at 4:13 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Dec 12, 2016 at 4:09 PM Chris Bieneman <cbieneman at apple.com>
> wrote:
>
> David, the two approaches address very different problems.
>
> The YAML tools are focused on a bit-for-bit identical round trip path for
> DWARF into and out of YAML. The goal with that work is to be able to
> generate a test suite from the output of many different versions of many
> different compilers. This is specifically with the goal of creating
> LIT-style tests that read DWARF and operate on it.
>
>
> Ah, thanks for explaining.
>
> These tests wouldn't appear in LLVM/Clang's test suite, then, right? So
> normal regression tests for the ClangDebuggerSupport library would be
> written as unit tests using Greg's DWARF-generation library?
>
>
> My goal is actually to have reduced test cases based on the YAML tools in
> the Clang test suite. LLDB's use of Clang APIs with DWARF generated by
> mismatched compilers is the source of many issues for the debugger, so
> having basic testing of DWARF generated by alternate compilers in Clang is
> highly desirable.
>
>
> Well, having DWARF that's representative of that generated by alternate
> compilers is important - and it seems like Greg's work on the unit test API
> for creating DWARF should still allow that. Seems reasonable to continue to
> enhance that to produce any DWARF we care about (since we'll need to
> generate it to test the DWARF parsing APIs - so that's a prerequisite
> before we worry about whether the ClangDebuggerSupport library can do
> something sensible with it, right?)
>
>
> I haven't dug too deep into Greg's work (although I certainly will). Where
> it makes sense I may even try and leverage his APIs in the YAML tools (as I
> have been leveraging the existing DWARF parser).
>
> In my (limited) discussions with Greg, it didn't seem like creating
> bit-for-bit identical DWARF was something his APIs were suited to.
>
>
> It seems strange to me that it would not be a need for his work, yet be
> a need for yours. Your work would presumably layer on top of his (got to
> parse the DWARF first, before you can build Clang ASTs from it).
>
>
> The dwarfgen APIs are designed around being able to create a DIE and have
> the APIs create the accompanying abbreviations. That means you don't have
> direct control over the bit layouts.
>
> One of the complications here is that in DWARF there is no "one true way"
> to encode to binary. Since my tests need to test the outputs of other
> compilers, I really need bit-for-bit identical translations.
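
One concrete instance of the "no one true way" point, sketched in Python: ULEB128, the variable-length integer encoding DWARF uses, permits padded encodings, so two different byte sequences can decode to the same value. This is only an illustration and an assumption about the kind of variation meant above; the function names and the `min_len` parameter are hypothetical and are not part of dwarfgen or the YAML tools.

```python
def encode_uleb128(value, min_len=1):
    """Encode a non-negative int as ULEB128, padded to at least min_len bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of the remaining value
        value >>= 7
        # Keep going if bits remain, or if padding was requested (hypothetical knob).
        more = value != 0 or len(out) + 1 < min_len
        out.append(byte | (0x80 if more else 0x00))
        if not more:
            return bytes(out)


def decode_uleb128(data):
    """Decode one ULEB128 value; return (value, number_of_bytes_consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:          # high bit clear marks the last byte
            return result, index + 1
    raise ValueError("truncated ULEB128 value")


minimal = encode_uleb128(5)             # shortest form
padded = encode_uleb128(5, min_len=2)   # same value, different bytes
```

A generator that always emits the shortest form cannot reproduce the padded bytes, which is why a bit-for-bit round trip has to record the encoding that was actually used and not just the decoded value.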
>
>
> Any irregular DWARF is going to be /more/ interesting (weird bit
> twiddling, etc) for the DWARF parsing API testing (because most of it would
> just error out, and that which doesn't would need to have positive test
> coverage as well) than for the Clang AST generation part that happens after
> that.
>
>
> Bit twiddling of individual fields should be possible in the dwarfgen
> APIs, but there is no API for explicit specification of DWARF bytes other
> than just encoding a byte array (like we do in the disassembler tests).
>
>
> It seems backwards to need more fidelity/control when creating inputs for
> testing the Clang AST generation, than for the API below it for parsing
> DWARF.
>
>
> I don't disagree with your assertion here. I need bit-for-bit identical
> data encodings, so I'm writing a solution that provides that.
>
>
> OK - so you disagree with Greg's approach to testing the underlying APIs
> your tests rely on?
>
>
> That's not entirely what I said. I think Greg's approach is appropriate
> for a specific subset of tests: specifically, validating the DWARF parser
> with known-good DWARF data in a reasonable format. I'm testing much more
> complicated inputs.
>
>
> Perhaps you two could hash that out, since you're working in the same
> space/same project/company/etc. This seems inconsistent and I think it's
> reasonable to ask for the strategy here to become consistent in some way.
>
> I think that way would probably be to make the dwarfgen APIs sufficiently
> descriptive to be able to generate the inputs you require (& to convince
> Greg that he needs to test those inputs too/instead - many of them should
> just be error cases in LLVM's DWARF parsing APIs, before it even reaches
> your ClangDebuggerSupport code - and those that aren't, should probably
> still have full fidelity testing in the DWARF parsing APIs before reaching
> the Clang side) and just use that.
>
> We don't generally generate uneditable test cases for LLVM projects (yeah,
> I've done some of that - we have binaries checked in to test
> llvm-dwarfdump, but that's being addressed by this whole discussion)
>
>
> This is very not true for the tests that have object-file inputs, and the
> YAML based tests aren't completely un-editable. I've found that the object
> file YAML tests can be reduced, simplified, and the fields of the entries
> can be edited.
>
>
> Here's what I'd suggest/have in mind: starting with dwarfgen, and seeing
> how it works for creating the sort of test cases you have in mind. When
> there are particular difficulties, then it might be good to look at the
> concrete examples and decide how best to approach them. I'm sure there's
> lots of improvements, higher level APIs, etc, we could make around the
> dwarfgen code (creating DWARF is verbose - in any format, I think - helpers
> and utilities could be really handy, for sure).
>
>
> Dwarfgen cannot create the types of test I need to create,
>
>
> I continue to be confused by this. If there are inputs you need that
> aren't possible to create with dwarfgen - doesn't that mean there are holes
> in the LLVM DWARF parsing API tests?
>
>
> Absolutely, and that should be addressed, but I think that is separate
> from what I'm trying to do. I really don't think we should hold up
> development on a testing method that will cover my needs based on a need to
> overhaul an existing infrastructure that explicitly was not designed to
> cover my needs.
>
> If you disagree with the fundamental design of the dwarfgen APIs, that's
> fine. I wouldn't have done it that way myself, but these are two radically
> different approaches to testing.
>
>
> Why would the AST generation tests need more variation in inputs than the
> underlying DWARF parser tests? It seems to me the exact opposite would be
> desired (the DWARF parser tests would test negative/parse failure cases,
> plus the good cases - the AST generation tests would only need the good
> cases)
>
>
> I'm not arguing that they do. I'm just saying the current testing for the
> underlying APIs is insufficient for me to build testing infrastructure for
> the AST generation. I also don't think it is the right building block for
> what I need. Making dwarfgen work for what I need basically means building
> a completely new set of APIs for generating explicit DWARF.
>
>
> The AST generation test would need a different kind of variety, for sure -
> it would matter to them the difference between inheriting from a
> declaration and a definition - the DWARF parsing code doesn't care about
> that, it's just DIEs and attributes.
>
> But none of that would mean being able to generate something for an AST
> generation test that we couldn't generate for a DWARF parsing test.
>
>
> Sure, we just don't have any DWARF parsing tests (and Greg wasn't planning
> any), that generate the kind of DWARF I need.
>
>
>
> and having to write the tests in API calls will be wildly unwieldy. For
> example, one of the expression parser tests in LLDB (issue_11588) generates
> over 6k lines of dwarfdump output.
>
>
> We don't generally test cases like that in LLVM or Clang - we have a small
> handful of them, when necessary (& I assume if you were to write the API
> for this - you could do so programmatically, so it would actually
> potentially be more compact/readable than the raw dump)
>
>
> The YAML is actually fairly compact and readable. I would actually argue
> that since it is structured to match the actual data layout on disk that it
> is more readable than a blob of API calls.
>
> While I'm not intending to add a large, crazy test case like the one I
> pointed to in LLDB to the Clang test suite, I may want to add tests like
> that to an LLDB external test suite. So I really want the tools to create
> and work with such test cases.
>
> One of my goals in creating this infrastructure is making it so that
> engineers can test every compiler LLDB supports even if they don't have
> that compiler installed.
>

I don't think we're making a lot of headway in this thread - seems to just
be going back and forth. Perhaps we can chat in person (Zach mentioned some
notion of some in person meetings at Google on Thursday - maybe that would
suit if you're around for that), possibly rope in Richard Smith (as code
owner of Clang), if need be to help resolve this.

- Dave

>
> -Chris
>
>
>
> To re-generate that DWARF in API calls would be a significant burden to
> writing a test case.
>
> Having the ability to convert it to YAML and back to identical DWARF makes
> the test case generation easy.
>
> Having test cases as YAML instead of binary makes them at least human
> readable.
>
> I'm not opposed to using the dwarfgen APIs when generating the binary from
> YAML if the APIs make sense, but the YAML is useful.
>
> -Chris
>
>
> - Dave
>
>
>
> -Chris
>
>
>
>
> In YAML I've made the textual representation mirror the binary
> representation to a degree that the translation from YAML to binary has
> very little logic to it. As a point of context the YAML->DWARF
> implementation for dumping debug_abbrev, debug_str, and debug_aranges is
> under 100 lines of code.
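
To show how little logic such a translation needs, here is a minimal Python sketch for debug_str alone. It assumes the YAML has already been loaded into a plain list of strings and relies only on the fact that the section is a run of NUL-terminated strings. The function names are hypothetical, and this is not the obj2yaml/yaml2obj implementation.

```python
def strings_to_debug_str(strings):
    """Emit section bytes: each string followed by a single NUL terminator."""
    return b"".join(s.encode("utf-8") + b"\x00" for s in strings)


def debug_str_to_strings(section):
    """Recover the string list from section bytes (inverse of the above)."""
    if not section:
        return []
    if not section.endswith(b"\x00"):
        raise ValueError("debug_str section must end with a NUL terminator")
    # Drop the final terminator, then split on the remaining ones.
    return [chunk.decode("utf-8") for chunk in section[:-1].split(b"\x00")]


# Hypothetical string table, in on-disk order.
table = ["clang version 4.0.0", "main.cpp", "", "int"]
section = strings_to_debug_str(table)
```

Because the list keeps the on-disk order (including empty and duplicate strings), converting the bytes to the list and back yields the identical section.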
>
>
>
>
>
> Large tests generated from other compilers on raw source I would expect to
> appear in something like the test-suite, rather than in an LLVM project's
> regression or unit test suite.
>
>
> Large tests will certainly not be included in the clang test suite. YAML
> representations of DWARF should enable us to make reduced test cases in
> many situations, and where we cannot we will put the test in an external
> suite.
>
>
> Why the need for round tripping, then? Would it be sufficient for the
> test-suite to have binaries checked in next to info about what compiler
> generated them?
>
>
> The benefit of supporting round tripping in and out of a text-based format
> is that we may be able to reduce the test cases to things that we can
> include in the Clang test suite.
>
> (& why not just have the source checked in & run a variety of buildbot
> configurations (or one meta-configuration that could enumerate a variety of
> compilers) with different host compilers to test the behavior? That's how
> GDB's test suite works (for better and worse, don't get me wrong - there
> are things that could be improved from that position))
>
>
> This is actually basically how the LLDB test suite works. There is one
> huge drawback to this. Not everyone has access to every compiler we want to
> support, and certainly most people don't have them all installed. As a
> result having source-based tests means that many people may not be able to
> reproduce test failures locally. Using YAML encodings to generate the
> binary DWARF removes the compiler from the picture, and allows everyone to
> test every compiler's output.
>
>
> Fair - so why YAML rather than something more like the unit tests Greg's
> working on in LLVM?
>
>
> I mostly gravitated to YAML because I have experience using YAML-based
> tests for libObject code, and have found it very useful to be able to
> translate binaries in and out of YAML for testing.
>
>
> (this is clearly my preference - to use the unit test type API, since in
> both Greg and your case, you're testing an API, not a tool, so it seems
> cool/fine/reasonable to have an API for generating the input.
>
>
> I actually expect in my use case that I'll be testing both APIs and one or
> more tools. My intention is to write a tool that reads DWARF and dumps
> Clang ASTs. For that purpose having a YAML->DWARF generator is ideal.
>
> Also for my use case YAML has an added advantage that when a user reports
> an issue I can either take a binary or YAML file from the user, and
> textually reduce that down to a test case which could live in-tree.
>
>
> But the alternative question would be: Why not test the LLVM DWARF parsing
> API Greg's testing, with this yaml input instead of the unit test API?)
>
>
> Personally, I think having both types of tests is valuable. Unit tests of
> APIs are particularly valuable for writing small-grained tests, with
> limited input sizes. When I start running down the path of constructing
> Clang ASTs from complex C++ programs the code required to generate that
> DWARF in a unit test could be substantial, and that would make it a lot
> harder to write tests.
>
> Converting a binary to a YAML file is easy; hand-crafting DWARF from APIs
> might not be.
>
> -Chris
>
>
>
>
> -Chris
>
>
> - Dave
>
>
>
> -Chris
>
> On Dec 12, 2016, at 3:57 PM, David Blaikie via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> I realize work is already underway/being committed here, but figured
> discussing the following in this thread rather than on some random commit
> email might be better.
>
> We now have two ways of generating DWARF, both committed in relation to a
> similar effort to integrate LLDB better with the rest of the LLVM project.
>
> There's this YAML effort, to help test the library that will allow the
> generation of Clang ASTs from DWARF. (currently such code resides in LLDB,
> and the proposal here is to roll it up into Clang)
>
> Then there's Greg's effort to provide a unit test API for generating DWARF
> for unit testing LLVM's DWARF parsing APIs for use in LLDB (currently what
> LLVM has was a fork of LLDB's, and Greg's working on reconciling that,
> rolling in LLDB's post-fork features, then migrating LLDB to use the fully
> featured LLVM version)
>
> Why are these done in two different ways? They seem like really similar
> use cases - generating DWARF for the purpose of testing some (LLVM or
> Clang) API that consumes DWARF bytes.
>
> Could we resolve this in favor of one approach or the other - I'm somewhat
> partial to the API approach & writing unit tests against the
> ClangDebuggerSupport library, myself.
>
> - David
>
> On Wed, Nov 9, 2016 at 2:26 PM Chris Bieneman via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> Hello cfe-dev,
>
> I would like to propose a new Clang library for implementing functionality
> that is used by LLDB. I see this as the first step in a long process of
> refactoring the language interfaces for LLDB.
>
> The short-term goal for this library is to be a place for us to rebuild
> functionality that exists in LLDB today and relies heavily on the
> implementation of Clang. As we rebuild the functionality we will build a
> suite of testing tools in Clang that exercise this library and more general
> Clang functionality in the same ways that LLDB will.
>
> As bits of functionality become fully implemented and tested, we will
> migrate LLDB to using the Clang implementations, allowing LLDB to remove
> its own copies. This will provide the Clang community with a higher
> confidence that changes in Clang do not break LLDB, and it will provide
> LLDB with better test coverage of the Clang functionality.
>
> The long-term goal of this library is to provide the implementation for
> what could some day become a defined debugger<->frontend interface for
> providing modularized (maybe even plugin-based) language debugging support
> in LLDB. In the distant future I could see us being able to tell people
> building new frontends that we have a defined interface they need to
> implement for the debugger, and once implemented the debugger should “Just
> Work”.
>
> The first bit of functionality that I would like to build up into the
> ClangDebuggerSupport library is materialization of Clang AST types from
> DWARF. To support this development I intend to add a new tool in Clang that
> reads DWARF types, generates a Clang AST, and prints the AST. I will also
> add DWARF support to obj2yaml and yaml2obj, so we will be able to write
> YAML LIT tests for the functionality.
>
> If people are in favor of this general approach I’ll begin working in this
> direction, and I’ll probably add the new library sometime next month.
>
> Thoughts?
> -Chris
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>