[llvm-dev] Status of debuginfo-tests

Fri Sep 8 14:57:17 PDT 2017

You are making a pretty strong case.

There's a tactic used in some test subtrees that could help with the source-code-sharing part, which is to keep the shared stuff in an Inputs subdirectory and then have a lit test script in the parent directory with a .test suffix.  This lets you do stuff like:

$ cat ./Inputs/foo.c
int main() {}
$ cat ./foo-1.test
REQUIRES: win32
RUN: %clangcl /whatever %S/Inputs/foo.c
$ cat ./foo-2.test
UNSUPPORTED: win32
RUN: %clang -gdwarf-2 %S/Inputs/foo.c

and so forth, so the actual C/C++ source is shared but the scripting can be tuned as needed.  There's a trivial amount of lit magic to make lit ignore the Inputs subdirectory and run files with .test suffixes.  This should make it easier to have separate Windows and gdb-like scripting.  In fact you could even put the scripts in debugger-specific directories and work out lit magic to run only the relevant subdirectory's tests, which lets you avoid repeating the REQUIRES/UNSUPPORTED stuff everywhere.  We do architecture-specific test subdirectories this way, for example.
--paulr
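
A minimal sketch of the "lit magic" mentioned above, written as a lit.local.cfg (which lit executes as Python with a predefined `config` object). The `config.suffixes`, `config.excludes`, `config.unsupported` and `config.available_features` attributes are standard lit; the `windbg` feature name is a hypothetical placeholder, and the `SimpleNamespace` fallback is only there so the fragment also runs outside lit.

```python
from types import SimpleNamespace

# lit runs lit.local.cfg as Python with `config` already defined. The
# fallback below only exists so this sketch also runs outside lit.
try:
    config
except NameError:
    config = SimpleNamespace(available_features=set(), unsupported=False)

# Treat only *.test files in this directory as tests.
config.suffixes = ['.test']

# Never descend into Inputs/, so the shared sources are not run as tests.
config.excludes = ['Inputs']

# In a debugger-specific subdirectory, skip everything unless the matching
# feature is available. 'windbg' is a hypothetical feature name.
if 'windbg' not in config.available_features:
    config.unsupported = True
```

In practice the first two settings would probably live in the parent directory's config and the last check in each debugger-specific subdirectory's lit.local.cfg, so individual tests need no REQUIRES/UNSUPPORTED lines.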

From: Zachary Turner [mailto:zturner at google.com]
Sent: Friday, September 08, 2017 11:37 AM
To: Robinson, Paul; Adrian Prantl
Cc: David Blaikie; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Status of debuginfo-tests

On Fri, Sep 8, 2017 at 10:46 AM Robinson, Paul <paul.robinson at sony.com> wrote:
Let me say up front that I sympathize deeply with the problem; debug
info is an interface, and it is frequently unclear whether the goal of
some bit of work is to test the producer or test the consumer of the
interface.  In fact we end up using the producer to test the consumer,
and (in the case at hand) using the consumer to test the producer.
There are distinct analogies to testing compilers by seeing what the
linker thinks, and testing linkers by seeing whether they can handle
what the compiler produces.

> I understand the desire to keep them as similar as possible, but I'm
> still not really sold that massaging fickle text output into a different
> text format is going to make things more scalable.  I'd like there to be
> as few layers of text processing as possible.

If the text output is fickle, then I'd think hiding the fickleness behind
a wrapper we control would be preferable to updating dozens of tests when
something changes.  Or if a debugger changes its presentation in version
N+1, but people are still running tests with version N, persuading the
wrapper to handle both would be less overall work than making every test
accommodate both formats.
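
As a rough illustration of the wrapper idea (not something proposed in this thread), the sketch below hides two stack-frame presentations behind one function, assuming frame lines shaped like WinDbg `kv` output and GDB backtrace output. The name `frame_symbols` and both regular expressions are hypothetical.

```python
import re

# WinDbg `kv` style frame (assumed shape):
#   "00 0198fa4c 77a2f5ca 55fe0b87 00000000 00000000 ntdll!Func+0x2b (FPO: ...)"
KV_FRAME = re.compile(
    r"^[0-9a-f]{2}\s+(?:[0-9a-f]{8}\s+)+(?P<sym>\S+?)(?:\+0x[0-9a-f]+)?(?:\s|$)"
)

# GDB backtrace style frame (assumed shape):
#   "#1  0x6e38 in expand_macro (sym=0x2b600) at macro.c:242"
GDB_FRAME = re.compile(
    r"^#\d+\s+(?:0x[0-9a-f]+\s+in\s+)?(?P<sym>[^\s(]+)\s*\("
)


def frame_symbols(debugger_output):
    """Return the symbol of each recognizable stack frame, in order."""
    symbols = []
    for line in debugger_output.splitlines():
        line = line.strip()
        # Try each known presentation; unrecognized lines are skipped.
        for pattern in (KV_FRAME, GDB_FRAME):
            match = pattern.match(line)
            if match:
                symbols.append(match.group("sym"))
                break
    return symbols
```

A test would then check the returned symbol list rather than the raw text. Supporting a new presentation means adding one pattern to the wrapper, at the cost of the test no longer seeing the debugger's original output.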
But that's just my point.  There are clearly going to be tests where both formats don't even make sense because it's testing something specific to one debugger.  What if I want to test that we output correct exception information, so I send a .exr command to the debugger and get back this:

0:000> .exr -1
ExceptionAddress: 77a6db8b (ntdll!LdrpDoDebuggerBreak+0x0000002b)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 00000000

What if I want to test that the debugger can print a valid stack trace, so I send a kv command and get back this?

 # ChildEBP RetAddr  Args to Child
00 0198fa4c 77a2f5ca 55fe0b87 00000000 00000000 ntdll!LdrpDoDebuggerBreak+0x2b (FPO: [Non-Fpo])
01 0198fc8c 77a18a42 55fe0bef 00000000 00000000 ntdll!LdrpInitializeProcess+0x1967 (FPO: [Non-Fpo])
02 0198fce4 77a1886c 00000000 bad81aba 00000000 ntdll!_LdrpInitialize+0x180 (FPO: [Non-Fpo])
03 0198fcf4 00000000 0198fd08 779c0000 00000000 ntdll!LdrInitializeThunk+0x1c (FPO: [Non-Fpo])

Whereas GDB would print something like

#0  m4_traceon (obs=0x24eb0, argc=1, argv=0x2b8c8)
    at builtin.c:993
#1  0x6e38 in expand_macro (sym=0x2b600) at macro.c:242
#2  0x6840 in expand_token (obs=0x0, t=177664, td=0xf7fffb08)
    at macro.c:71
(More stack frames follow...)

I really don't want to get in the business of trying to convert the first format into the second format.  Not only is it a recipe for disaster, but it leads to worse diagnostics.  When my CHECK statement fails, I can't even see the original stack trace anymore, only a generic error message like "could not parse stack trace".

> I also expect that on Windows we will end up having far more debug info
> tests than other platforms, specifically because we don't have the
> ability to write tests against the debugger (as it's proprietary /
> closed source).

I don't see how that follows. Sony runs the GDB test suite using clang as
the compiler, and while that is certainly perverting a debugger test suite
into being a compiler test suite, it has value in being a body of tests
that exercise a variety of debug-info features.  GDB being open source is
completely irrelevant to this use-case.  We treat it as closed.  We have
local changes to the test suite, but not to GDB.  The expected results
from the suite are based on GDB+GCC, which we treat as an oracle; then we
don't bother with tests of debugger features that clearly don't depend on
debug info, such as thread handling.
GDB being open source is very relevant to this case, because it means you *have* GDB's test suite.  We don't have WinDbg or Visual Studio debugger's test suite.

Whatever CodeView/PDB tests you want to write, you can use MS tools as
your oracle.  Maybe you can't leverage an existing test suite, but it
doesn't mean you can't write tests.
Right, it just means we will end up writing plenty of tests that test specific features of the debugger, something that would normally be handled in a debugger test suite, which we don't have.

> I don't see a useful abstraction that glosses over these differences
> that isn't a ton of work for minimal gain, given the frequency with
> which we'd need to fall back to a custom test anyway.

As I mentioned above, you seem to be heading in the direction of a
completely separate project, rather than being able to usefully
leverage anything from debuginfo-tests other than the basic idea.
--paulr

I don't entirely disagree with this assessment.  On the other hand, I don't see any reason to call it something other than "debuginfo-tests" or to put it somewhere else, since conceptually both things are the same.

Even in this case though, reusing the source code of the tests seems like a clear win since the high level ideas behind a test case often transcend consumer boundaries, even when the implementation doesn't.

Plus, there is more to be gained from sharing than just the tests themselves.  For example, I'm trying to get debuginfo-tests working properly with CMake in more idiomatic LLVM style.  If I go off and fork the tests into an entirely separate project, we wouldn't have that shared benefit.