[lldb-dev] lldb: how can I make it reliably scriptable?

Mon Feb 16 14:41:49 PST 2015

Sorry for the length here.  TL;DR: is it possible to reliably script
LLDB non-interactively, or is this a known area of weakness in LLDB and
I should give up until LLDB gets more mature in this area?

We have a script in our test environment to perform a basic triage of
coredumps that happen during testing; when a test detects a core it runs
this script to get some info about the failure: we can use the backtrace
to categorize the core, etc.

On GNU/Linux it uses GDB to do some simple things like print some
important global variable values, show a full backtrace of all threads
(these programs often have 20+ threads active), show the environment,
etc.  We give this little script to GDB with the "-batch -x <file>"
options, and it works perfectly every time.

Now we need to do the same thing for cores generated on Mac OSX (we're
using Xcode 5.1.1 but we've seen these same behaviors for older versions
as well).

At first we were using the old GDB 6.3.50.20050815-cvs which is the last
one supported by Apple.  This works fine if we have a full set of object
files, but if we use dsymutil to generate .dSYM and DON'T have the
object files, this version of GDB can't give us full backtraces; we get
the function names only: no line numbers, no arguments, no ability to
access global variables, etc.  Apparently it can't get full details from
the dSYM files.  However, at least it does what little it does reliably.

So we first tried to get the latest GDB with homebrew and use that, but
it simply does not work at all:

  $ gdb -c core.78225 dist/bin/myprog
  GNU gdb (GDB) 7.8.1
    ..
  "/Users/build/crash/core.78225": no core file handler recognizes format

Building GDB myself from source gives the same result.  So, we turned to
LLDB to try to get the same behavior... and it just does not work
reliably.

First, is there any sort of comprehensive manual for the LLDB CLI?  On
lldb.llvm.org I see a tutorial and a command comparison with GDB, but
nowhere can I find a manual akin to the GDB manual that describes all
the LLDB CLI commands, what they do, etc.

We tried to use the LLDB -s option to provide a script file; very
simple:

  $ cat show.lldb
  target create --core core.78225 dist/bin/myprog
  thread backtrace all
  exit

Now, with some core files this works just fine.  However with other
cores, it fails completely:

  $ lldb --version
  lldb-310.2.37

  $ lldb -s show.lldb
    ..
  (lldb)  thread backtrace all
  error: Aborting reading of commands after command #1: 'thread backtrace all' failed with error: invalid thread
  Aborting after_file command execution, command file: 'show.lldb' failed.

If I run the exact same commands interactively, on the exact same core,
rather than via -s, then it works every time.  But using -s it fails
(for some cores), every time.  Using "-o" multiple times to pass
individual commands instead of "-s" fails the same way.

Even more frustrating is that when we get the above error, LLDB stops
processing the script file and SITS AT THE PROMPT because it never
processes the "exit" command!!  This means our entire test suite hangs
right there.  Redirecting stdin from /dev/null doesn't fix this; it's
even worse (it prints out hundreds of "quit" commands, then hangs).
Obviously not acceptable.  There doesn't seem to be any LLDB equivalent
of GDB's "-batch" flag.
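
Absent a batch flag, one way to at least keep a stuck lldb from hanging
the whole suite is to run it under a watchdog.  The following is only a
sketch, assuming the harness can use Python 3.3 or later; run_batch is
a hypothetical helper name, not part of LLDB.  It does not fix the
"invalid thread" failure, it only bounds how long we wait.

```python
import subprocess


def run_batch(argv, timeout_s=60.0):
    """Run argv and return (timed_out, output).

    The child is killed if it has not exited after timeout_s seconds,
    so a debugger left sitting at its prompt cannot block the caller.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # never inherit the harness's stdin
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # keep errors interleaved with output
        universal_newlines=True,
    )
    try:
        out, _err = proc.communicate(timeout=timeout_s)
        return False, out
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect whatever was printed before the kill.
        out, _err = proc.communicate()
        return True, out
```

For the command file above, the call would be something like
run_batch(["lldb", "-s", "show.lldb"], timeout_s=120); whatever lldb
printed before it got stuck is still returned for triage.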

So then we gave up on the LLDB front-end and tried to write a Python
script to do the debugging that we want.  Even though we just want to
run a couple of simple commands, this took some effort for my colleague
to learn the Python API and ended up with 160 lines of Python, but he
got it working.
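
For reference, the core of such a script can be quite small.  This is a
minimal sketch and not our actual 160-line script: load_core and
format_backtraces are hypothetical names, and it assumes LLDB's Python
module is importable from the interpreter being used.  The SB* calls
are from LLDB's public scripting API.

```python
def format_backtraces(process):
    """Return a text dump of every frame of every thread.

    `process` is expected to behave like lldb.SBProcess; only the
    accessor methods called below are used.
    """
    lines = []
    for t in range(process.GetNumThreads()):
        thread = process.GetThreadAtIndex(t)
        lines.append("thread #%d" % thread.GetIndexID())
        for f in range(thread.GetNumFrames()):
            frame = thread.GetFrameAtIndex(f)
            entry = frame.GetLineEntry()
            where = ""
            if entry.IsValid():  # line info is missing without debug symbols
                where = " at %s:%d" % (
                    entry.GetFileSpec().GetFilename(), entry.GetLine())
            lines.append("  frame #%d: %s%s"
                         % (f, frame.GetFunctionName(), where))
    return "\n".join(lines)


def load_core(exe_path, core_path):
    """Open a core file and return (debugger, process)."""
    import lldb  # imported lazily; requires LLDB's Python bindings

    debugger = lldb.SBDebugger.Create()
    debugger.SetAsync(False)  # make API calls block until they complete
    target = debugger.CreateTarget(exe_path)
    if not target.IsValid():
        raise RuntimeError("cannot create target for %s" % exe_path)
    process = target.LoadCore(core_path)
    if not process.IsValid():
        raise RuntimeError("cannot load core %s" % core_path)
    return debugger, process
```

Usage would be along the lines of
print(format_backtraces(load_core("dist/bin/myprog", "core.78225")[1])).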

Sometimes it works, I should say.

Other times, Python itself aborts ("Abort trap: 6").  Even this is not
so bad, because we can just put the invocation of Python into a loop and
eventually it usually works (unlike the -s option to lldb above, this
one doesn't depend on the core file; the same core file will sometimes
have Python dump core and sometimes not).

However, other times the Python script just hangs forever and never
finishes; again this means the entire automated test suite hangs in this
situation.
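
The retry loop and the hang can be handled in one place.  Below is a
rough sketch of that idea, again assuming Python 3.3+ for the outer
harness; run_with_retries and its status strings are made up for
illustration.  It retries when the child dies (for example on the
abort) and gives each attempt a deadline so a hang is reported instead
of blocking forever.

```python
import subprocess


def run_with_retries(argv, attempts=5, timeout_s=120.0):
    """Run argv up to `attempts` times and return (status, output).

    status is "ok" on the first clean exit; otherwise it describes the
    last attempt: "hung" (deadline exceeded, child killed) or "crashed"
    (non-zero exit, including death by a signal such as SIGABRT).
    """
    status, out = "crashed", ""
    for _attempt in range(attempts):
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        try:
            out, _err = proc.communicate(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            proc.kill()
            out, _err = proc.communicate()
            status = "hung"
            continue
        if proc.returncode == 0:
            return "ok", out
        status = "crashed"
    return status, out
```

The test harness can then record "hung" or "crashed" for the core and
move on to the next test rather than stalling.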

At this point I don't feel like we have any alternative except to tell
people that there's no support for this capability on OSX and they'll
just have to debug all the cores by hand interactively, and we can't do
any automated categorization of cores based on backtraces, etc.

I'm checking the crash reports OSX dumps into
~/Library/Logs/DiagnosticReports, which don't have everything we need
but at least have a stack trace.