[llvm-dev] [RFC] Error handling in LLVM libraries.

Tue Feb 2 17:29:45 PST 2016

Hi All,

I've been thinking lately about how to improve LLVM's error model and error
reporting. A lack of good error reporting in Orc and MCJIT has forced me to
spend a lot of time investigating hard-to-debug errors that could easily
have been identified if we provided richer error information to the client,
rather than just aborting. Kevin Enderby has made similar observations
about the state of libObject and the difficulty of producing good error
messages for damaged object files. I expect to encounter more issues like
this as I continue work on the MachO side of LLD. I see tackling the error
modeling problem as a first step towards improving error handling in
general: if we make it easy to model errors, it may pave the way for better
error handling in many parts of our libraries.

At present in LLVM we model errors with std::error_code (and its helper,
ErrorOr) and use diagnostic streams for error reporting. Neither of these
seem entirely up to the job of providing a solid error-handling mechanism
for library code. Diagnostic streams are great if all you want to do is
report failure to the user and then terminate, but they can't be used to
distinguish between different kinds of errors, and so are unsuited to many
use-cases (especially error recovery). On the other hand, std::error_code
allows error kinds to be distinguished, but suffers a number of drawbacks:

1. It carries no context: It tells you what went wrong, but not where or
why, making it difficult to produce good diagnostics.
2. It's extremely easy to ignore or forget: instances can be silently
dropped.
3. It's not especially debugger friendly: Most people call the error_code
constructors directly for both success and failure values. Breakpoints have
to be set carefully to avoid stopping when success values are constructed.

In fairness to std::error_code, it has some nice properties too:

1. It's extremely lightweight.
2. It's explicit in the API (unlike exceptions).
3. It doesn't require C++ RTTI (a requirement for use in LLVM).

To address these shortcomings I have prototyped a new error-handling scheme
partially inspired by C++ exceptions. The aim was to preserve the
performance and API visibility of std::error_code, while allowing users to
define custom error classes and inheritance relationships between them. My
hope is that library code could use this scheme to model errors in a
meaningful way, allowing clients to inspect the error information and
recover where possible, or provide a rich diagnostic when aborting.

The scheme has three major "moving parts":

1. A new 'TypedError' class that can be used as a replacement for
std::error_code. E.g.

std::error_code foo();

becomes

TypedError foo();

The TypedError class serves as a lightweight wrapper for the real error
information (see (2)). It also contains a 'Checked' flag, initially set to
false, that tracks whether the error has been handled or not. If a
TypedError is ever destructed without being checked (or passed on to
someone else) it will call std::terminate(). TypedError cannot be silently
dropped.

2. A utility class, TypedErrorInfo, for building error class hierarchies
rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.

// Define a new error type implicitly inheriting from TypedErrorInfoBase.
class MyCustomError : public TypedErrorInfo<MyCustomError> {
public:
  // Custom error info.
};

// Define a subclass of MyCustomError.
class MyCustomSubError : public TypedErrorInfo<MyCustomSubError,
MyCustomError> {
public:
  // Extends MyCustomError, adds new members.
};

3.  A set of utility functions that use the custom RTTI system to inspect
and handle typed errors. For example 'catchAllTypedErrors' and
'handleTypedError' cooperate to handle error instances in a type-safe way:

TypedError foo() {
  if (SomeFailureCondition)
    return make_typed_error<MyCustomError>();
}

TypedError Err = foo();

catchAllTypedErrors(std::move(Err),
  handleTypedError<MyCustomError>(
    [](std::unique_ptr<MyCustomError> E) {
      // Handle the error.
      return TypedError(); // <- Indicate success from handler.
    }
  )
);

If your initial reaction is "Too much boilerplate!" I understand, but take
comfort: (1) In the overwhelmingly common case of simply returning errors,
the usage is identical to std::error_code:

if (TypedError Err = foo())
  return Err;

and (2) the boilerplate for catching errors is usually easily contained in
a handful of utility functions, and tends not to crowd the rest of your
source code. My initial experiments with this scheme involved updating many
source lines, but did not add much code at all beyond the new error classes
that were introduced.

I believe that this scheme addresses many of the shortcomings of
std::error_code while maintaining the strengths:

1. Context - Custom error classes enable the user to attach as much
contextual information as desired.

2. Difficult to drop - The 'checked' flag in TypedError ensures that it
can't be dropped, it must be explicitly "handled", even if that only
involves catching the error and doing nothing.

3. Debugger friendly - You can set a breakpoint on any custom error class's
constructor to catch that error being created. Since the error class
hierarchy is rooted you can break on TypedErrorInfoBase::TypedErrorInfoBase
to catch any error being raised.

4. Lightweight - Because TypedError instances are just a pointer and a
checked-bit, move-constructing it is very cheap. We may also want to
consider ignoring the 'checked' bit in release mode, at which point
TypedError should be as cheap as std::error_code.

5. Explicit - TypedError is represented explicitly in the APIs, the same as
std::error_code.

6. Does not require C++ RTTI - The custom RTTI system does not rely on any
standard C++ RTTI features.

This scheme also has one attribute that I haven't seen in previous error
handling systems (though my experience in this area is limited): Errors are
not copyable, due to ownership semantics of TypedError. I think this
actually neatly captures the idea that there is a chain of responsibility
for dealing with any given error. Responsibility may be transferred (e.g.
by returning it to a caller), but it cannot be duplicated as it doesn't
generally make sense for multiple people to report or attempt to recover
from the same error.

I've tested this prototype out by threading it through the object-creation
APIs of libObject and using custom error classes to report errors in MachO
headers. My initial experience is that this has enabled much richer error
messages than are possible with std::error_code.

To enable interaction with APIs that still use std::error_code I have added
a custom ECError class that wraps a std::error_code, and can be converted
back to a std::error_code using the typedErrorToErrorCode function. For
now, all custom error code classes should (and do, in the prototype) derive
from this utility class. In my experiments, this has made it easy to thread
TypedError selectively through parts of the API. Eventually my hope is that
TypedError could replace std::error_code for user-facing APIs, at which
point custom errors would no longer need to derive from ECError, and
ECError could be relegated to a utility for interacting with other
codebases that still use std::error_code.

So - I look forward to hearing your thoughts. :)

Cheers,
Lang.

Attached files:

typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
anchor() method to lib/Support/ErrorHandling.cpp).

error_demo.tgz - Stand-alone program demo'ing basic use of the TypedError
API.

libobject_typed_error_demo.patch - Threads TypedError through the
binary-file creation methods (createBinary, createObjectFile, etc).
Proof-of-concept for how TypedError can be integrated into an existing
system.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160202/d9a1ecba/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: typed_error.patch
Type: application/octet-stream
Size: 22737 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160202/d9a1ecba/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: error_demo.tgz
Type: application/x-gzip
Size: 1285 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160202/d9a1ecba/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thread_typederror_through_object_creation.patch
Type: application/octet-stream
Size: 104511 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160202/d9a1ecba/attachment-0003.obj>