[llvm-dev] [RFC] Error handling in LLVM libraries.

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 2 17:55:40 PST 2016


----- Original Message -----
> From: "Lang Hames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "LLVM Developers Mailing List" <llvm-dev at lists.llvm.org>
> Sent: Tuesday, February 2, 2016 7:29:45 PM
> Subject: [llvm-dev] [RFC] Error handling in LLVM libraries.
> 
> 
> 
> Hi All,
> 
> 
> I've been thinking lately about how to improve LLVM's error model and
> error reporting. A lack of good error reporting in Orc and MCJIT has
> forced me to spend a lot of time investigating hard-to-debug errors
> that could easily have been identified if we provided richer error
> information to the client, rather than just aborting. Kevin Enderby
> has made similar observations about the state of libObject and the
> difficulty of producing good error messages for damaged object
> files. I expect to encounter more issues like this as I continue
> work on the MachO side of LLD. I see tackling the error modeling
> problem as a first step towards improving error handling in general:
> if we make it easy to model errors, it may pave the way for better
> error handling in many parts of our libraries.
> 
> 
> At present in LLVM we model errors with std::error_code (and its
> helper, ErrorOr) and use diagnostic streams for error reporting.
> Neither of these seem entirely up to the job of providing a solid
> error-handling mechanism for library code. Diagnostic streams are
> great if all you want to do is report failure to the user and then
> terminate, but they can't be used to distinguish between different
> kinds of errors, and so are unsuited to many use-cases (especially
> error recovery). On the other hand, std::error_code allows error
> kinds to be distinguished, but suffers a number of drawbacks:
> 
> 
> 1. It carries no context: It tells you what went wrong, but not where
> or why, making it difficult to produce good diagnostics.

Generically, I like this idea.

However, regarding context, I wonder about the best model. When we designed the diagnostic reporting interface (by which I mean the bits in include/llvm/IR/DiagnosticInfo.h and include/llvm/IR/DiagnosticPrinter.h), the ability to carry context was very important. There, however, because the objects are being passed via a callback to the user-installed handler, they can carry pointers/references to objects (Values, Functions, etc.) that will go away once the object that detected the error is destroyed. In the model you're proposing, all of the context must be contained within the error object itself (because, by the time the context is useful, an arbitrary amount of the call stack to the error-detection point has already been unwound). This greatly limits the amount of information that can be efficiently stored as context in the error object. Depending on the use cases, it might be better to pass the context to some kind of error-handler callback than to try to pack it all into a long-lived error object. Thoughts?

Thanks again,
Hal

> 2. It's extremely easy to ignore or forget: instances can be silently
> dropped.
> 3. It's not especially debugger friendly: Most people call the
> error_code constructors directly for both success and failure
> values. Breakpoints have to be set carefully to avoid stopping when
> success values are constructed.
> 
> 
> In fairness to std::error_code, it has some nice properties too:
> 
> 
> 1. It's extremely lightweight.
> 2. It's explicit in the API (unlike exceptions).
> 3. It doesn't require C++ RTTI (a requirement for use in LLVM).
> 
> 
> To address these shortcomings I have prototyped a new error-handling
> scheme partially inspired by C++ exceptions. The aim was to preserve
> the performance and API visibility of std::error_code, while
> allowing users to define custom error classes and inheritance
> relationships between them. My hope is that library code could use
> this scheme to model errors in a meaningful way, allowing clients to
> inspect the error information and recover where possible, or provide
> a rich diagnostic when aborting.
> 
> 
> The scheme has three major "moving parts":
> 
> 
> 1. A new 'TypedError' class that can be used as a replacement for
> std::error_code. E.g.
> 
> 
> std::error_code foo();
> 
> 
> becomes
> 
> 
> TypedError foo();
> 
> 
> The TypedError class serves as a lightweight wrapper for the real
> error information (see (2)). It also contains a 'Checked' flag,
> initially set to false, that tracks whether the error has been
> handled or not. If a TypedError is ever destructed without being
> checked (or passed on to someone else) it will call
> std::terminate(). TypedError cannot be silently dropped.
> 
> 
> 2. A utility class, TypedErrorInfo, for building error class
> hierarchies rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.
> 
> 
> // Define a new error type implicitly inheriting from
> TypedErrorInfoBase.
> class MyCustomError : public TypedErrorInfo<MyCustomError> {
> public:
> // Custom error info.
> };
> 
> 
> // Define a subclass of MyCustomError.
> class MyCustomSubError : public TypedErrorInfo<MyCustomSubError,
> MyCustomError> {
> public:
> // Extends MyCustomError, adds new members.
> };
> 
> 
> 3. A set of utility functions that use the custom RTTI system to
> inspect and handle typed errors. For example 'catchAllTypedErrors'
> and 'handleTypedError' cooperate to handle error instances in a
> type-safe way:
> 
> 
> TypedError foo() {
> if (SomeFailureCondition)
> return make_typed_error<MyCustomError>();
> }
> 
> 
> TypedError Err = foo();
> 
> 
> catchAllTypedErrors(std::move(Err),
> handleTypedError<MyCustomError>(
> [](std::unique_ptr<MyCustomError> E) {
> // Handle the error.
> return TypedError(); // <- Indicate success from handler.
> }
> )
> );
> 
> 
> 
> 
> If your initial reaction is "Too much boilerplate!" I understand, but
> take comfort: (1) In the overwhelmingly common case of simply
> returning errors, the usage is identical to std::error_code:
> 
> 
> if (TypedError Err = foo())
> return Err;
> 
> 
> and (2) the boilerplate for catching errors is usually easily
> contained in a handful of utility functions, and tends not to crowd
> the rest of your source code. My initial experiments with this
> scheme involved updating many source lines, but did not add much
> code at all beyond the new error classes that were introduced.
> 
> 
> 
> 
> I believe that this scheme addresses many of the shortcomings of
> std::error_code while maintaining the strengths:
> 
> 
> 1. Context - Custom error classes enable the user to attach as much
> contextual information as desired.
> 
> 
> 2. Difficult to drop - The 'checked' flag in TypedError ensures that
> it can't be dropped, it must be explicitly "handled", even if that
> only involves catching the error and doing nothing.
> 
> 
> 3. Debugger friendly - You can set a breakpoint on any custom error
> class's constructor to catch that error being created. Since the
> error class hierarchy is rooted you can break on
> TypedErrorInfoBase::TypedErrorInfoBase to catch any error being
> raised.
> 
> 
> 4. Lightweight - Because TypedError instances are just a pointer and
> a checked-bit, move-constructing it is very cheap. We may also want
> to consider ignoring the 'checked' bit in release mode, at which
> point TypedError should be as cheap as std::error_code.
> 
> 
> 5. Explicit - TypedError is represented explicitly in the APIs, the
> same as std::error_code.
> 
> 
> 6. Does not require C++ RTTI - The custom RTTI system does not rely
> on any standard C++ RTTI features.
> 
> 
> This scheme also has one attribute that I haven't seen in previous
> error handling systems (though my experience in this area is
> limited): Errors are not copyable, due to ownership semantics of
> TypedError. I think this actually neatly captures the idea that
> there is a chain of responsibility for dealing with any given error.
> Responsibility may be transferred (e.g. by returning it to a
> caller), but it cannot be duplicated as it doesn't generally make
> sense for multiple people to report or attempt to recover from the
> same error.
> 
> 
> I've tested this prototype out by threading it through the
> object-creation APIs of libObject and using custom error classes to
> report errors in MachO headers. My initial experience is that this
> has enabled much richer error messages than are possible with
> std::error_code.
> 
> 
> To enable interaction with APIs that still use std::error_code I have
> added a custom ECError class that wraps a std::error_code, and can
> be converted back to a std::error_code using the
> typedErrorToErrorCode function. For now, all custom error code
> classes should (and do, in the prototype) derive from this utility
> class. In my experiments, this has made it easy to thread TypedError
> selectively through parts of the API. Eventually my hope is that
> TypedError could replace std::error_code for user-facing APIs, at
> which point custom errors would no longer need to derive from
> ECError, and ECError could be relegated to a utility for interacting
> with other codebases that still use std::error_code.
> 
> 
> So - I look forward to hearing your thoughts. :)
> 
> 
> Cheers,
> Lang.
> 
> 
> 
> Attached files:
> 
> 
> 
> typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
> anchor() method to lib/Support/ErrorHandling.cpp).
> 
> 
> error_demo.tgz - Stand-alone program demo'ing basic use of the
> TypedError API.
> 
> 
> libobject_typed_error_demo.patch - Threads TypedError through the
> binary-file creation methods (createBinary, createObjectFile, etc).
> Proof-of-concept for how TypedError can be integrated into an
> existing system.
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


More information about the llvm-dev mailing list