[llvm-dev] [RFC] Error handling in LLVM libraries.
Lang Hames via llvm-dev
llvm-dev at lists.llvm.org
Tue Feb 2 22:42:04 PST 2016
Hi Mehdi,
> I’m not sure I understand this claim. You are supposed to be able to
> extend and subclass the type of diagnostics? (I remember doing it for an
> out-of-tree LLVM-based project).
You can subclass diagnostics, but subclassing (on its own) only lets you
change the behaviour of the diagnostic/error itself. What we need, and what
this patch supplies, is a way to choose a particular handler based on the
type of the error. For that you need RTTI, so this patch introduces a new
RTTI scheme that I think is more suitable for error types*, since unlike
LLVM's existing RTTI system it doesn't require you to enumerate the types
up-front.
* If this RTTI system is considered generically useful it could be split
out into its own utility. It's slightly higher cost than LLVM's system: One
byte of BSS per type, and a walk from the dynamic type of the error to the
root of the type-hierarchy (with possible early exit) for each type check.
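To make the cost model above concrete, here is a minimal sketch of an RTTI scheme of this shape. It is not the code from the patch: the names ErrorInfoBase, ErrorInfo, classID and isErrorOfType are placeholders chosen for illustration. Each type gets one zero-initialized static byte whose address serves as its identity, and a type check walks from the dynamic type towards the root, stopping at the first match.

```cpp
#include <cassert>

// Hypothetical root of the hierarchy (placeholder name, not from the patch).
class ErrorInfoBase {
public:
  virtual ~ErrorInfoBase() = default;

  // The address of the per-type ID byte identifies the type.
  static const void *classID() { return &ID; }

  // True if the dynamic type is, or derives from, the type named by ClassID.
  virtual bool isA(const void *ClassID) const { return ClassID == classID(); }

private:
  static char ID; // One byte for the root type.
};
char ErrorInfoBase::ID = 0;

// CRTP helper: every instantiation owns its own ID byte, so error types
// never have to be enumerated up front.
template <typename ThisT, typename ParentT = ErrorInfoBase>
class ErrorInfo : public ParentT {
public:
  static const void *classID() { return &ID; }

  // Check this type first, then walk towards the root (early exit on match).
  bool isA(const void *ClassID) const override {
    return ClassID == classID() || ParentT::isA(ClassID);
  }

private:
  static char ID; // One byte per error type.
};
template <typename ThisT, typename ParentT>
char ErrorInfo<ThisT, ParentT>::ID = 0;

// Type check that does not rely on C++ RTTI.
template <typename T> bool isErrorOfType(const ErrorInfoBase &E) {
  return E.isA(T::classID());
}

// Example hierarchy.
class MyBaseError : public ErrorInfo<MyBaseError> {};
class MyDerivedError : public ErrorInfo<MyDerivedError, MyBaseError> {};
class UnrelatedError : public ErrorInfo<UnrelatedError> {};
```

With these definitions a MyDerivedError answers "yes" for MyDerivedError, MyBaseError and ErrorInfoBase, and "no" for UnrelatedError.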
> What does success or failure mean for the handler?
It gives the handler an opportunity to inspect and then "re-throw" an error
if necessary: A handler might not know whether it can recover based on type
alone, or it may not want to recover at all, but instead attach some
context to provide a richer diagnostic.
As a concrete example, one of our motivating cases is processing object
files in archives. Down in the object file processing code, a load command
might be found to be malformed, but at that point there's no context to
tell us that the object that it's in is part of an archive, so the best
diagnostic we could produce is "In foo.o: malformed load command at index
N". A (straw-man) improved system might look like this:
  class ObjectError ... { // <- Root of all object-file errors
  public:
    std::string ArchiveName = "";
    std::string ObjectName = "";
    std::error_code EC;

    void log(raw_ostream &OS) const override {
      if (!ArchiveName.empty())
        OS << "In archive '" << ArchiveName << "', ";
      OS << "In object file '" << ObjectName << "', " << EC.message();
    }
  };

  TypedError processArchive(Archive &A) {
    for (auto &Obj : A) {
      auto Err = processObject(Obj);
      if (auto E2 =
            catchTypedErrors(std::move(Err),
              handleTypedError<ObjectError>(
                [&](std::unique_ptr<ObjectError> OE) {
                  // Tag the error with its archive, then "re-throw" it.
                  OE->ArchiveName = A.getName();
                  return TypedError(std::move(OE));
                })))
        return E2;
    }
    return TypedError(); // <- Success: no object produced an error.
  }
In this example, any error (whether an ObjectError or something else) will
be intercepted by the 'catchTypedErrors' function. If the error *isn't* an
ObjectError it'll be returned unmodified out of catchTypedErrors,
triggering an immediate return from processArchive. If it *is* an
ObjectError then the handler will be run, giving us an opportunity to tag
the error as having occurred within archive A.
Again - this is a straw-man example: I think we can do better again for
diagnostics of this kind, but it showcases the value of being able to
modify errors while they're in-flight.
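For anyone who wants to play with the "attach context and re-throw" idea without the patch applied, here is a stand-alone sketch. Everything in it is a stand-in: Error is just a std::unique_ptr (null means success), catchObjectError is a hypothetical single-type version of catchTypedErrors, and the isObjectError virtual replaces the custom RTTI check purely to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in error base (hypothetical). A null Error means success.
struct ErrorBase {
  virtual ~ErrorBase() = default;
  virtual bool isObjectError() const { return false; } // Replaces real RTTI.
  virtual std::string message() const = 0;
};
using Error = std::unique_ptr<ErrorBase>;

struct ObjectError : ErrorBase {
  std::string ArchiveName, ObjectName, What;
  ObjectError(std::string Obj, std::string W)
      : ObjectName(std::move(Obj)), What(std::move(W)) {}
  bool isObjectError() const override { return true; }
  std::string message() const override {
    std::string Msg;
    if (!ArchiveName.empty())
      Msg += "In archive '" + ArchiveName + "', ";
    return Msg + "in object file '" + ObjectName + "': " + What;
  }
};

// Runs Handler only if Err holds an ObjectError; success values and other
// error types are passed through unmodified.
template <typename HandlerT>
Error catchObjectError(Error Err, HandlerT Handler) {
  if (!Err || !Err->isObjectError())
    return Err;
  std::unique_ptr<ObjectError> OE(static_cast<ObjectError *>(Err.release()));
  return Handler(std::move(OE));
}

// Low-level code: knows the object name, but not the enclosing archive.
Error processObject(const std::string &Name) {
  if (Name == "bad.o")
    return Error(new ObjectError(Name, "malformed load command"));
  return nullptr;
}

// Higher-level code: adds the archive name to any in-flight ObjectError.
Error processArchive(const std::string &ArchiveName,
                     const std::vector<std::string> &Objects) {
  for (const auto &Obj : Objects) {
    Error Err = catchObjectError(
        processObject(Obj), [&](std::unique_ptr<ObjectError> OE) -> Error {
          OE->ArchiveName = ArchiveName;
          return Error(std::move(OE)); // "Re-throw" with added context.
        });
    if (Err)
      return Err;
  }
  return nullptr; // Success.
}
```

Calling processArchive("libfoo.a", {"ok.o", "bad.o"}) returns an error whose message names both the archive and the object, even though processObject never saw the archive.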
> Is your call to catchAllTypedErrors(…) actually like a switch on the type
of the error? What about a syntax that looks like a switch?
>
> switchErr(std::move(Err))
> .case< MyCustomError>([] () { /* … */ })
> .case< MyOtherCustomError>([] () { /* … */ })
> .default([] () { /* … */ })
It's similar to a switch, but it's actually more like a list of regular C++
exception catch blocks (the name 'catchTypedErrors' is a nod to this).
The big difference is that you're not trying to find the best-matching
handler in the set of options. Instead, the handlers are tried in order
until one is found that fits, then that handler alone is executed. So if
you had the following:
  class MyBaseError : public TypedErrorInfo<MyBaseError> {};
  // MyDerivedError inherits from MyBaseError.
  class MyDerivedError
      : public TypedErrorInfo<MyDerivedError, MyBaseError> {};
and you wrote something like this:
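  catchTypedErrors(std::move(Err),
    handleTypedError<MyBaseError>([&](std::unique_ptr<MyBaseError> B) {
      // Runs for MyBaseError and for anything derived from it.
    }),
    handleTypedError<MyDerivedError>([&](std::unique_ptr<MyDerivedError> D) {
      // Never reached: the handler above already fits.
    })
  );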
The second handler will never run: All 'Derived' errors are 'Base' errors,
the first handler fits, so it's the one that will be run.
We could go for something more like a switch, but then you have to define
the notion of "best fit" for a type, which might be difficult (especially
if I extend this to support multiple inheritance in error hierarchies. ;).
I think it's easier to reason about "first handler that fits".
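The "first handler that fits" rule is easy to try out in isolation. The sketch below is not the patch's implementation: dispatch and handle are hypothetical stand-ins for catchTypedErrors and handleTypedError, errors are passed by reference rather than by unique_ptr, and the type check is a hand-rolled two-level version of the custom RTTI.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Two-level hierarchy with a hand-rolled type check (stand-in for the RTTI).
struct BaseError {
  virtual ~BaseError() = default;
  virtual bool isA(const void *ID) const { return ID == &TypeID; }
  static char TypeID;
};
char BaseError::TypeID = 0;

struct DerivedError : BaseError {
  bool isA(const void *ID) const override {
    return ID == &TypeID || BaseError::isA(ID);
  }
  static char TypeID;
};
char DerivedError::TypeID = 0;

// A handler pairs a target error type with a callable.
template <typename ErrT, typename FnT> struct Handler { FnT Fn; };

template <typename ErrT, typename FnT> Handler<ErrT, FnT> handle(FnT Fn) {
  return {std::move(Fn)};
}

// Base case: no handler fit.
inline bool dispatch(const BaseError &) { return false; }

// Try handlers left to right; run only the first one whose type fits.
template <typename ErrT, typename FnT, typename... Rest>
bool dispatch(const BaseError &E, const Handler<ErrT, FnT> &H,
              const Rest &...Others) {
  if (E.isA(&ErrT::TypeID)) {
    H.Fn(static_cast<const ErrT &>(E));
    return true;
  }
  return dispatch(E, Others...);
}

// Base handler listed first: it shadows the derived handler.
std::string baseFirst(const BaseError &E) {
  std::string Ran = "none";
  dispatch(E,
           handle<BaseError>([&](const BaseError &) { Ran = "base"; }),
           handle<DerivedError>([&](const DerivedError &) { Ran = "derived"; }));
  return Ran;
}

// Derived handler listed first: both handlers are reachable.
std::string derivedFirst(const BaseError &E) {
  std::string Ran = "none";
  dispatch(E,
           handle<DerivedError>([&](const DerivedError &) { Ran = "derived"; }),
           handle<BaseError>([&](const BaseError &) { Ran = "base"; }));
  return Ran;
}
```

As with C++ catch blocks, the practical rule is to list handlers for the most derived types first: baseFirst reports "base" even for a DerivedError, while derivedFirst reports "derived".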
Cheers,
Lang.
On Tue, Feb 2, 2016 at 6:33 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
> Hi Lang,
>
> I’m glad someone is tackling this long-lived issue :)
> I’ve started to think about it recently but didn’t get as far as you did!
>
> On Feb 2, 2016, at 5:29 PM, Lang Hames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi All,
>
> I've been thinking lately about how to improve LLVM's error model and
> error reporting. A lack of good error reporting in Orc and MCJIT has forced
> me to spend a lot of time investigating hard-to-debug errors that could
> easily have been identified if we provided richer error information to the
> client, rather than just aborting. Kevin Enderby has made similar
> observations about the state of libObject and the difficulty of producing
> good error messages for damaged object files. I expect to encounter more
> issues like this as I continue work on the MachO side of LLD. I see
> tackling the error modeling problem as a first step towards improving error
> handling in general: if we make it easy to model errors, it may pave the
> way for better error handling in many parts of our libraries.
>
> At present in LLVM we model errors with std::error_code (and its helper,
> ErrorOr) and use diagnostic streams for error reporting. Neither of these
> seems entirely up to the job of providing a solid error-handling mechanism
> for library code. Diagnostic streams are great if all you want to do is
> report failure to the user and then terminate, but they can't be used to
> distinguish between different kinds of errors
>
>
> I’m not sure I understand this claim. You are supposed to be able to
> extend and subclass the type of diagnostics? (I remember doing it for an
> out-of-tree LLVM-based project).
>
>
> , and so are unsuited to many use-cases (especially error recovery). On
> the other hand, std::error_code allows error kinds to be distinguished, but
> suffers a number of drawbacks:
>
> 1. It carries no context: It tells you what went wrong, but not where or
> why, making it difficult to produce good diagnostics.
> 2. It's extremely easy to ignore or forget: instances can be silently
> dropped.
> 3. It's not especially debugger friendly: Most people call the error_code
> constructors directly for both success and failure values. Breakpoints have
> to be set carefully to avoid stopping when success values are constructed.
>
> In fairness to std::error_code, it has some nice properties too:
>
> 1. It's extremely lightweight.
> 2. It's explicit in the API (unlike exceptions).
> 3. It doesn't require C++ RTTI (a requirement for use in LLVM).
>
> To address these shortcomings I have prototyped a new error-handling
> scheme partially inspired by C++ exceptions. The aim was to preserve the
> performance and API visibility of std::error_code, while allowing users to
> define custom error classes and inheritance relationships between them. My
> hope is that library code could use this scheme to model errors in a
> meaningful way, allowing clients to inspect the error information and
> recover where possible, or provide a rich diagnostic when aborting.
>
> The scheme has three major "moving parts":
>
> 1. A new 'TypedError' class that can be used as a replacement for
> std::error_code. E.g.
>
> std::error_code foo();
>
> becomes
>
> TypedError foo();
>
> The TypedError class serves as a lightweight wrapper for the real error
> information (see (2)). It also contains a 'Checked' flag, initially set to
> false, that tracks whether the error has been handled or not. If a
> TypedError is ever destructed without being checked (or passed on to
> someone else) it will call std::terminate(). TypedError cannot be silently
> dropped.
>
>
> I really like the fact that not checking the error triggers an error (this
> is the “hard to misuse” part of API design IMO).
> You don’t mention it, but I’d rather see this “checked” flag compiled out
> with NDEBUG.
>
>
> 2. A utility class, TypedErrorInfo, for building error class hierarchies
> rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.
>
> // Define a new error type implicitly inheriting from TypedErrorInfoBase.
> class MyCustomError : public TypedErrorInfo<MyCustomError> {
> public:
>   // Custom error info.
> };
>
> // Define a subclass of MyCustomError.
> class MyCustomSubError
>     : public TypedErrorInfo<MyCustomSubError, MyCustomError> {
> public:
>   // Extends MyCustomError, adds new members.
> };
>
> 3. A set of utility functions that use the custom RTTI system to inspect
> and handle typed errors. For example 'catchAllTypedErrors' and
> 'handleTypedError' cooperate to handle error instances in a type-safe way:
>
> TypedError foo() {
>   if (SomeFailureCondition)
>     return make_typed_error<MyCustomError>();
>   return TypedError(); // <- Success.
> }
>
> TypedError Err = foo();
>
> catchAllTypedErrors(std::move(Err),
> handleTypedError<MyCustomError>(
> [](std::unique_ptr<MyCustomError> E) {
> // Handle the error.
> return TypedError(); // <- Indicate success from handler.
>
>
> What does success or failure mean for the handler?
>
>
> }
> )
> );
>
>
> If your initial reaction is "Too much boilerplate!" I understand, but take
> comfort: (1) In the overwhelmingly common case of simply returning errors,
> the usage is identical to std::error_code:
>
> if (TypedError Err = foo())
>   return Err;
>
> and (2) the boilerplate for catching errors is usually easily contained in
> a handful of utility functions, and tends not to crowd the rest of your
> source code. My initial experiments with this scheme involved updating many
> source lines, but did not add much code at all beyond the new error classes
> that were introduced.
>
>
> I believe that this scheme addresses many of the shortcomings of
> std::error_code while maintaining the strengths:
>
> 1. Context - Custom error classes enable the user to attach as much
> contextual information as desired.
>
> 2. Difficult to drop - The 'checked' flag in TypedError ensures that it
> can't be dropped, it must be explicitly "handled", even if that only
> involves catching the error and doing nothing.
>
> 3. Debugger friendly - You can set a breakpoint on any custom error
> class's constructor to catch that error being created. Since the error
> class hierarchy is rooted you can break on
> TypedErrorInfoBase::TypedErrorInfoBase to catch any error being raised.
>
> 4. Lightweight - Because TypedError instances are just a pointer and a
> checked-bit, move-constructing them is very cheap. We may also want to
> consider ignoring the 'checked' bit in release mode, at which point
> TypedError should be as cheap as std::error_code.
>
>
> Oh here you mention compiling out the “checked” flag :)
>
>
> 5. Explicit - TypedError is represented explicitly in the APIs, the same
> as std::error_code.
>
> 6. Does not require C++ RTTI - The custom RTTI system does not rely on any
> standard C++ RTTI features.
>
> This scheme also has one attribute that I haven't seen in previous error
> handling systems (though my experience in this area is limited): Errors are
> not copyable, due to ownership semantics of TypedError. I think this
> actually neatly captures the idea that there is a chain of responsibility
> for dealing with any given error. Responsibility may be transferred (e.g.
> by returning it to a caller), but it cannot be duplicated as it doesn't
> generally make sense for multiple people to report or attempt to recover
> from the same error.
>
> I've tested this prototype out by threading it through the object-creation
> APIs of libObject and using custom error classes to report errors in MachO
> headers. My initial experience is that this has enabled much richer error
> messages than are possible with std::error_code.
>
> To enable interaction with APIs that still use std::error_code I have
> added a custom ECError class that wraps a std::error_code, and can be
> converted back to a std::error_code using the typedErrorToErrorCode
> function. For now, all custom error code classes should (and do, in the
> prototype) derive from this utility class. In my experiments, this has made
> it easy to thread TypedError selectively through parts of the API.
> Eventually my hope is that TypedError could replace std::error_code for
> user-facing APIs, at which point custom errors would no longer need to
> derive from ECError, and ECError could be relegated to a utility for
> interacting with other codebases that still use std::error_code.
>
> So - I look forward to hearing your thoughts. :)
>
>
> Is your call to catchAllTypedErrors(…) actually like a switch on the type
> of the error? What about a syntax that looks like a switch?
>
> switchErr(std::move(Err))
> .case< MyCustomError>([] () { /* … */ })
> .case< MyOtherCustomError>([] () { /* … */ })
> .default([] () { /* … */ })
>
>
> —
> Mehdi
>
>
> Cheers,
> Lang.
>
> Attached files:
>
> typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
> anchor() method to lib/Support/ErrorHandling.cpp).
>
> error_demo.tgz - Stand-alone program demo'ing basic use of the TypedError
> API.
>
> libobject_typed_error_demo.patch - Threads TypedError through the
> binary-file creation methods (createBinary, createObjectFile, etc).
> Proof-of-concept for how TypedError can be integrated into an existing
> system.
>
> <typed_error.patch><error_demo.tgz>
> <thread_typederror_through_object_creation.patch>