[llvm-dev] [RFC] Error handling in LLVM libraries.

Tue Feb 2 23:37:31 PST 2016

> The new RTTI system uses something closer to LLVM's Pass IDs...

I should stress that all of this is totally opaque to clients: all you have
to do to define your own error type is extend the TypedErrorInfo template,
as below for example, and it will take care of the RTTI for you:

// A minimal new error class:
class MyError : public TypedErrorInfo<MyError> {
};

// Subclassing 'MyError':
class MySubError : public TypedErrorInfo<MySubError, MyError> {
};

- Lang.

On Tue, Feb 2, 2016 at 11:33 PM, Lang Hames <lhames at gmail.com> wrote:

> Hi Mehdi,
>
> > If you subclass a diagnostic right now, isn’t the RTTI information
> available to the handler, which can then achieve the same dispatching /
> custom handling per type of diagnostic?
> > (I’m not advocating the diagnostic system, which I found less convenient
> to use than what you are proposing)
>
> I have to confess I haven't looked at the diagnostic classes closely. I'll
> take a look and get back to you on this one. :)
>
> > > For that you need RTTI, so this patch introduces a new RTTI scheme
> that I think is more suitable for errors types*, since unlike LLVM's
> existing RTTI system it doesn't require you to enumerate the types up-front.
> > It looks like I’m missing a piece of something as it is not clear why is
> this strictly needed. I may have to actually look closer at the code itself.
>
> LLVM's RTTI requires you to define an enum value for each type in your
> hierarchy (see http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html), which
> means you need to know about all your potential subclasses up-front. That's
> not an option for a generic error class that might be subclassed in
> arbitrary ways.
>
> The new RTTI system uses something closer to LLVM's Pass IDs:
> TypedErrorInfo (a utility which all errors must inherit from) introduces a
> new static char for each error type and uses its address as an ID. When you
> ask an error value "are you a subclass of type T" (via the isSameOrSubClass
> method) the call goes to the parent TypedErrorInfo object, which compares
> T's ID with its own. If it matches it returns true, if it doesn't match
> then the call gets forwarded to the parent class, then its parent class,
> and so on. If you hit the root of the type-tree (i.e. TypedErrorInfoBase)
> without matching the ID, then you weren't a subclass of T.
>
> > Sure, this case shows “success” of the handler, now what is a failure
> of the handler and how is it handled?
>
> Sorry - that was a bad example to choose: That was actually showcasing
> failure, not success. Success looks like this:
>
> TypedError bar() {
>   TypedError Err = foo;
>   if (auto E2 =
>     catchTypedErrors(std::move(Err),
>       handleTypedError<MyError>([&](std::unique_ptr<MyError> M) {
>         // Deal with 'M' somehow.
>         return TypedError();
>       }))
>     return E2;
>
>   // Proceed with 'bar' if 'Err' is handled.
> }
>
> A key observation is that catchTypedErrors itself returns an error. It has
> to, because you may not have provided it with an exhaustive list of
> handlers, and it needs a way to return unhanded errors. So: If no handler
> gets invoked, catchTypedErrors will just return 'Err' again. If 'Err' was
> an error, then E2 will also be an error, and you'll immediately return from
> 'bar', passing responsibility for the error up the stack. So far so good.
> Now consider what we should do if, instead, we *did* invoke a handler. One
> option would be to say that if a handler gets invoked then catchTypedErrors
> always returns 'TypedError()', indicating success, but that's an assertion
> that any error that's caught is definitely resolved. Sadly, we can't rely
> on that, so instead we allow the handler to supply the return value for
> catchTypedErrors. If the handler supplies 'TypedError()' then the error is
> considered truly 'handled' - E2 becomes TypedError(), the if condition is
> false (indicating there is no error) and the function goes on its way. If
> the handler supplies an error, then E2 becomes an error, the if condition
> is true, and we exit the function immediately, returning the error to the
> caller.
>
> Hope that clears things up a bit. It takes a bit of staring at the first
> time, but I find it helpful to think of it as being analogous to a
> 're-throw' in an exception handler: Returning success (i.e. TypedError())
> means continue the function, anything else means re-throw the error.
>
> Cheers,
> Lang.
>
> On Tue, Feb 2, 2016 at 10:56 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Feb 2, 2016, at 10:42 PM, Lang Hames <lhames at gmail.com> wrote:
>>
>> Hi Mehdi,
>>
>> > I’m not sure to understand this claim? You are supposed to be able to
>> extend and subclass the type of diagnostics? (I remember doing it for an
>> out-of-tree LLVM-based project).
>>
>> You can subclass diagnostics, but subclassing (on its own) only lets you
>> change the behaviour of the diagnostic/error itself. What we need, and what
>> this patch supplies, is a way to choose a particular handler based on the
>> type of the error.
>>
>>
>> If you subclass a diagnostic right now, isn’t the RTTI information
>> available to the handler, which can then achieve the same dispatching /
>> custom handling per type of diagnostic?
>> (I’m not advocating the diagnostic system, which I found less convenient
>> to use than what you are proposing)
>>
>> For that you need RTTI, so this patch introduces a new RTTI scheme that I
>> think is more suitable for errors types*, since unlike LLVM's existing RTTI
>> system it doesn't require you to enumerate the types up-front.
>>
>>
>> It looks like I’m missing a piece of something as it is not clear why is
>> this strictly needed. I may have to actually look closer at the code itself.
>>
>>
>> * If this RTTI system is considered generically useful it could be split
>> out into its own utility. It's slightly higher cost than LLVM's system: One
>> byte of BSS per type, and a walk from the dynamic type of the error to the
>> root of the type-hierarchy (with possible early exit) for each type check.
>>
>> > What does success or failure means for the handler?
>>
>> It gives the handler an opportunity to inspect and then "re-throw" an
>> error if necessary: A handler might not know whether it can recover based
>> on type alone, or it may not want to recover at all, but instead attach
>> some context to provide a richer diagnostic.
>>
>> As a concrete example, one of our motivating cases is processing object
>> files in archives. Down in the object file processing code, a load command
>> might be found to be malformed, but at that point there's no context to
>> tell us that the object that it's in is part of an archive, so the best
>> diagnostic we could produce is "In foo.o: malformed load command at index
>> N". A (straw-man) improved system might look like this:
>>
>> class ObjectError ... { // <- Root of all object-file errors
>>   std::string ArchiveName = "";
>>   std::string ObjectName = "";
>>   std::error_code EC;
>>
>>   void log(raw_ostream &OS) const override {
>>     if (!ArchiveName.empty())
>>       OS << "In archive '" << ArchiveName << "', ";
>>     OS << "In object file '" << ObjectName << "', " << EC.message();
>>   }
>> };
>>
>> TypedError processArchive(Archive &A) {
>>   TypedError Err;
>>   for (auto &Obj : A) {
>>     auto Err = processObject(Obj);
>>     if (auto E2 =
>>       catchTypedErrors(std::move(Err),
>>         handleTypedError<ObjectError>([&](std::unique_ptr<ObjectError>
>> OE) {
>>           OE->ArchiveName = A.getName();
>>           return TypedError(std::move(OE));
>>         }))
>>      return E2;
>>   }
>> }
>>
>> In this example, any error (whether an ObjectError or something else)
>> will be intercepted by the 'catchTypedErrors' function. If the error
>> *isn't* an ObjectError it'll be returned unmodified out of
>> catchTypedErrors, triggering an immediate return from processArchive. If it
>> *is* an ObjectError then the handler will be run, giving us an opportunity
>> to tag the error as having occurred within archive A.
>>
>> Again - this is a straw-man example: I think we can do better again for
>> diagnostics of this kind, but it showcases the value of being able to
>> modify errors while they're in-flight.
>>
>>
>> Sure, this case shows “success” of the handler, now what is a failure of
>> the handler and how is it handled?
>>
>>
>>
>> > Is your call to catchAllTypedErrors(…) actually like a switch on the
>> type of the error? What about a syntax that looks like a switch?
>> >
>> > switchErr(std::move(Err))
>> >    .case< MyCustomError>([] () { /* … */ })
>> >    .case< MyOtherCustomError>([] () { /* … */ })
>> >    .default([] () { /* … */ })
>>
>> It's similar to a switch, but it's actually more like a list of regular
>> C++ exception catch blocks (the name 'catchTypedError' is a nod to this).
>> The big difference is that you're not trying to find "the matching
>> handler" in the set of options. Instead, the list of handlers is evaluated
>> in order until one is found that fits, then that handler alone is executed.
>> So if you had the following:
>>
>> class MyBaseError : public TypedErrorInfo<MyBaseError> {};
>> class MyDerivedError : public TypedErrorInfo<MyDerivedError, MyBaseError>
>> {}; // <- MyDerivedError inherits from MyBaseError.
>>
>> and you wrote something like this:
>>
>> catchTypedErrors(std::move(Err),
>>   handleTypedError<MyBaseError>([&](std::unique_ptr<MyBaseError> B) {
>>
>>   }),
>>   handleTypedError<MyDerivedError>([&](std::unique_ptr<MyDerivedError> D)
>> {
>>
>>   })
>> );
>>
>>
>> The second handler will never run: All 'Derived' errors are 'Base'
>> errors, the first handler fits, so it's the one that will be run.
>>
>> We could go for something more like a switch, but then you have to define
>> the notion of "best fit" for a type, which might be difficult (especially
>> if I extend this to support multiple inheritance in error hierarchies. ;).
>> I think it's easier to reason about "first handler that fits”.
>>
>>
>> Oh I was seeing it as a “first match as well”, just bike shedding the
>> syntax as the function calls with a long flat list of lambdas as argument
>> didn’t seem like the best we can do at the first sight.
>>
>> —
>> Mehdi
>>
>>
>>
>> Cheers,
>> Lang.
>>
>>
>> On Tue, Feb 2, 2016 at 6:33 PM, Mehdi Amini <mehdi.amini at apple.com>
>> wrote:
>>
>>> Hi Lang,
>>>
>>> I’m glad someone tackle this long lived issue :)
>>> I’ve started to think about it recently but didn’t as far as you did!
>>>
>>> On Feb 2, 2016, at 5:29 PM, Lang Hames via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hi All,
>>>
>>> I've been thinking lately about how to improve LLVM's error model and
>>> error reporting. A lack of good error reporting in Orc and MCJIT has forced
>>> me to spend a lot of time investigating hard-to-debug errors that could
>>> easily have been identified if we provided richer error information to the
>>> client, rather than just aborting. Kevin Enderby has made similar
>>> observations about the state of libObject and the difficulty of producing
>>> good error messages for damaged object files. I expect to encounter more
>>> issues like this as I continue work on the MachO side of LLD. I see
>>> tackling the error modeling problem as a first step towards improving error
>>> handling in general: if we make it easy to model errors, it may pave the
>>> way for better error handling in many parts of our libraries.
>>>
>>> At present in LLVM we model errors with std::error_code (and its helper,
>>> ErrorOr) and use diagnostic streams for error reporting. Neither of these
>>> seem entirely up to the job of providing a solid error-handling mechanism
>>> for library code. Diagnostic streams are great if all you want to do is
>>> report failure to the user and then terminate, but they can't be used to
>>> distinguish between different kinds of errors
>>>
>>>
>>> I’m not sure to understand this claim? You are supposed to be able to
>>> extend and subclass the type of diagnostics? (I remember doing it for an
>>> out-of-tree LLVM-based project).
>>>
>>>
>>> , and so are unsuited to many use-cases (especially error recovery). On
>>> the other hand, std::error_code allows error kinds to be distinguished, but
>>> suffers a number of drawbacks:
>>>
>>> 1. It carries no context: It tells you what went wrong, but not where or
>>> why, making it difficult to produce good diagnostics.
>>> 2. It's extremely easy to ignore or forget: instances can be silently
>>> dropped.
>>> 3. It's not especially debugger friendly: Most people call the
>>> error_code constructors directly for both success and failure values.
>>> Breakpoints have to be set carefully to avoid stopping when success values
>>> are constructed.
>>>
>>> In fairness to std::error_code, it has some nice properties too:
>>>
>>> 1. It's extremely lightweight.
>>> 2. It's explicit in the API (unlike exceptions).
>>> 3. It doesn't require C++ RTTI (a requirement for use in LLVM).
>>>
>>> To address these shortcomings I have prototyped a new error-handling
>>> scheme partially inspired by C++ exceptions. The aim was to preserve the
>>> performance and API visibility of std::error_code, while allowing users to
>>> define custom error classes and inheritance relationships between them. My
>>> hope is that library code could use this scheme to model errors in a
>>> meaningful way, allowing clients to inspect the error information and
>>> recover where possible, or provide a rich diagnostic when aborting.
>>>
>>> The scheme has three major "moving parts":
>>>
>>> 1. A new 'TypedError' class that can be used as a replacement for
>>> std::error_code. E.g.
>>>
>>> std::error_code foo();
>>>
>>> becomes
>>>
>>> TypedError foo();
>>>
>>> The TypedError class serves as a lightweight wrapper for the real error
>>> information (see (2)). It also contains a 'Checked' flag, initially set to
>>> false, that tracks whether the error has been handled or not. If a
>>> TypedError is ever destructed without being checked (or passed on to
>>> someone else) it will call std::terminate(). TypedError cannot be silently
>>> dropped.
>>>
>>>
>>> I really like the fact that not checking the error triggers an error
>>> (this is the "hard to misuse” part of API design IMO).
>>> You don’t mention it, but I’d rather see this “checked” flag compiled
>>> out with NDEBUG.
>>>
>>>
>>> 2. A utility class, TypedErrorInfo, for building error class hierarchies
>>> rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.
>>>
>>> // Define a new error type implicitly inheriting from TypedErrorInfoBase.
>>> class MyCustomError : public TypedErrorInfo<MyCustomError> {
>>> public:
>>>   // Custom error info.
>>> };
>>>
>>> // Define a subclass of MyCustomError.
>>> class MyCustomSubError : public TypedErrorInfo<MyCustomSubError,
>>> MyCustomError> {
>>> public:
>>>   // Extends MyCustomError, adds new members.
>>> };
>>>
>>> 3.  A set of utility functions that use the custom RTTI system to
>>> inspect and handle typed errors. For example 'catchAllTypedErrors' and
>>> 'handleTypedError' cooperate to handle error instances in a type-safe way:
>>>
>>> TypedError foo() {
>>>   if (SomeFailureCondition)
>>>     return make_typed_error<MyCustomError>();
>>> }
>>>
>>> TypedError Err = foo();
>>>
>>> catchAllTypedErrors(std::move(Err),
>>>   handleTypedError<MyCustomError>(
>>>     [](std::unique_ptr<MyCustomError> E) {
>>>       // Handle the error.
>>>       return TypedError(); // <- Indicate success from handler.
>>>
>>>
>>> What does success or failure means for the handler?
>>>
>>>
>>>     }
>>>   )
>>> );
>>>
>>>
>>> If your initial reaction is "Too much boilerplate!" I understand, but
>>> take comfort: (1) In the overwhelmingly common case of simply returning
>>> errors, the usage is identical to std::error_code:
>>>
>>> if (TypedError Err = foo())
>>>   return Err;
>>>
>>> and (2) the boilerplate for catching errors is usually easily contained
>>> in a handful of utility functions, and tends not to crowd the rest of your
>>> source code. My initial experiments with this scheme involved updating many
>>> source lines, but did not add much code at all beyond the new error classes
>>> that were introduced.
>>>
>>>
>>> I believe that this scheme addresses many of the shortcomings of
>>> std::error_code while maintaining the strengths:
>>>
>>> 1. Context - Custom error classes enable the user to attach as much
>>> contextual information as desired.
>>>
>>> 2. Difficult to drop - The 'checked' flag in TypedError ensures that it
>>> can't be dropped, it must be explicitly "handled", even if that only
>>> involves catching the error and doing nothing.
>>>
>>> 3. Debugger friendly - You can set a breakpoint on any custom error
>>> class's constructor to catch that error being created. Since the error
>>> class hierarchy is rooted you can break on
>>> TypedErrorInfoBase::TypedErrorInfoBase to catch any error being raised.
>>>
>>> 4. Lightweight - Because TypedError instances are just a pointer and a
>>> checked-bit, move-constructing it is very cheap. We may also want to
>>> consider ignoring the 'checked' bit in release mode, at which point
>>> TypedError should be as cheap as std::error_code.
>>>
>>>
>>> Oh here you mention compiling out the “checked” flag :)
>>>
>>>
>>> 5. Explicit - TypedError is represented explicitly in the APIs, the same
>>> as std::error_code.
>>>
>>> 6. Does not require C++ RTTI - The custom RTTI system does not rely on
>>> any standard C++ RTTI features.
>>>
>>> This scheme also has one attribute that I haven't seen in previous error
>>> handling systems (though my experience in this area is limited): Errors are
>>> not copyable, due to ownership semantics of TypedError. I think this
>>> actually neatly captures the idea that there is a chain of responsibility
>>> for dealing with any given error. Responsibility may be transferred (e.g.
>>> by returning it to a caller), but it cannot be duplicated as it doesn't
>>> generally make sense for multiple people to report or attempt to recover
>>> from the same error.
>>>
>>> I've tested this prototype out by threading it through the
>>> object-creation APIs of libObject and using custom error classes to report
>>> errors in MachO headers. My initial experience is that this has enabled
>>> much richer error messages than are possible with std::error_code.
>>>
>>> To enable interaction with APIs that still use std::error_code I have
>>> added a custom ECError class that wraps a std::error_code, and can be
>>> converted back to a std::error_code using the typedErrorToErrorCode
>>> function. For now, all custom error code classes should (and do, in the
>>> prototype) derive from this utility class. In my experiments, this has made
>>> it easy to thread TypedError selectively through parts of the API.
>>> Eventually my hope is that TypedError could replace std::error_code for
>>> user-facing APIs, at which point custom errors would no longer need to
>>> derive from ECError, and ECError could be relegated to a utility for
>>> interacting with other codebases that still use std::error_code.
>>>
>>> So - I look forward to hearing your thoughts. :)
>>>
>>>
>>> Is your call to catchAllTypedErrors(…) actually like a switch on the
>>> type of the error? What about a syntax that looks like a switch?
>>>
>>> switchErr(std::move(Err))
>>>     .case< MyCustomError>([] () { /* … */ })
>>>     .case< MyOtherCustomError>([] () { /* … */ })
>>>     .default([] () { /* … */ })
>>>
>>>
>>> —
>>> Mehdi
>>>
>>>
>>> Cheers,
>>> Lang.
>>>
>>> Attached files:
>>>
>>> typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
>>> anchor() method to lib/Support/ErrorHandling.cpp).
>>>
>>> error_demo.tgz - Stand-alone program demo'ing basic use of the
>>> TypedError API.
>>>
>>> libobject_typed_error_demo.patch - Threads TypedError through the
>>> binary-file creation methods (createBinary, createObjectFile, etc).
>>> Proof-of-concept for how TypedError can be integrated into an existing
>>> system.
>>>
>>> <typed_error.patch><error_demo.tgz>
>>> <thread_typederror_through_object_creation.patch>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160202/9f3aa188/attachment-0001.html>