[llvm-dev] [RFC] Error handling in LLVM libraries.

Wed Feb 3 11:32:26 PST 2016

Hi Mehdi,

> Side question on the related work because I’m curious: did you look for
similar generic scheme for error handling offered by other C++ libraries?
Maybe the common scheme is to use C++ exceptions but since many folks
disable them (hello Google ;-) ) there has been probably many other
attempts to address this.

I did look around, but not very hard. Among the people I asked, everyone
was either using exceptions, std::error_code, or had turned on the
diagnostic that James Knight suggested. I did some preliminary
web-searching, but that didn't turn up anything interesting. If anybody has
seen other error handling schemes of note I'd love to hear about them.

> On this topic of not paying the price on the non-error code path, it
would be nice to not pay for constructing all the lambda captures when
there is no error.

If your catchTypedErrors call is under an if-test then the lambdas are in a
scope that won't be entered on the non-error path:

if (auto Err = foo())
  if (auto E2 = catchTypedErrors(std::move(Errs), <lambdas here>))
    return;

I think (though I haven't tested this) that most lambdas should inline away
to next to nothing, so I don't expect the overhead to be noticeable in
either case.

- Lang.

On Wed, Feb 3, 2016 at 10:55 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Feb 3, 2016, at 10:48 AM, Lang Hames <lhames at gmail.com> wrote:
>
> Hi Mehdi,
>
> > For a generic error class it is not an option indeed, but I was looking
> at it in the context of LLVM internal use, so just like our RTTI is not an
> option for “generic RTTI” but fits our need, we could (not should) do the
> same with ErrorHandling.
>
> Definitely. If this was LLVM only there'd be a strong case for using the
> existing RTTI system. The reason for the new RTTI system is that it would
> let us re-use this error class in other LLVM projects (LLD, LLDB, etc)
> without having to enumerate their errors in LLVM.
>
>
> I think it is great :)
>
> Side question on the related work because I’m curious: did you look for
> similar generic scheme for error handling offered by other C++ libraries?
> Maybe the common scheme is to use C++ exceptions but since many folks
> disable them (hello Google ;-) ) there has been probably many other
> attempts to address this.
>
>
> > Nice, and since this is on the error path we don’t care if it is not
> “as fast as” the custom LLVM RTTI.
>
> Yep.
>
>
> On this topic of not paying the price on the non-error code path, it would
> be nice to not pay for constructing all the lambda captures when there is
> no error. I can imagine many way of expressing a “try/catch” like syntax to
> achieve this using macros, but not really without that.
> Have you thought about this?
>
> Thanks,
>
> Mehdi
>
>
>
> > OK got it now, the “empty” TypedError()is the key :)
> > (and I was using success/failure terminology reversed compare to you)
>
> Yeah - this is confusing. It's no worse than 'std::error_code()', but it's
> no better either. Maybe introducing a utility wrapper like
> 'TypedErrorSuccess()' or even 'ESuccess()' would make things more readable.
>
> - Lang.
>
>
> On Wed, Feb 3, 2016 at 8:14 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>>
>> On Feb 2, 2016, at 11:33 PM, Lang Hames <lhames at gmail.com> wrote:
>>
>> Hi Mehdi,
>>
>> > If you subclass a diagnostic right now, isn’t the RTTI information
>> available to the handler, which can then achieve the same dispatching /
>> custom handling per type of diagnostic?
>> > (I’m not advocating the diagnostic system, which I found less
>> convenient to use than what you are proposing)
>>
>> I have to confess I haven't looked at the diagnostic classes closely.
>> I'll take a look and get back to you on this one. :)
>>
>> > > For that you need RTTI, so this patch introduces a new RTTI scheme
>> that I think is more suitable for errors types*, since unlike LLVM's
>> existing RTTI system it doesn't require you to enumerate the types up-front.
>> > It looks like I’m missing a piece of something as it is not clear why
>> is this strictly needed. I may have to actually look closer at the code
>> itself.
>>
>> LLVM's RTTI requires you to define an enum value for each type in your
>> hierarchy (see http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html), which
>> means you need to know about all your potential subclasses up-front. That's
>> not an option for a generic error class that might be subclassed in
>> arbitrary ways.
>>
>>
>> For a generic error class it is not an option indeed, but I was looking
>> at it in the context of LLVM internal use, so just like our RTTI is not an
>> option for “generic RTTI” but fits our need, we could (not should) do the
>> same with ErrorHandling.
>>
>>
>>
>> The new RTTI system uses something closer to LLVM's Pass IDs:
>> TypedErrorInfo (a utility which all errors must inherit from) introduces a
>> new static char for each error type and uses its address as an ID. When you
>> ask an error value "are you a subclass of type T" (via the isSameOrSubClass
>> method) the call goes to the parent TypedErrorInfo object, which compares
>> T's ID with its own. If it matches it returns true, if it doesn't match
>> then the call gets forwarded to the parent class, then its parent class,
>> and so on. If you hit the root of the type-tree (i.e. TypedErrorInfoBase)
>> without matching the ID, then you weren't a subclass of T.
>>
>>
>> Nice, and since this is on the error path we don’t care if it is not “as
>> fast as” the custom LLVM RTTI.
>>
>>
>> > Sure, this case shows “success” of the handler, now what is a failure
>> of the handler and how is it handled?
>>
>> Sorry - that was a bad example to choose: That was actually showcasing
>> failure, not success. Success looks like this:
>>
>> TypedError bar() {
>>   TypedError Err = foo;
>>   if (auto E2 =
>>     catchTypedErrors(std::move(Err),
>>       handleTypedError<MyError>([&](std::unique_ptr<MyError> M) {
>>         // Deal with 'M' somehow.
>>         return TypedError();
>>       }))
>>     return E2;
>>
>>   // Proceed with 'bar' if 'Err' is handled.
>> }
>>
>> A key observation is that catchTypedErrors itself returns an error. It
>> has to, because you may not have provided it with an exhaustive list of
>> handlers, and it needs a way to return unhanded errors. So: If no handler
>> gets invoked, catchTypedErrors will just return 'Err' again. If 'Err' was
>> an error, then E2 will also be an error, and you'll immediately return from
>> 'bar', passing responsibility for the error up the stack. So far so good.
>> Now consider what we should do if, instead, we *did* invoke a handler. One
>> option would be to say that if a handler gets invoked then catchTypedErrors
>> always returns 'TypedError()', indicating success, but that's an assertion
>> that any error that's caught is definitely resolved. Sadly, we can't rely
>> on that, so instead we allow the handler to supply the return value for
>> catchTypedErrors. If the handler supplies 'TypedError()' then the error is
>> considered truly 'handled' - E2 becomes TypedError(), the if condition is
>> false (indicating there is no error) and the function goes on its way. If
>> the handler supplies an error, then E2 becomes an error, the if condition
>> is true, and we exit the function immediately, returning the error to the
>> caller.
>>
>> Hope that clears things up a bit. It takes a bit of staring at the first
>> time, but I find it helpful to think of it as being analogous to a
>> 're-throw' in an exception handler: Returning success (i.e. TypedError())
>> means continue the function, anything else means re-throw the error.
>>
>>
>> OK got it now, the “empty” TypedError()is the key :)
>> (and I was using success/failure terminology reversed compare to you)
>>
>> —
>> Mehdi
>>
>>
>>
>> Cheers,
>> Lang.
>>
>> On Tue, Feb 2, 2016 at 10:56 PM, Mehdi Amini <mehdi.amini at apple.com>
>> wrote:
>>
>>>
>>> On Feb 2, 2016, at 10:42 PM, Lang Hames <lhames at gmail.com> wrote:
>>>
>>> Hi Mehdi,
>>>
>>> > I’m not sure to understand this claim? You are supposed to be able to
>>> extend and subclass the type of diagnostics? (I remember doing it for an
>>> out-of-tree LLVM-based project).
>>>
>>> You can subclass diagnostics, but subclassing (on its own) only lets you
>>> change the behaviour of the diagnostic/error itself. What we need, and what
>>> this patch supplies, is a way to choose a particular handler based on the
>>> type of the error.
>>>
>>>
>>> If you subclass a diagnostic right now, isn’t the RTTI information
>>> available to the handler, which can then achieve the same dispatching /
>>> custom handling per type of diagnostic?
>>> (I’m not advocating the diagnostic system, which I found less convenient
>>> to use than what you are proposing)
>>>
>>> For that you need RTTI, so this patch introduces a new RTTI scheme that
>>> I think is more suitable for errors types*, since unlike LLVM's existing
>>> RTTI system it doesn't require you to enumerate the types up-front.
>>>
>>>
>>> It looks like I’m missing a piece of something as it is not clear why is
>>> this strictly needed. I may have to actually look closer at the code itself.
>>>
>>>
>>> * If this RTTI system is considered generically useful it could be split
>>> out into its own utility. It's slightly higher cost than LLVM's system: One
>>> byte of BSS per type, and a walk from the dynamic type of the error to the
>>> root of the type-hierarchy (with possible early exit) for each type check.
>>>
>>> > What does success or failure means for the handler?
>>>
>>> It gives the handler an opportunity to inspect and then "re-throw" an
>>> error if necessary: A handler might not know whether it can recover based
>>> on type alone, or it may not want to recover at all, but instead attach
>>> some context to provide a richer diagnostic.
>>>
>>> As a concrete example, one of our motivating cases is processing object
>>> files in archives. Down in the object file processing code, a load command
>>> might be found to be malformed, but at that point there's no context to
>>> tell us that the object that it's in is part of an archive, so the best
>>> diagnostic we could produce is "In foo.o: malformed load command at index
>>> N". A (straw-man) improved system might look like this:
>>>
>>> class ObjectError ... { // <- Root of all object-file errors
>>>   std::string ArchiveName = "";
>>>   std::string ObjectName = "";
>>>   std::error_code EC;
>>>
>>>   void log(raw_ostream &OS) const override {
>>>     if (!ArchiveName.empty())
>>>       OS << "In archive '" << ArchiveName << "', ";
>>>     OS << "In object file '" << ObjectName << "', " << EC.message();
>>>   }
>>> };
>>>
>>> TypedError processArchive(Archive &A) {
>>>   TypedError Err;
>>>   for (auto &Obj : A) {
>>>     auto Err = processObject(Obj);
>>>     if (auto E2 =
>>>       catchTypedErrors(std::move(Err),
>>>         handleTypedError<ObjectError>([&](std::unique_ptr<ObjectError>
>>> OE) {
>>>           OE->ArchiveName = A.getName();
>>>           return TypedError(std::move(OE));
>>>         }))
>>>      return E2;
>>>   }
>>> }
>>>
>>> In this example, any error (whether an ObjectError or something else)
>>> will be intercepted by the 'catchTypedErrors' function. If the error
>>> *isn't* an ObjectError it'll be returned unmodified out of
>>> catchTypedErrors, triggering an immediate return from processArchive. If it
>>> *is* an ObjectError then the handler will be run, giving us an opportunity
>>> to tag the error as having occurred within archive A.
>>>
>>> Again - this is a straw-man example: I think we can do better again for
>>> diagnostics of this kind, but it showcases the value of being able to
>>> modify errors while they're in-flight.
>>>
>>>
>>> Sure, this case shows “success” of the handler, now what is a failure of
>>> the handler and how is it handled?
>>>
>>>
>>>
>>> > Is your call to catchAllTypedErrors(…) actually like a switch on the
>>> type of the error? What about a syntax that looks like a switch?
>>> >
>>> > switchErr(std::move(Err))
>>> >    .case< MyCustomError>([] () { /* … */ })
>>> >    .case< MyOtherCustomError>([] () { /* … */ })
>>> >    .default([] () { /* … */ })
>>>
>>> It's similar to a switch, but it's actually more like a list of regular
>>> C++ exception catch blocks (the name 'catchTypedError' is a nod to this).
>>> The big difference is that you're not trying to find "the matching
>>> handler" in the set of options. Instead, the list of handlers is evaluated
>>> in order until one is found that fits, then that handler alone is executed.
>>> So if you had the following:
>>>
>>> class MyBaseError : public TypedErrorInfo<MyBaseError> {};
>>> class MyDerivedError : public TypedErrorInfo<MyDerivedError,
>>> MyBaseError> {}; // <- MyDerivedError inherits from MyBaseError.
>>>
>>> and you wrote something like this:
>>>
>>> catchTypedErrors(std::move(Err),
>>>   handleTypedError<MyBaseError>([&](std::unique_ptr<MyBaseError> B) {
>>>
>>>   }),
>>>   handleTypedError<MyDerivedError>([&](std::unique_ptr<MyDerivedError>
>>> D) {
>>>
>>>   })
>>> );
>>>
>>>
>>> The second handler will never run: All 'Derived' errors are 'Base'
>>> errors, the first handler fits, so it's the one that will be run.
>>>
>>> We could go for something more like a switch, but then you have to
>>> define the notion of "best fit" for a type, which might be difficult
>>> (especially if I extend this to support multiple inheritance in error
>>> hierarchies. ;). I think it's easier to reason about "first handler that
>>> fits”.
>>>
>>>
>>> Oh I was seeing it as a “first match as well”, just bike shedding the
>>> syntax as the function calls with a long flat list of lambdas as argument
>>> didn’t seem like the best we can do at the first sight.
>>>
>>> —
>>> Mehdi
>>>
>>>
>>>
>>> Cheers,
>>> Lang.
>>>
>>>
>>> On Tue, Feb 2, 2016 at 6:33 PM, Mehdi Amini <mehdi.amini at apple.com>
>>> wrote:
>>>
>>>> Hi Lang,
>>>>
>>>> I’m glad someone tackle this long lived issue :)
>>>> I’ve started to think about it recently but didn’t as far as you did!
>>>>
>>>> On Feb 2, 2016, at 5:29 PM, Lang Hames via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I've been thinking lately about how to improve LLVM's error model and
>>>> error reporting. A lack of good error reporting in Orc and MCJIT has forced
>>>> me to spend a lot of time investigating hard-to-debug errors that could
>>>> easily have been identified if we provided richer error information to the
>>>> client, rather than just aborting. Kevin Enderby has made similar
>>>> observations about the state of libObject and the difficulty of producing
>>>> good error messages for damaged object files. I expect to encounter more
>>>> issues like this as I continue work on the MachO side of LLD. I see
>>>> tackling the error modeling problem as a first step towards improving error
>>>> handling in general: if we make it easy to model errors, it may pave the
>>>> way for better error handling in many parts of our libraries.
>>>>
>>>> At present in LLVM we model errors with std::error_code (and its
>>>> helper, ErrorOr) and use diagnostic streams for error reporting. Neither of
>>>> these seem entirely up to the job of providing a solid error-handling
>>>> mechanism for library code. Diagnostic streams are great if all you want to
>>>> do is report failure to the user and then terminate, but they can't be used
>>>> to distinguish between different kinds of errors
>>>>
>>>>
>>>> I’m not sure to understand this claim? You are supposed to be able to
>>>> extend and subclass the type of diagnostics? (I remember doing it for an
>>>> out-of-tree LLVM-based project).
>>>>
>>>>
>>>> , and so are unsuited to many use-cases (especially error recovery). On
>>>> the other hand, std::error_code allows error kinds to be distinguished, but
>>>> suffers a number of drawbacks:
>>>>
>>>> 1. It carries no context: It tells you what went wrong, but not where
>>>> or why, making it difficult to produce good diagnostics.
>>>> 2. It's extremely easy to ignore or forget: instances can be silently
>>>> dropped.
>>>> 3. It's not especially debugger friendly: Most people call the
>>>> error_code constructors directly for both success and failure values.
>>>> Breakpoints have to be set carefully to avoid stopping when success values
>>>> are constructed.
>>>>
>>>> In fairness to std::error_code, it has some nice properties too:
>>>>
>>>> 1. It's extremely lightweight.
>>>> 2. It's explicit in the API (unlike exceptions).
>>>> 3. It doesn't require C++ RTTI (a requirement for use in LLVM).
>>>>
>>>> To address these shortcomings I have prototyped a new error-handling
>>>> scheme partially inspired by C++ exceptions. The aim was to preserve the
>>>> performance and API visibility of std::error_code, while allowing users to
>>>> define custom error classes and inheritance relationships between them. My
>>>> hope is that library code could use this scheme to model errors in a
>>>> meaningful way, allowing clients to inspect the error information and
>>>> recover where possible, or provide a rich diagnostic when aborting.
>>>>
>>>> The scheme has three major "moving parts":
>>>>
>>>> 1. A new 'TypedError' class that can be used as a replacement for
>>>> std::error_code. E.g.
>>>>
>>>> std::error_code foo();
>>>>
>>>> becomes
>>>>
>>>> TypedError foo();
>>>>
>>>> The TypedError class serves as a lightweight wrapper for the real error
>>>> information (see (2)). It also contains a 'Checked' flag, initially set to
>>>> false, that tracks whether the error has been handled or not. If a
>>>> TypedError is ever destructed without being checked (or passed on to
>>>> someone else) it will call std::terminate(). TypedError cannot be silently
>>>> dropped.
>>>>
>>>>
>>>> I really like the fact that not checking the error triggers an error
>>>> (this is the "hard to misuse” part of API design IMO).
>>>> You don’t mention it, but I’d rather see this “checked” flag compiled
>>>> out with NDEBUG.
>>>>
>>>>
>>>> 2. A utility class, TypedErrorInfo, for building error class
>>>> hierarchies rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.
>>>>
>>>> // Define a new error type implicitly inheriting from
>>>> TypedErrorInfoBase.
>>>> class MyCustomError : public TypedErrorInfo<MyCustomError> {
>>>> public:
>>>>   // Custom error info.
>>>> };
>>>>
>>>> // Define a subclass of MyCustomError.
>>>> class MyCustomSubError : public TypedErrorInfo<MyCustomSubError,
>>>> MyCustomError> {
>>>> public:
>>>>   // Extends MyCustomError, adds new members.
>>>> };
>>>>
>>>> 3.  A set of utility functions that use the custom RTTI system to
>>>> inspect and handle typed errors. For example 'catchAllTypedErrors' and
>>>> 'handleTypedError' cooperate to handle error instances in a type-safe way:
>>>>
>>>> TypedError foo() {
>>>>   if (SomeFailureCondition)
>>>>     return make_typed_error<MyCustomError>();
>>>> }
>>>>
>>>> TypedError Err = foo();
>>>>
>>>> catchAllTypedErrors(std::move(Err),
>>>>   handleTypedError<MyCustomError>(
>>>>     [](std::unique_ptr<MyCustomError> E) {
>>>>       // Handle the error.
>>>>       return TypedError(); // <- Indicate success from handler.
>>>>
>>>>
>>>> What does success or failure means for the handler?
>>>>
>>>>
>>>>     }
>>>>   )
>>>> );
>>>>
>>>>
>>>> If your initial reaction is "Too much boilerplate!" I understand, but
>>>> take comfort: (1) In the overwhelmingly common case of simply returning
>>>> errors, the usage is identical to std::error_code:
>>>>
>>>> if (TypedError Err = foo())
>>>>   return Err;
>>>>
>>>> and (2) the boilerplate for catching errors is usually easily contained
>>>> in a handful of utility functions, and tends not to crowd the rest of your
>>>> source code. My initial experiments with this scheme involved updating many
>>>> source lines, but did not add much code at all beyond the new error classes
>>>> that were introduced.
>>>>
>>>>
>>>> I believe that this scheme addresses many of the shortcomings of
>>>> std::error_code while maintaining the strengths:
>>>>
>>>> 1. Context - Custom error classes enable the user to attach as much
>>>> contextual information as desired.
>>>>
>>>> 2. Difficult to drop - The 'checked' flag in TypedError ensures that it
>>>> can't be dropped, it must be explicitly "handled", even if that only
>>>> involves catching the error and doing nothing.
>>>>
>>>> 3. Debugger friendly - You can set a breakpoint on any custom error
>>>> class's constructor to catch that error being created. Since the error
>>>> class hierarchy is rooted you can break on
>>>> TypedErrorInfoBase::TypedErrorInfoBase to catch any error being raised.
>>>>
>>>> 4. Lightweight - Because TypedError instances are just a pointer and a
>>>> checked-bit, move-constructing it is very cheap. We may also want to
>>>> consider ignoring the 'checked' bit in release mode, at which point
>>>> TypedError should be as cheap as std::error_code.
>>>>
>>>>
>>>> Oh here you mention compiling out the “checked” flag :)
>>>>
>>>>
>>>> 5. Explicit - TypedError is represented explicitly in the APIs, the
>>>> same as std::error_code.
>>>>
>>>> 6. Does not require C++ RTTI - The custom RTTI system does not rely on
>>>> any standard C++ RTTI features.
>>>>
>>>> This scheme also has one attribute that I haven't seen in previous
>>>> error handling systems (though my experience in this area is limited):
>>>> Errors are not copyable, due to ownership semantics of TypedError. I think
>>>> this actually neatly captures the idea that there is a chain of
>>>> responsibility for dealing with any given error. Responsibility may be
>>>> transferred (e.g. by returning it to a caller), but it cannot be duplicated
>>>> as it doesn't generally make sense for multiple people to report or attempt
>>>> to recover from the same error.
>>>>
>>>> I've tested this prototype out by threading it through the
>>>> object-creation APIs of libObject and using custom error classes to report
>>>> errors in MachO headers. My initial experience is that this has enabled
>>>> much richer error messages than are possible with std::error_code.
>>>>
>>>> To enable interaction with APIs that still use std::error_code I have
>>>> added a custom ECError class that wraps a std::error_code, and can be
>>>> converted back to a std::error_code using the typedErrorToErrorCode
>>>> function. For now, all custom error code classes should (and do, in the
>>>> prototype) derive from this utility class. In my experiments, this has made
>>>> it easy to thread TypedError selectively through parts of the API.
>>>> Eventually my hope is that TypedError could replace std::error_code for
>>>> user-facing APIs, at which point custom errors would no longer need to
>>>> derive from ECError, and ECError could be relegated to a utility for
>>>> interacting with other codebases that still use std::error_code.
>>>>
>>>> So - I look forward to hearing your thoughts. :)
>>>>
>>>>
>>>> Is your call to catchAllTypedErrors(…) actually like a switch on the
>>>> type of the error? What about a syntax that looks like a switch?
>>>>
>>>> switchErr(std::move(Err))
>>>>     .case< MyCustomError>([] () { /* … */ })
>>>>     .case< MyOtherCustomError>([] () { /* … */ })
>>>>     .default([] () { /* … */ })
>>>>
>>>>
>>>> —
>>>> Mehdi
>>>>
>>>>
>>>> Cheers,
>>>> Lang.
>>>>
>>>> Attached files:
>>>>
>>>> typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
>>>> anchor() method to lib/Support/ErrorHandling.cpp).
>>>>
>>>> error_demo.tgz - Stand-alone program demo'ing basic use of the
>>>> TypedError API.
>>>>
>>>> libobject_typed_error_demo.patch - Threads TypedError through the
>>>> binary-file creation methods (createBinary, createObjectFile, etc).
>>>> Proof-of-concept for how TypedError can be integrated into an existing
>>>> system.
>>>>
>>>> <typed_error.patch><error_demo.tgz>
>>>> <thread_typederror_through_object_creation.patch>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160203/9d74c0da/attachment-0001.html>