[llvm-dev] [RFC] Error handling in LLVM libraries.

Wed Feb 3 10:18:36 PST 2016

Hi Craig,

> TypedError Err = foo();
> // no checking in between
> Err = foo();

This will cause an abort - the assignment operator for TypedError checks
that you're not overwriting an unhandled error.
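
As a rough illustration of the kind of check involved, here is a minimal
sketch. It is not the prototype's code: ToyError and UncheckedDrops are
hypothetical names, testing the value is assumed to be enough to mark it as
checked, and violations are counted instead of aborting so that the behaviour
can be observed.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for TypedError; not the prototype's implementation.
// Assumption: violations are counted rather than aborting the program.
static int UncheckedDrops = 0;

class ToyError {
public:
  ToyError() : Checked(false) {} // success value, still unchecked
  explicit ToyError(std::string Msg)
      : Payload(new std::string(std::move(Msg))), Checked(false) {}

  ToyError(ToyError &&Other)
      : Payload(std::move(Other.Payload)), Checked(false) {
    Other.Checked = true; // responsibility moves to the new owner
  }

  ToyError &operator=(ToyError &&Other) {
    noteIfUnchecked(); // overwriting an unexamined value is a violation
    Payload = std::move(Other.Payload);
    Checked = false;
    Other.Checked = true;
    return *this;
  }

  ~ToyError() { noteIfUnchecked(); }

  // Testing the value marks it as checked (a simplification).
  explicit operator bool() {
    Checked = true;
    return Payload != nullptr;
  }

private:
  void noteIfUnchecked() {
    if (!Checked)
      ++UncheckedDrops; // the real class would abort here
  }

  std::unique_ptr<std::string> Payload;
  bool Checked;
};

ToyError foo() { return ToyError("foo failed"); }

// Number of violations when the first result is overwritten unexamined.
int overwriteWithoutChecking() {
  int Before = UncheckedDrops;
  {
    ToyError Err = foo();
    Err = foo(); // first value was never examined: one violation
    if (Err) { /* would handle the failure here */ }
  }
  return UncheckedDrops - Before;
}

// Number of violations when each result is examined before reuse.
int overwriteAfterChecking() {
  int Before = UncheckedDrops;
  {
    ToyError Err = foo();
    if (Err) { /* would handle the failure here */ }
    Err = foo();
    if (Err) { /* would handle the failure here */ }
  }
  return UncheckedDrops - Before;
}
```

In this sketch overwriteWithoutChecking() records one violation and
overwriteAfterChecking() records none.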

> TypedError Err = foo();
> functionWithHorribleSideEffects();
> if (Err) return;

This is potentially reasonable code - it's impossible to distinguish in
general from:

TypedError Err = foo();
functionWithPerfectlyReasonableSideEffects();
if (Err) return;

That said, to avoid problems related to this style we can offer style
guidelines. Idiomatic usage of the system looks like:

if (auto Err = foo())
  return Err;
functionWithHorribleSideEffects();

This is how people tend to write error checks in most of the LLVM code I've
seen to date.
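
A self-contained sketch of that idiom, under stated assumptions: a
std::unique_ptr<std::string> stands in for TypedError (null means success),
and the Fail parameter and SideEffects counter are hypothetical, added only to
make the ordering visible.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in: null means success, non-null carries a message.
// Like TypedError it is move-only and can be tested in a condition.
using ToyError = std::unique_ptr<std::string>;

static int SideEffects = 0; // counts calls to the side-effecting function

ToyError foo(bool Fail) {
  if (Fail)
    return ToyError(new std::string("foo failed"));
  return ToyError(); // success
}

void functionWithHorribleSideEffects() { ++SideEffects; }

// Check and return before doing anything else, so the side effects only
// happen once foo() is known to have succeeded.
ToyError run(bool Fail) {
  if (auto Err = foo(Fail))
    return Err; // Err is moved out to the caller
  functionWithHorribleSideEffects();
  return ToyError();
}
```

Because the check and the early return sit on adjacent lines, there is no
room for a side effect to slip in between the call and the test.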

> Do you anticipate giving these kinds of errors to out of tree projects?
> If so, are there any kind of binary compatibility guarantee?

Out of tree projects can use the TypedError.h header and derive their own
error classes. This is all pure C++, so I don't think there are binary
compatibility issues.

> What about errors that should come out of constructors?  Or <shudder>
> destructors?

TypedError can't be "thrown" in the same way that C++ exceptions can. It's
an ordinary C++ value. You can't return an error from a constructor, but
you can pass a reference to an error in and set that. In general the style
guideline for a "may-fail" constructor would be to write something like
this:

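class Foo {
public:
  // Named constructor: the only way for clients to obtain a Foo.
  static TypedErrorOr<Foo> create(int X, int Y) {
    TypedError Err;
    Foo F(X, Y, Err);
    if (Err)
      return std::move(Err); // Construction failed: F never escapes.
    return std::move(F);
  }

private:
  // May-fail constructor: reports failure through Err.
  Foo(int X, int Y, TypedError &Err) {
    if (X == Y) {
      Err = make_typed_error<BadFoo>();
      return;
    }
  }
};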

Then you have:

TypedErrorOr<Foo> F = Foo::create(X, Y);
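
For anyone who wants to try the pattern without the prototype header, here is
a compilable stand-in. Everything in it is an assumption made for
illustration: ToyError (a std::unique_ptr<std::string>) replaces TypedError,
the hand-rolled FooOrError struct replaces TypedErrorOr<Foo>, and sum() is a
made-up member.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for TypedError: null means success.
using ToyError = std::unique_ptr<std::string>;

struct FooOrError; // hypothetical stand-in for TypedErrorOr<Foo>

class Foo {
public:
  // Named constructor: the only way for clients to obtain a Foo.
  static FooOrError create(int X, int Y);
  int sum() const { return X + Y; }

private:
  // May-fail constructor: reports failure through Err.
  Foo(int X, int Y, ToyError &Err) : X(X), Y(Y) {
    if (X == Y)
      Err.reset(new std::string("BadFoo: X must differ from Y"));
  }
  int X, Y;
};

struct FooOrError {
  std::unique_ptr<Foo> Value; // set on success
  ToyError Err;               // set on failure
  explicit operator bool() const { return Value != nullptr; }
};

FooOrError Foo::create(int X, int Y) {
  ToyError Err;
  std::unique_ptr<Foo> F(new Foo(X, Y, Err));
  FooOrError Result;
  if (Err)
    Result.Err = std::move(Err); // F is destroyed on return, never exposed
  else
    Result.Value = std::move(F);
  return Result;
}
```

Since the constructor is private, clients can only ever see a Foo that
create() has already checked.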

The only way to catch failure of a destructor is for the class to hold a
reference to a TypedError, and set that. This is extremely difficult to do
correctly, but as far as I know all error schemes suffer from poor
interaction with destructors. In LLVM, failing destructors are very rare, so
I don't anticipate this being a problem in general.
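
To make the "hold a reference and set it" approach concrete, here is a small
sketch. BufferedWriter, flush() and writeAndClose() are hypothetical, and a
plain std::string (empty means success) stands in for TypedError.

```cpp
#include <string>

// Hypothetical example: a writer whose destructor may fail to flush.
// Assumption: a std::string stands in for TypedError (empty = success).
class BufferedWriter {
public:
  BufferedWriter(bool FlushWillFail, std::string &Err)
      : FlushWillFail(FlushWillFail), Err(Err) {}

  ~BufferedWriter() {
    // A destructor cannot return a value, so the failure is recorded in
    // the caller's error slot, which must outlive this object.
    if (!flush())
      Err = "flush failed in ~BufferedWriter";
  }

private:
  bool flush() { return !FlushWillFail; } // simulated flush
  bool FlushWillFail;
  std::string &Err;
};

// Returns the error recorded by the destructor (empty on success).
std::string writeAndClose(bool FlushWillFail) {
  std::string Err;
  {
    BufferedWriter W(FlushWillFail, Err);
    // ... use W ...
  } // W is destroyed here while Err is still alive
  return Err;
}
```

The difficulty shows up in the lifetime requirement: the error slot has to
outlive the object, and the caller has to remember to look at it only after
the object is gone.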

> If a constructor fails and doesn't establish it's invariant, what will
> prevent the use of that invalid object?

If the style guideline above is followed, the invalid object will never be
returned to the user. Care must be taken to ensure that the destructor can
destruct the partially constructed object, but that's always the case.

> How many subclasses do you expect to make of TypedError? Less than 10?
> More than 100?

This is a support library, so it's not possible to reason about how many
external clients will want to use it in their projects, or how many errors
they would define. In LLVM I'd like to see us adopt a 'less-is-more'
approach: New error types should be introduced sparingly, and each new
error type should require a rationale for its existence. In particular,
distinct error types should only be introduced when it's reasonable for
some client to make a meaningful distinction between them. If an error is
only being returned in order to produce a string diagnostic, a generic
StringDiagnosticError should suffice.

Answering your question more directly: In the LLVM code I'm familiar with I
can see room for more than 10 error types, but fewer than 100.

> How common is it to want to handle a specific error code in a non-local
> way?  In my experience, I either want a specific error handled locally, or
> a fail / not-failed from farther away.  The answer to this question may
> influence the number of subclasses you want to make.

Errors usually get handled locally, or just produce a diagnostic and
failure. However, there are some cases where we want non-local recovery from
specific errors. The archive-walking example I gave earlier is one such
case. You're right on the point about subclasses too - that's what I was
hoping to capture with my comment above: only introduce an error type if
it's meaningful for a client to distinguish it from other errors.
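
As an illustration of a distant caller distinguishing error types without C++
RTTI, here is a sketch of class-ID based type tests. All names are
hypothetical (ToyErrorBase, ToyErrorInfo, BadArchiveMember, TruncatedMember,
OutOfMemory, walk) and are only loosely modelled on the TypedErrorInfo
hierarchy and the archive-walking case; none of it is taken from the
prototype header.

```cpp
#include <memory>
#include <string>

// Hypothetical root of the error hierarchy.
class ToyErrorBase {
public:
  virtual ~ToyErrorBase() {}
  // Each class is identified by the address of a function-local static.
  static const void *classID() {
    static char ID;
    return &ID;
  }
  virtual bool isA(const void *ClassID) const { return ClassID == classID(); }
};

// CRTP helper: gives each error type a unique ID and chains to its parent.
template <typename ThisT, typename ParentT = ToyErrorBase>
class ToyErrorInfo : public ParentT {
public:
  static const void *classID() {
    static char ID;
    return &ID;
  }
  bool isA(const void *ClassID) const override {
    return ClassID == classID() || ParentT::isA(ClassID);
  }
};

class BadArchiveMember : public ToyErrorInfo<BadArchiveMember> {};
class TruncatedMember
    : public ToyErrorInfo<TruncatedMember, BadArchiveMember> {};
class OutOfMemory : public ToyErrorInfo<OutOfMemory> {};

// A caller far from the failure point recovers only from the kind of error
// it understands and treats everything else as fatal.
std::string walk(std::unique_ptr<ToyErrorBase> Err) {
  if (!Err)
    return "ok";
  if (Err->isA(BadArchiveMember::classID()))
    return "skip member"; // recoverable; matches subclasses too
  return "fail";
}
```

Here TruncatedMember only earns its own type if some client needs to treat it
differently from a generic BadArchiveMember; otherwise the parent type would
be enough.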

> Are file, line number, and / or call stack information captured?  I've
> found file and line number information to be incredibly useful from a
> productivity standpoint.

I think that information is helpful for programmatic errors, but those are
better represented by asserts or "report_fatal_error". This system is
intended to support modelling of non-programmatic errors - bad input,
resource failures and the like. For those, the specific point in the code
where the error was triggered is less useful. If such information is
needed, this system makes it easy to break on the failure point in a
debugger.

Cheers,
Lang.

On Wed, Feb 3, 2016 at 6:15 AM, Craig, Ben via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I've had some experience dealing with rich error descriptions without
> exceptions before.  The scheme I used was somewhat similar to what you
> have.  Here are some items to consider.
>
> * How will the following code be avoided?  The answer may be compile time
> error, runtime error, style recommendations, or maybe something else.
>
> TypedError Err = foo();
> // no checking in between
> Err = foo();
>
> * How about this?
>
> TypedError Err = foo();
> functionWithHorribleSideEffects();
> if(Err) return;
>
> * Do you anticipate giving these kinds of errors to out of tree projects?
> If so, are there any kind of binary compatibility guarantee?
>
> * What about errors that should come out of constructors?  Or <shudder>
> destructors?
>
> * If a constructor fails and doesn't establish it's invariant, what will
> prevent the use of that invalid object?
>
> * How many subclasses do you expect to make of TypedError? Less than 10?
> More than 100?
>
> * How common is it to want to handle a specific error code in a non-local
> way?  In my experience, I either want a specific error handled locally, or
> a fail / not-failed from farther away.  The answer to this question may
> influence the number of subclasses you want to make.
>
> * Are file, line number, and / or call stack information captured?  I've
> found file and line number information to be incredibly useful from a
> productivity standpoint.
>
>
> On 2/2/2016 7:29 PM, Lang Hames via llvm-dev wrote:
>
> Hi All,
>
> I've been thinking lately about how to improve LLVM's error model and
> error reporting. A lack of good error reporting in Orc and MCJIT has forced
> me to spend a lot of time investigating hard-to-debug errors that could
> easily have been identified if we provided richer error information to the
> client, rather than just aborting. Kevin Enderby has made similar
> observations about the state of libObject and the difficulty of producing
> good error messages for damaged object files. I expect to encounter more
> issues like this as I continue work on the MachO side of LLD. I see
> tackling the error modeling problem as a first step towards improving error
> handling in general: if we make it easy to model errors, it may pave the
> way for better error handling in many parts of our libraries.
>
> At present in LLVM we model errors with std::error_code (and its helper,
> ErrorOr) and use diagnostic streams for error reporting. Neither of these
> seems entirely up to the job of providing a solid error-handling mechanism
> for library code. Diagnostic streams are great if all you want to do is
> report failure to the user and then terminate, but they can't be used to
> distinguish between different kinds of errors, and so are unsuited to many
> use-cases (especially error recovery). On the other hand, std::error_code
> allows error kinds to be distinguished, but suffers a number of drawbacks:
>
> 1. It carries no context: It tells you what went wrong, but not where or
> why, making it difficult to produce good diagnostics.
> 2. It's extremely easy to ignore or forget: instances can be silently
> dropped.
> 3. It's not especially debugger friendly: Most people call the error_code
> constructors directly for both success and failure values. Breakpoints have
> to be set carefully to avoid stopping when success values are constructed.
>
> In fairness to std::error_code, it has some nice properties too:
>
> 1. It's extremely lightweight.
> 2. It's explicit in the API (unlike exceptions).
> 3. It doesn't require C++ RTTI (a requirement for use in LLVM).
>
> To address these shortcomings I have prototyped a new error-handling
> scheme partially inspired by C++ exceptions. The aim was to preserve the
> performance and API visibility of std::error_code, while allowing users to
> define custom error classes and inheritance relationships between them. My
> hope is that library code could use this scheme to model errors in a
> meaningful way, allowing clients to inspect the error information and
> recover where possible, or provide a rich diagnostic when aborting.
>
> The scheme has three major "moving parts":
>
> 1. A new 'TypedError' class that can be used as a replacement for
> std::error_code. E.g.
>
> std::error_code foo();
>
> becomes
>
> TypedError foo();
>
> The TypedError class serves as a lightweight wrapper for the real error
> information (see (2)). It also contains a 'Checked' flag, initially set to
> false, that tracks whether the error has been handled or not. If a
> TypedError is ever destructed without being checked (or passed on to
> someone else) it will call std::terminate(). TypedError cannot be silently
> dropped.
>
> 2. A utility class, TypedErrorInfo, for building error class hierarchies
> rooted at 'TypedErrorInfoBase' with custom RTTI. E.g.
>
> // Define a new error type implicitly inheriting from TypedErrorInfoBase.
> class MyCustomError : public TypedErrorInfo<MyCustomError> {
> public:
>   // Custom error info.
> };
>
> // Define a subclass of MyCustomError.
> class MyCustomSubError
>     : public TypedErrorInfo<MyCustomSubError, MyCustomError> {
> public:
>   // Extends MyCustomError, adds new members.
> };
>
> 3.  A set of utility functions that use the custom RTTI system to inspect
> and handle typed errors. For example 'catchAllTypedErrors' and
> 'handleTypedError' cooperate to handle error instances in a type-safe way:
>
> TypedError foo() {
>   if (SomeFailureCondition)
>     return make_typed_error<MyCustomError>();
>   return TypedError(); // <- Indicate success.
> }
>
> TypedError Err = foo();
>
> catchAllTypedErrors(std::move(Err),
>   handleTypedError<MyCustomError>(
>     [](std::unique_ptr<MyCustomError> E) {
>       // Handle the error.
>       return TypedError(); // <- Indicate success from handler.
>     }
>   )
> );
>
>
> If your initial reaction is "Too much boilerplate!" I understand, but take
> comfort: (1) In the overwhelmingly common case of simply returning errors,
> the usage is identical to std::error_code:
>
> if (TypedError Err = foo())
>   return Err;
>
> and (2) the boilerplate for catching errors is usually easily contained in
> a handful of utility functions, and tends not to crowd the rest of your
> source code. My initial experiments with this scheme involved updating many
> source lines, but did not add much code at all beyond the new error classes
> that were introduced.
>
>
> I believe that this scheme addresses many of the shortcomings of
> std::error_code while maintaining the strengths:
>
> 1. Context - Custom error classes enable the user to attach as much
> contextual information as desired.
>
> 2. Difficult to drop - The 'checked' flag in TypedError ensures that it
> can't be dropped: it must be explicitly "handled", even if that only
> involves catching the error and doing nothing.
>
> 3. Debugger friendly - You can set a breakpoint on any custom error
> class's constructor to catch that error being created. Since the error
> class hierarchy is rooted you can break on
> TypedErrorInfoBase::TypedErrorInfoBase to catch any error being raised.
>
> 4. Lightweight - Because TypedError instances are just a pointer and a
> checked-bit, move-constructing them is very cheap. We may also want to
> consider ignoring the 'checked' bit in release mode, at which point
> TypedError should be as cheap as std::error_code.
>
> 5. Explicit - TypedError is represented explicitly in the APIs, the same
> as std::error_code.
>
> 6. Does not require C++ RTTI - The custom RTTI system does not rely on any
> standard C++ RTTI features.
>
> This scheme also has one attribute that I haven't seen in previous error
> handling systems (though my experience in this area is limited): Errors are
> not copyable, due to ownership semantics of TypedError. I think this
> actually neatly captures the idea that there is a chain of responsibility
> for dealing with any given error. Responsibility may be transferred (e.g.
> by returning it to a caller), but it cannot be duplicated as it doesn't
> generally make sense for multiple people to report or attempt to recover
> from the same error.
>
> I've tested this prototype out by threading it through the object-creation
> APIs of libObject and using custom error classes to report errors in MachO
> headers. My initial experience is that this has enabled much richer error
> messages than are possible with std::error_code.
>
> To enable interaction with APIs that still use std::error_code I have
> added a custom ECError class that wraps a std::error_code, and can be
> converted back to a std::error_code using the typedErrorToErrorCode
> function. For now, all custom error code classes should (and do, in the
> prototype) derive from this utility class. In my experiments, this has made
> it easy to thread TypedError selectively through parts of the API.
> Eventually my hope is that TypedError could replace std::error_code for
> user-facing APIs, at which point custom errors would no longer need to
> derive from ECError, and ECError could be relegated to a utility for
> interacting with other codebases that still use std::error_code.
>
> So - I look forward to hearing your thoughts. :)
>
> Cheers,
> Lang.
>
> Attached files:
>
> typed_error.patch - Adds include/llvm/Support/TypedError.h (also adds
> anchor() method to lib/Support/ErrorHandling.cpp).
>
> error_demo.tgz - Stand-alone program demo'ing basic use of the TypedError
> API.
>
> libobject_typed_error_demo.patch - Threads TypedError through the
> binary-file creation methods (createBinary, createObjectFile, etc).
> Proof-of-concept for how TypedError can be integrated into an existing
> system.
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project