[LLVMdev] Making LLVM safer in out-of-memory situations

Fri Dec 20 10:09:53 PST 2013

On 12/20/13 5:21 AM, Gasiunas, Vaidas wrote:
> Hi Philip,
>
>> If I'm reading you correctly, you are relying on exception propagation
>> and handler (destructors for local objects) execution. You have chosen
>> not to add extra exception logic to LLVM itself, but are relying on the
>> correctness of exception propagation within the code.  (The last two
>> sentances are intended to be a restatement of what your message said.
>> If I misunderstood, please correct me.)
> It was probably not completely correct to say that we did not extend
> the exception propagation in LLVM. In most of cases where malloc or
> other C allocation functions are called, we had to add a check for NULL
> and throw std::bad_alloc. But these are a kind of straightforward fixes,
> that do not require much effort.
>
>> Does this mean that you're compiling your build of LLVM with exceptions
>> enabled?  By default, I believe LLVM is built without RTTI or exception
>> support.
> OK, I see. This explains why the destructors in LLVM are not always
> prepared to be executed in exception situations. Yes, we build LLVM
> with exception support. In principle, we build it with the same options
> like the rest of our project.
This is useful to know.  Just fair warning, you're probably running a 
fairly odd configuration compared to the rest of the LLVM community.  
That might expose "interesting" bugs.  (Note: I have no evidence that 
the exceptions enabled config is actually rare, just my perception 
watching list traffic.)
> Actually, I could hardly imagine that
> we could handle OOM situations without error handling.
Agreed, but error handling does not imply exceptions.  It might be the 
easiest way, but it's not the only one.
>> For the particular cases you mentioned with auto pointers and allocation
>> in destructors, are these issues also present along existing error
>> paths?  Or for that matter simply examples of bad coding practice?  If
>> so, pushing back selected changes would be welcomed.  I'd be happy to
>> help review.
> Yes, there are some examples of bad coding practice. The root problem,
> however, is that the destructors in LLVM do a lot of complicated stuff.
> Instead of just deleting objects following a strictly hierarchical ownership
> structure, the objects are unregistered in various relationships, which
> potentially trigger unwinding in quite different locations. Such non-trivial
> coding sometimes requires dynamic allocation of new collections, which is
> problematic in OOM situations.
This is probably not going to change.

One simple hack to get around this would be to have a reserved 
allocation set which is released on OOM.  This would enable a small 
number of allocations during unwind without double-OOM.  Getting the 
allocation amount right is a bit challenging, but such schemes can be 
made to work.

> Actually, we did not manage to completely
> fix the unwinding of the compiler state. That was one of the reasons to
> move all compiler passes to a separate process.
>
> Here is a rough overview of our current set of patches to LLVM3.3. In total,
> we have 31 patches to LLVM related to OOM handing. They fix only the
> components that we could not outsource to the separate process:
> the core IR classes that we use for IR generation, IR bitcode serialization,
> and the dynamic code loader.
I'm about to break these down by how likely I believe these changes are 
to be accepted back if you wanted to push them.  Keep in mind that this 
is only my opinion and that I do not speak for the community as a whole.
>
> * In 8 of the patches we fix the malloc calls to throw bad_alloc.
This is probably not going to be accepted as is.
> * 10 patches fix destructors. Some of fixes are about disabling or
> rewriting the code triggering dynamic allocations. Some other fixes
> disable asserts that check for invariants that are not valid in exception
> situation.
These might be accepted on a case by case basis.  Depends on the actual 
change in question.
> * 5  patches deal with exceptions in constructors. One kind of problems
> results from the fact that if an exception is thrown in the constructor of
> an object, the destructor is not called which causes a leak.
Framed as exception propagation, these probably wouldn't be accepted.  
However, it sounds like a general error return pattern could address the 
same issue.  That might be accepted.
> There is also a specific problem that IR objects register themselves with
> their parents already in their constructor. And if a constructor fails
> afterwards, the parent contains a dangling pointer to its child.
Same as previous.
> * 5 patches fix temporary ownership of objects, mostly the situations
> when object are created, but not added to the owning collections.
These would probably be accepted.
> Note that 31 patches does not mean 31 fixes, because some of them
> contain all fixes to a particular file or class.
>
How willing are you to share your code?  If you're willing to put your 
changes up on github or something, we could go through them and attempt 
to migrate some of them back for inclusion in base llvm.

Philip