[LLVMdev] ARM unwinding bug

Wed Jul 29 12:14:26 PDT 2015

> From: Renato Golin <renato.golin at linaro.org>
>
> > On 29 July 2015 at 16:53, Mason Wheeler <masonwheeler at yahoo.com> wrote:
> > A couple weeks ago, Ben Pye, a developer working on the ARM32 stuff, found
> > and reported a bug related to incorrect generation of stack unwinding info.
> > ( https://llvm.org/bugs/show_bug.cgi?id=24146 ) Apparently it only occurs
> > under a highly specific set of circumstances, which might look like a minor
> > corner case, except that that just happens to be exactly the configuration
> > needed to build CoreCLR for ARM32.
> 
> Hi Mason,
> 
> I did see the bug, but I have to say that, as priorities go, that's
> way too low on my list to go past the initial look.
> 
> I compiled your test on a Cortex-A15 / Linux with Clang-3.8 (trunk)
> and GCC 4.8.2 and on both occasions, the program crashes on
> _Unwind_VRS_Pop() while stepping.
> 
> I, then, left it for the unwinding experts, which I'm not.

Well, yes, an unwinding expert *was* who I was really hoping to hear from.  But
if I understand correctly, you're saying that rather than seeing the values Ben
reported, the sample code crashes on you on both compilers?  I do notice that
you're using different versions of both compilers than he was, which may or may
not be relevant.  What hardware was this on?  Ben said he's been using a
Raspberry Pi, not sure which model.

I'll ask him to weigh in on here with additional details.  Unfortunately he's
in a different time zone and probably asleep at the moment, so it might be a
while.  In the meantime, can you provide details of the crash you're observing?
Just saying "it crashes on FooBarBaz," which is not even mentioned anywhere in
the source of the test case, doesn't provide much useful information.  At the
risk of trotting out one of the oldest cliches in the book, this works (or at
least breaks as described) on our end! :P

> >  This is a blocking problem for us, and I
> > was just wondering how things are coming on it?  Has anyone been looking
> > into this, who might be able to provide some sort of estimate as to what's
> > going on and when we might expect to see a fix?
> 
> I think you're looking at it from the wrong angle...
> 
> I seriously doubt that anyone besides your group will care much about
> CLR on Android.

Just off the top of my head, everyone using Xamarin and a significant fraction
of the Unity3D community will care about this.  Plus a non-trivial percentage
of approximately 20 million existing .NET devs, the ones who would like to
produce mobile apps if the existing "solutions" weren't either horribly expensive
or barely workable and low-quality.  I can definitely say this is not something
that no one will care about!

> So, unless this bug affects other people, not much will progress without
> your help.

Very well. What help do you need? I'd be happy to provide any assistance
necessary.

> Posting a bug on bugzilla is the first step. Having a source file,
> some command lines to try out and some expected results is the second.
> Adding the right people (Anton, me, Logan) is the third step.
> Excellent. But it doesn't stop there.
> 
> You have to continue to investigate, step through, disassemble, check
> the unwind sources, give us some ideas. The closer you get to a
> recognisably obvious problem, the easier it will be to get someone
> else's attention.
>
> We all got an email from that bug (because you CC us), and we probably
> all looked at it and thought: "hum, that's weird. Let me try". Because
> that's so far off our current priorities, when it failed at the very
> first step, you think, "I'll wait until the OP provides more info".

Yeah, I wondered if it might not be something like that.  We were all waiting
to hear back from you, because from our perspective we've provided a complete,
reproducible test case, and if anything more is needed, we'd expect
someone to respond to the Bugzilla post asking about it.  Good thing I
checked, then! :)

> This would go a long way to motivate people to reply, even with a simple
> "have you tried this?". As it stands, I don't even know what to say,
> honestly.

What I would do, in your position, is post a response to the Bugzilla
entry detailing the problems I ran into, so that the original author would
see it and be able to improve the bug report.

> For example, I also found a bug in the unwinder:
>
> https://llvm.org/bugs/show_bug.cgi?id=24273
> 
> I did more or less what you did, put some info on how to reproduce.
> But I'm not expecting anyone to look at it. That's for me to keep
> there, in case I have some more time to investigate it. If either
> Logan or Anton had a look, that'd be awesome! But I don't expect them
> to. If I want that one to progress, I'll have to do a lot more than
> what I did.

Forgive my saying so, but that seems like a bizarre attitude, as you've
already admitted that you don't know much about unwinding.  Isn't that the
entire point of specialization?  Having people who have their own area that
they know well and become expert in, all working together, is what lifted
mankind out of the subsistence agrarian lifestyle and enabled us to build
our way up to the modern age.

If something went wrong with my car, I certainly wouldn't attempt to diagnose
it myself and figure out the root cause so the guys at the dealership would
have an easier job of it, because tracking down the problem when I lack the
training, experience and tools to do so would waste a great deal of my time
on something that the dealership's experts could presumably figure out far
more quickly and accurately than I could.  Everyone can agree that this is
perfectly reasonable when talking about cars, so why, when it comes to code,
does the exact opposite philosophy tend to appear?

Mason