[lldb-dev] More linux process control and IOHandler races
Matthew Gardiner
mg11 at csr.com
Wed Oct 1 22:12:32 PDT 2014
Yeah, cool. Getting these prompt issues sorted will put a smile on a lot
of faces :-)
On Wed, 2014-10-01 at 07:52 -0700, Todd Fiala wrote:
> Thanks, Matthew.
>
>
> I'll have a look at Shawn's changes and get them in after reviewing.
>
> On Tue, Sep 30, 2014 at 11:43 PM, Matthew Gardiner <mg11 at csr.com>
> wrote:
> Hi Shawn,
>
> It would be great for you to fix this and all the other lldb
> prompt
> bugs! Yes, lldb's rotten prompt it bothers me as much as you.
>
> My main problem with this issue, is that I am consuming all my
> time with
> trying to get CSRs kalimba chips debuggable by lldb.
> Particularly
> troublesome being those variants with non-8-bit bytes... I
> wish I could
> spend more this bug, but at the moment it is a peripheral
> concern for
> me.
>
> I have no pride associated with any previous fixes/bodges
> (sleeps/m_first_stop etc.) that I've done to "address" this
> issue. So
> please feel free to solve this exactly as you feel fit.
>
> I'm guessing that Todd is in a much closer time-zone to you,
> so it may
> be better if he applies any of your patches for you. Sorry
> Todd!
>
> thanks for the update
> Matt
>
> On Mon, 2014-09-29 at 15:36 -0700, Shawn Best wrote:
> > Hi All,
> >
> >
> > I have spent some time digging into this again and
> understand it a
> > little better now.
> >
> >
> > Matt, the reason your patch hangs on TestHello world, is
> because the
> > test script is launching the inferior program first then has
> lldb do
> > an attach. This is a different code path than a regular
> launch. The
> > one-shot you put in there for 'm_first_stop' to prevent it
> from
> > broadcasting the first stop, causes it to never get
> restarted.
> >
> >
> > On OSX, the application is launched first and then lldb
> attaches to
> > it. There is also a private stop message generated. The
> reason it is
> > not broadcast, is because of a special class member
> > 'm_resume_requested' and some special machinery ( class
> > AttachCompletionHandler : public NextEventAction ) that is
> used in the
> > case of launching to set it.
> >
> >
> > I could create a similar class "LaunchCompletionHandler" and
> get that
> > working to prevent the stop message from being broadcast.
> >
> >
> > The simplest solution is still my original 2 line fix.
> >
> >
> > I would like to get this fixed as it bothers me every time I
> see the
> > extra spew when debugging. There were also some unit tests
> failing on
> > linux that this patch fixed.
> >
> >
> > Shawn.
> >
> >
> >
> >
> > On Mon, Aug 18, 2014 at 3:08 AM, Matthew Gardiner
> <mg11 at csr.com>
> > wrote:
> > Hi Shawn,
> >
> > Have you spent anymore time looking at this prompt
> issue
> > (where you see the (lldb) prompt even when you use
> "process
> > launch")? I tried fixing it with the attached patch.
> This
> > fixes the issue fine at the lldb command line, but
> results in
> > the python TestHelloWorld test hanging the call to
> > Process::WaitForProcessToStop. I wondered if you had
> spent
> > anymore time on this lately and knew of a
> complete/better
> > fix...
> >
> > thanks
> > Matt
> >
> >
> > Greg Clayton wrote:
> > Jim and I have discussed the right thing to
> do which
> > we plan to implement as soon as we get the
> chance.
> >
> > Current we have two process states: public
> and
> > private. The private state is always up to
> date and is
> > used to track what the process is currently
> doing. The
> > public state gets set when we believe the
> end user
> > should see an event that will update the GUI
> or TTY
> > with the current state if needed (stopped or
> exited).
> > There is a lot of tricky code that uses the
> private
> > process state to manage things like the
> thread plans.
> > Things like source level single step might
> involve
> > starting and stopping and hitting
> breakpoints and
> > single instruction stepping, but the user
> only wants
> > to see the final eStateStopped when the
> source level
> > step is done. Expressions are also tricky,
> the public
> > state of the process is stopped, but when we
> run an
> > expression, we might resume the process many
> times and
> > the user will always see that the public
> state if
> > stopped. Internally though the thread plans
> are doing
> > a whole bunch of stuff to the process and
> the user
> > never hears anything about these (no events
> go
> > public).
> >
> > So the current public and private stuff uses
> > broadcaster hijacking and all sorts of other
> tricks to
> > avoid letting the end user know about the
> starts and
> > stops that they shouldn't know about. We
> eventually
> > want to have a new class, lets call in
> > lldb_private::ProcessState for now, that
> manages the
> > current process state and always receives
> the process
> > events. There would be no more public and
> private
> > events. We would have one version of the
> ProcessState
> > subclass that would cause the GUI/TTY to
> update
> > (replacing the public events) and one
> ProcessState
> > subclass that would manage running thread
> plans. There
> > would be stack of these and this will help
> us avoid
> > the whole public/private/hijacking stuff we
> have now.
> >
> > Launching a process would end up pushing a
> new
> > ProcessState class that could handle all the
> necessary
> > starts and stop we need to get the process
> to a
> > quiescent state that is presentable to the
> user. The
> > ProcessState subclass could eat all the
> events it
> > needs to while launching and then propagate
> the
> > stopped or running event when it gets popped
> off the
> > stack.
> >
> > So we should make some simple fixes for now
> to work
> > around the issue and get things working, but
> it would
> > also be great to get the new system up and
> running so
> > we can make this easier in the future for
> people that
> > want to make changes.
> >
> > Greg
> >
> >
> > On Aug 8, 2014, at 1:08 AM, Matthew
> Gardiner
> > <mg11 at csr.com> wrote:
> >
> > Folks,
> >
> > Regarding "launching using the
> shell", I
> > don't think applies in the buggy
> case that
> > myself and Shawn are looking at.
> >
> > I do:
> >
> > (lldb) target create ./test
> > ...
> > (lldb) process launch
> > ...
> >
> > and when I inspected the call to
> exec, I see
> > that my exe name is the program
> passed.
> >
> > Thanks for the insight into the
> broadcasting
> > of stop events. That explains why I
> see in the
> > ShouldBroadcastEvent, the ShouldStop
> and
> > ShouldReportStop calls.
> >
> > However, it would be nice to know if
> the first
> > stop event should be broadcast for
> "process
> > launch". I think it's an
> implementation
> > detail, and therefore should not.
> That would
> > help to fix this issue.
> >
> > However, Shawn's original suggestion
> to fix
> > this issue circumvents the above
> debate, by
> > replacing the call of
> HandlePrivateEvent with
> > SetPublicState. So which fix is
> best? Calling
> > SetPublicState rather than
> HandlePrivateEvent
> > is certainly more expedient, and
> avoids this
> > debate... but is it more
> > portable/future-proof?
> >
> > Matt
> >
> >
> > jingham at apple.com wrote:
> > If you are launching using
> the shell,
> > you'll see more stops before
> you get
> > to the executable you are
> actually
> > trying to launch. In that
> case,
> > instead of just running the
> binary
> > directly you effectively do:
> >
> > /bin/bash exec <binary> arg1
> arg2
> >
> > so that you can get bash (or
> whatever
> > is set in $SHELL) to do the
> argument
> > expansion for you.
> > GetResumeCountForLaunchInfo
> calculates
> > this, then it is stuffed
> into the
> > ProcessLaunchInfo
> (SetResumeCount).
> > On Mac OS X we always let
> the Platform
> > launch, then attach, so in
> that case
> > the AttachCompletionHandler
> does the
> > extra resumes. I'm not all
> that
> > familiar with how the Linux
> side work,
> > but it also seems to use the
> > ProcessLaunchInfo's resume
> count.
> >
> > Note that in general in lldb
> not all
> > publicly broadcast stop
> messages are
> > going to result in a stop.
> For
> > instance, all the breakpoint
> command &
> > condition handling goes on
> as a result
> > of the broadcast of the
> public stop
> > event, but the process might
> just turn
> > around an continue based on
> that.
> > Whether to suppress the
> broadcast and
> > continue from the private
> state thread
> > or broadcast the event and
> let the
> > upper levels of lldb take
> care of what
> > happens from there on really
> depends
> > on where it makes sense to
> handle the
> > stop. So for the case of
> breakpoint
> > stops (or stop hooks,
> another
> > example), those end up being
> > equivalent to user typed
> commands,
> > just done for the user
> automatically
> > by the system. So having
> them happen
> > in a world where the public
> state
> > wasn't sync'ed up to the
> private state
> > ended up being very awkward.
> >
> > I haven't looked at the
> launching code
> > in detail recently so I am
> not sure
> > whether it makes sense for
> the first
> > stop to be handled as it is.
> >
> > Jim
> >
> >
> > On Aug 7, 2014, at
> 6:29 AM,
> > Matthew Gardiner
> > <mg11 at csr.com>
> wrote:
> >
> > Hi Shawn,
> >
> > I spent some time
> today
> > looking at how to
> arrange for
> > ShouldBroadcast to
> return
> > false for this first
> stop. I
> > managed to produce a
> quick
> > hack for this (i.e.
> just
> > counted the number
> of stops),
> > but due to other
> distractions
> > (from the rest of my
> job) I
> > didn't get that far
> into
> > discovering a nice
> way of
> > achieving this...
> >
> > What I did discover
> is that
> > with my build just
> doing
> > "process launch"
> results in 3
> > stops (and 3 private
> resumes).
> > That in itself I
> find
> > surprising, since I
> was under
> > the impression that
> I should
> > see the inferior
> stop just
> > once when exec is
> trapped by
> > PTRACE_ME.
> >
> > I then discovered
> that for
> > each of these 3
> stops
> > ShouldBroadcast
> calls
> > Thread::ShouldStop,
> the
> > Thread::ShouldStop
> returns
> > true for the first
> stop and
> > false for the other
> 2. Looking
> > into these behaviour
> > differences I then
> found that
> > from within
> Thread::ShouldStop
> > we then call into
> the
> > following:
> >
> > StopInfoSP
> > private_stop_info
> >
> (GetPrivateStopInfo());
> > if
> (private_stop_info &&
> >
> private_stop_info->ShouldStopSynchronous(event_ptr) == false)
> > {
> >
> > and also
> >
> > bool over_ride_stop
> =
> >
> current_plan->ShouldAutoContinue(event_ptr);
> >
> > the results from
> either of
> > these, it seems
> providing the
> > reasoning behind the
> different
> > true/false returns.
> I'll
> > return to spend a
> bit more
> > time on this
> tomorrow. Let me
> > know if you get any
> further on
> > a similar vein!
> >
> > thanks
> > Matt
> >
> >
> >
> > Shawn Best wrote:
> > Matt,
> >
> > I think you
> are
> > probably
> right,
> > although
> there are
> > other places
> where it
> > directly
> calls
> >
> SetPublicState(). I
> > was
> wondering about
> > the
> possibility there
> > could be
> other
> > listeners
> waiting for
> > a broadcast
> public
> > Stop event.
> Is that a
> > possibility?
> >
> > Some others
> here were
> >
> investigating some
> > unit tests
> that were
> > failing
> intermittently
> > (StopHook).
> Their
> > description
> of the
> > problem
> sounds
> > unrelated to
> the
> > launch code,
> but this
> > patch also
> magically
> > fixes that.
> >
> > Shawn.
> >
> > On 8/6/2014
> 6:26 AM,
> > Matthew
> Gardiner
> > wrote:
> >
> Shawn,
> >
> > Like
> I said
> >
> earlier your
> >
> patch worked.
> >
> However I
> >
> think the
> >
> right fix is
> > to
> arrange
> > that
> >
> ShouldBroadcast returns false for this first stop. I believe
> this, because firstly no stops should be reported here since
> the user is only interested in launching a program, and
> additionally because it enables us to fix lldb without
> removing the call to HandlePrivateEvent. This, I think, is
> important to preserve as the central point for process state
> change handling.
> >
> > Matt
> >
> >
> >
> >
> Shawn Best
> >
> wrote:
> >
> Hi
> >
> Matthew,
> >
> >
> I have
> >
> also
> >
> been
> >
> tracking this bug. I believe there are other bugs in the
> unit tests failing indirectly because of this. I also have a
> patch that will fix it, but was sitting on it until the other
> one landed. These bugs do not show up on OSX since the
> inferiors are launched separately then attached to.
> >
> >
> The
> >
> first
> >
> odd
> >
> thing
> >
> the
> >
> launching code does is push an IOHandler when it sees the
> state transition to 'launching'. This is odd because I believe
> the launching program will always come up in a stopped state
> which will immediately pop the IOHandler.
> >
> >
> At
> >
> launch, the process comes up in the stopped state. The
> launch code manually calls HandlePrivateEvent() with the stop
> event, which then broadcasts the Event. When
> HandleProcessEvent gets the public stop, it dumps out the
> current thread state just as if an executing inferior hit a
> breakpoint and stopped.
> >
> >
> One
> >
> way to
> >
> fix
> >
> this
> >
> would
> >
> be:
> >
> >
> 1.
> >
> Don't
> >
> push
> >
> io
> >
> handler when state is 'launching'
> >
> 2.
> >
> Instead of manually calling HandlePrivateEvent, call
> SetPublicState().
> >
> >
> Alternately, we could try and debug why ShouldBroadcast()
> returns true, but that appears to be by design since it is
> expecting the public stop event to pop the IOHandler that had
> been pushed when launching.
> >
> >
> I have
> >
> attached a patch demonstrating this. In conjunction with
> the other patch for IOHandler race condition, it will fix a
> bunch of this kind of behaviour.
> >
> >
> Shawn.
> >
> >
> On
> >
> 8/5/2014 6:59 AM, Matthew Gardiner wrote:
> >
> Jim,
> >
> >
> I've been trying to debug an issue (I see it on
> 64-bit linux) where, I do "target create" and "process launch"
> and despite not requesting *stop at entry*, the first stop
> (which I believe is just the initial ptrace attach stop) is
> reported to the lldb command line. I added some fprintf to
> Process::HandlePrivateEvent, which counts the number of
> eStoppedState events seen and whether ShouldBroadcastEvent
> returns true for this event. Here's the output from my program
> with diagnostic:
> >
> >
> (lldb) target create ~/me/i64-hello.elf
> >
> Current executable set to
> '~/me/i64-hello.elf' (x86_64).
> >
> (lldb) process launch
> >
> MG Process::HandlePrivateEvent launching
> stopped_count 0 should_broadcast 1
> >
> Process 31393 launching
> >
> MG Process::HandlePrivateEvent stopped
> stopped_count 1 should_broadcast 1
> >
> MG Process::HandlePrivateEvent running
> stopped_count 1 should_broadcast 1
> >
> Process 31393 launched: 'i64-hello.elf' (x86_64)
> >
> Process 31393 stopped
> >
> * thread #1: tid = 31393, 0x0000003675a011f0,
> name = 'i64-hello.elf', stop reason = trace
> >
> >
> frame #0: 0x0000003675a011f0
> >
> -> 0x3675a011f0: movq %rsp, %rdi
> >
> 0x3675a011f3: callq 0x3675a046e0
> >
> 0x3675a011f8: movq %rax, %r12
> >
> 0x3675a011fb: movl 0x21eb97(%rip), %eax
> >
> (lldb) MG Process::HandlePrivateEvent stopped
> stopped_count 2 should_broadcast 0
> >
> MG Process::HandlePrivateEvent running
> stopped_count 2 should_broadcast 0
> >
> MG Process::HandlePrivateEvent stopped
> stopped_count 3 should_broadcast 0
> >
> MG Process::HandlePrivateEvent running
> stopped_count 3 should_broadcast 0
> >
> >
> In summary, lldb reports the inferior to be
> stopped (even though /proc/pid/status and lldb "target list"
> say it is running). Clearly this is wrong (hence my earlier
> post).
> >
> >
> Am I correct in assuming that when
> ShouldBroadcastEvent returns true this means that lldb should
> show this event to the debug user? (And thus hide other events
> where ShouldBroadcastEvent=false).
> >
> >
> What puzzled me was why ShouldBroadcastEvent
> return true for this very first stop. Is this a bug?
> >
> >
> I also spent sometime at ShouldBroadcastEvent and
> saw that this:
> >
> >
> >
> case eStateStopped:
> >
> >
> case eStateCrashed:
> >
> >
> case eStateSuspended:
> >
> >
> {
> >
> ....
> >
> >
> if (was_restarted || should_resume ||
> m_resume_requested)
> >
> >
> {
> >
> >
> evaluates as false, and hence the PrivateResume
> code is not called... does this seem buggy to you for this
> very first stop?
> >
> >
> I thought I'd try asking you, since in a previous
> mail from Greg, he cited you as being a thread-plan expert.
> (Hope that's ok!). I'd really appreciate your help in
> clarifying the above questions for me, and if you have time,
> giving me some ideas as to how to trace this one further e.g.
> how m_thread_list.ShouldStop and
> m_thread_list.ShouldReportStop should behave, etc.
> >
> >
> thanks for your help
> >
> Matt
> >
> >
> Matthew Gardiner wrote:
> >
> Folks,
> >
> >
> In addition to the overlapping prompt
> race Shawn Best and myself are looking at, I'm seeing another
> issue where if I launch a process, I get a stop (presumably
> the in) being reported to the UI.
> >
> >
> (lldb) target create
> ~/mydir/i64-hello.elf
> >
> Current executable set to
> '~/mydir/i64-hello.elf' (x86_64).
> >
> (lldb) process launch
> >
> Process 27238 launching
> >
> Process 27238 launched:
> '64-hello.elf' (x86_64)
> >
> Process 27238 stopped
> >
> * thread #1: tid = 27238,
> 0x0000003675a011f0, name = 'i64-hello.elf'
> >
> >
> frame #0:
> >
> (lldb) target list
> >
> Current targets:
> >
> * target #0: i64-hello.elf
> ( arch=x86_64-unknown-linux, platform=host, pid=27238,
> state=running )
> >
> (lldb)
> >
> >
> As you can see the "target list" reflects
> that the process is running. Which I confirmed by looking
> at /proc/27238/status.
> >
> >
> Anyone else seeing this?
> >
> >
> thanks
> >
> Matt
> >
> >
> >
> >
> Member of the CSR plc group of companies.
> CSR plc registered in England and Wales, registered number
> 4187346, registered office Churchill House, Cambridge Business
> Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
> >
> More information can be found at
> www.csr.com <http://www.csr.com>. Keep up to date with CSR on
> our technical blog, www.csr.com/blog
> <http://www.csr.com/blog>, CSR people blog, www.csr.com/people
> <http://www.csr.com/people>, YouTube,
> www.youtube.com/user/CSRplc
> <http://www.youtube.com/user/CSRplc>, Facebook,
> www.facebook.com/pages/CSR/191038434253534
> <http://www.facebook.com/pages/CSR/191038434253534>, or follow
> us on Twitter at www.twitter.com/CSR_plc
> <http://www.twitter.com/CSR_plc>.
> >
> New for 2014, you can now access the wide
> range of products powered by aptX at www.aptx.com
> <http://www.aptx.com>.
> >
>
> _______________________________________________
> >
> lldb-dev mailing list
> >
> lldb-dev at cs.uiuc.edu
> <mailto:lldb-dev at cs.uiuc.edu>
> >
>
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> To report this email as spam click
> https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj!
> GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw== <https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj%21GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==> .
> >
> _______________________________________________
> >
> lldb-dev mailing list
> >
> lldb-dev at cs.uiuc.edu
> <mailto:lldb-dev at cs.uiuc.edu>
> >
>
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> _______________________________________________
> > lldb-dev mailing list
> > lldb-dev at cs.uiuc.edu
> >
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> >
>
>
>
>
>
>
>
> --
> Todd Fiala |
> Software
> Engineer |
> tfiala at google.com
>
>
>
>
More information about the lldb-dev
mailing list