[lldb-dev] More linux process control and IOHandler races

Todd Fiala tfiala at google.com
Wed Oct 1 07:52:47 PDT 2014


Thanks, Matthew.

I'll have a look at Shawn's changes and get them in after reviewing.

On Tue, Sep 30, 2014 at 11:43 PM, Matthew Gardiner <mg11 at csr.com> wrote:

> Hi Shawn,
>
> It would be great for you to fix this and all the other lldb prompt
> bugs! Yes, lldb's rotten prompt it bothers me as much as you.
>
> My main problem with this issue, is that I am consuming all my time with
> trying to get CSRs kalimba chips debuggable by lldb. Particularly
> troublesome being those variants with non-8-bit bytes... I wish I could
> spend more this bug, but at the moment it is a peripheral concern for
> me.
>
> I have no pride associated with any previous fixes/bodges
> (sleeps/m_first_stop etc.) that I've done to "address" this issue. So
> please feel free to solve this exactly as you feel fit.
>
> I'm guessing that Todd is in a much closer time-zone to you, so it may
> be better if he applies any of your patches for you. Sorry Todd!
>
> thanks for the update
> Matt
>
> On Mon, 2014-09-29 at 15:36 -0700, Shawn Best wrote:
> > Hi All,
> >
> >
> > I have spent some time digging into this again and understand it a
> > little better now.
> >
> >
> > Matt, the reason your patch hangs on TestHello world, is because the
> > test script is launching the inferior program first then has lldb do
> > an attach.  This is a different code path than a regular launch. The
> > one-shot you put in there for 'm_first_stop' to prevent it from
> > broadcasting the first stop, causes it to never get restarted.
> >
> >
> > On OSX, the application is launched first and then lldb attaches to
> > it.  There is also a private stop message generated.  The reason it is
> > not broadcast, is because of a special class member
> > 'm_resume_requested' and some special machinery ( class
> > AttachCompletionHandler : public NextEventAction ) that is used in the
> > case of launching to set it.
> >
> >
> > I could create a similar class "LaunchCompletionHandler" and get that
> > working to prevent the stop message from being broadcast.
> >
> >
> > The simplest solution is still my original 2 line fix.
> >
> >
> > I would like to get this fixed as it bothers me every time I see the
> > extra spew when debugging.  There were also some unit tests failing on
> > linux that this patch fixed.
> >
> >
> > Shawn.
> >
> >
> >
> >
> > On Mon, Aug 18, 2014 at 3:08 AM, Matthew Gardiner <mg11 at csr.com>
> > wrote:
> >         Hi Shawn,
> >
> >         Have you spent anymore time looking at this prompt issue
> >         (where you see the (lldb) prompt even when you use "process
> >         launch")? I tried fixing it with the attached patch. This
> >         fixes the issue fine at the lldb command line, but results in
> >         the python TestHelloWorld test hanging the call to
> >         Process::WaitForProcessToStop. I wondered if you had spent
> >         anymore time on this lately and knew of a complete/better
> >         fix...
> >
> >         thanks
> >         Matt
> >
> >
> >         Greg Clayton wrote:
> >                 Jim and I have discussed the right thing to do which
> >                 we plan to implement as soon as we get the chance.
> >
> >                 Current we have two process states: public and
> >                 private. The private state is always up to date and is
> >                 used to track what the process is currently doing. The
> >                 public state gets set when we believe the end user
> >                 should see an event that will update the GUI or TTY
> >                 with the current state if needed (stopped or exited).
> >                 There is a lot of tricky code that uses the private
> >                 process state to manage things like the thread plans.
> >                 Things like source level single step might involve
> >                 starting and stopping and hitting breakpoints and
> >                 single instruction stepping, but the user only wants
> >                 to see the final eStateStopped when the source level
> >                 step is done. Expressions are also tricky, the public
> >                 state of the process is stopped, but when we run an
> >                 expression, we might resume the process many times and
> >                 the user will always see that the public state if
> >                 stopped. Internally though the thread plans are doing
> >                 a whole bunch of stuff to the process and the user
> >                 never hears anything about these (no events go
> >                 public).
> >
> >                 So the current public and private stuff uses
> >                 broadcaster hijacking and all sorts of other tricks to
> >                 avoid letting the end user know about the starts and
> >                 stops that they shouldn't know about. We eventually
> >                 want to have a new class, lets call in
> >                 lldb_private::ProcessState for now, that manages the
> >                 current process state and always receives the process
> >                 events. There would be no more public and private
> >                 events. We would have one version of the ProcessState
> >                 subclass that would cause the GUI/TTY to update
> >                 (replacing the public events) and one ProcessState
> >                 subclass that would manage running thread plans. There
> >                 would be stack of these and this will help us avoid
> >                 the whole public/private/hijacking stuff we have now.
> >
> >                 Launching a process would end up pushing a new
> >                 ProcessState class that could handle all the necessary
> >                 starts and stop we need to get the process to a
> >                 quiescent state that is presentable to the user. The
> >                 ProcessState subclass could eat all the events it
> >                 needs to while launching and then propagate the
> >                 stopped or running event when it gets popped off the
> >                 stack.
> >
> >                 So we should make some simple fixes for now to work
> >                 around the issue and get things working, but it would
> >                 also be great to get the new system up and running so
> >                 we can make this easier in the future for people that
> >                 want to make changes.
> >
> >                 Greg
> >
> >
> >                         On Aug 8, 2014, at 1:08 AM, Matthew Gardiner
> >                         <mg11 at csr.com> wrote:
> >
> >                         Folks,
> >
> >                         Regarding "launching using the shell",  I
> >                         don't think applies in the buggy case that
> >                         myself and Shawn are looking at.
> >
> >                         I do:
> >
> >                         (lldb) target create ./test
> >                         ...
> >                         (lldb) process launch
> >                         ...
> >
> >                         and when I inspected the call to exec, I see
> >                         that my exe name is the program passed.
> >
> >                         Thanks for the insight into the broadcasting
> >                         of stop events. That explains why I see in the
> >                         ShouldBroadcastEvent, the ShouldStop and
> >                         ShouldReportStop calls.
> >
> >                         However, it would be nice to know if the first
> >                         stop event should be broadcast for "process
> >                         launch".  I think it's an implementation
> >                         detail, and therefore should not. That would
> >                         help to fix this issue.
> >
> >                         However, Shawn's original suggestion to fix
> >                         this issue circumvents the above debate, by
> >                         replacing the call of HandlePrivateEvent with
> >                         SetPublicState. So which fix is best? Calling
> >                         SetPublicState rather than HandlePrivateEvent
> >                         is certainly more expedient, and avoids this
> >                         debate... but is it more
> >                         portable/future-proof?
> >
> >                         Matt
> >
> >
> >                         jingham at apple.com wrote:
> >                                 If you are launching using the shell,
> >                                 you'll see more stops before you get
> >                                 to the executable you are actually
> >                                 trying to launch.  In that case,
> >                                 instead of just running the binary
> >                                 directly you effectively do:
> >
> >                                 /bin/bash exec <binary> arg1 arg2
> >
> >                                 so that you can get bash (or whatever
> >                                 is set in $SHELL) to do the argument
> >                                 expansion for you.
> >                                 GetResumeCountForLaunchInfo calculates
> >                                 this, then it is stuffed into the
> >                                 ProcessLaunchInfo (SetResumeCount).
> >                                 On Mac OS X we always let the Platform
> >                                 launch, then attach, so in that case
> >                                 the AttachCompletionHandler does the
> >                                 extra resumes.  I'm not all that
> >                                 familiar with how the Linux side work,
> >                                 but it also seems to use the
> >                                 ProcessLaunchInfo's resume count.
> >
> >                                 Note that in general in lldb not all
> >                                 publicly broadcast stop messages are
> >                                 going to result in a stop.  For
> >                                 instance, all the breakpoint command &
> >                                 condition handling goes on as a result
> >                                 of the broadcast of the public stop
> >                                 event, but the process might just turn
> >                                 around an continue based on that.
> >                                 Whether to suppress the broadcast and
> >                                 continue from the private state thread
> >                                 or broadcast the event and let the
> >                                 upper levels of lldb take care of what
> >                                 happens from there on really depends
> >                                 on where it makes sense to handle the
> >                                 stop.  So for the case of breakpoint
> >                                 stops (or stop hooks, another
> >                                 example), those end up being
> >                                 equivalent to user typed commands,
> >                                 just done for the user automatically
> >                                 by the system.  So having them happen
> >                                 in a world where the public state
> >                                 wasn't sync'ed up to the private state
> >                                 ended up being very awkward.
> >
> >                                 I haven't looked at the launching code
> >                                 in detail recently so I am not sure
> >                                 whether it makes sense for the first
> >                                 stop to be handled as it is.
> >
> >                                 Jim
> >
> >
> >                                         On Aug 7, 2014, at 6:29 AM,
> >                                         Matthew Gardiner
> >                                         <mg11 at csr.com> wrote:
> >
> >                                         Hi Shawn,
> >
> >                                         I spent some time today
> >                                         looking at how to arrange for
> >                                         ShouldBroadcast to return
> >                                         false for this first stop. I
> >                                         managed to produce a quick
> >                                         hack for this (i.e. just
> >                                         counted the number of stops),
> >                                         but due to other distractions
> >                                         (from the rest of my job) I
> >                                         didn't get that far into
> >                                         discovering a nice way of
> >                                         achieving this...
> >
> >                                         What I did discover is that
> >                                         with my build just doing
> >                                         "process launch" results in 3
> >                                         stops (and 3 private resumes).
> >                                         That in itself I find
> >                                         surprising, since I was under
> >                                         the impression that I should
> >                                         see the inferior stop just
> >                                         once when exec is trapped by
> >                                         PTRACE_ME.
> >
> >                                         I then discovered that for
> >                                         each of these 3 stops
> >                                         ShouldBroadcast calls
> >                                         Thread::ShouldStop, the
> >                                         Thread::ShouldStop returns
> >                                         true for the first stop and
> >                                         false for the other 2. Looking
> >                                         into these behaviour
> >                                         differences I then found that
> >                                         from within Thread::ShouldStop
> >                                         we then call into the
> >                                         following:
> >
> >                                             StopInfoSP
> >                                         private_stop_info
> >                                         (GetPrivateStopInfo());
> >                                             if (private_stop_info &&
> >
>  private_stop_info->ShouldStopSynchronous(event_ptr) == false)
> >                                             {
> >
> >                                         and also
> >
> >                                         bool over_ride_stop =
> >
>  current_plan->ShouldAutoContinue(event_ptr);
> >
> >                                         the results from either of
> >                                         these, it seems providing the
> >                                         reasoning behind the different
> >                                         true/false returns. I'll
> >                                         return to spend a bit more
> >                                         time on this tomorrow. Let me
> >                                         know if you get any further on
> >                                         a similar vein!
> >
> >                                         thanks
> >                                         Matt
> >
> >
> >
> >                                         Shawn Best wrote:
> >                                                 Matt,
> >
> >                                                 I think you are
> >                                                 probably right,
> >                                                 although there are
> >                                                 other places where it
> >                                                 directly calls
> >                                                 SetPublicState().   I
> >                                                 was wondering about
> >                                                 the possibility there
> >                                                 could be other
> >                                                 listeners waiting for
> >                                                 a broadcast public
> >                                                 Stop event.  Is that a
> >                                                 possibility?
> >
> >                                                 Some others here were
> >                                                 investigating some
> >                                                 unit tests that were
> >                                                 failing intermittently
> >                                                 (StopHook).  Their
> >                                                 description of the
> >                                                 problem sounds
> >                                                 unrelated to the
> >                                                 launch code, but this
> >                                                 patch also magically
> >                                                 fixes that.
> >
> >                                                 Shawn.
> >
> >                                                 On 8/6/2014 6:26 AM,
> >                                                 Matthew Gardiner
> >                                                 wrote:
> >                                                         Shawn,
> >
> >                                                         Like I said
> >                                                         earlier your
> >                                                         patch worked.
> >                                                         However I
> >                                                         think the
> >                                                         right fix is
> >                                                         to arrange
> >                                                         that
> >                                                         ShouldBroadcast
> returns false for this first stop. I believe this, because firstly no stops
> should be reported here since the user is only interested in launching a
> program, and additionally because it enables us to fix lldb without
> removing the call to HandlePrivateEvent. This, I think, is important to
> preserve as the central point for process state change handling.
> >
> >                                                         Matt
> >
> >
> >
> >                                                         Shawn Best
> >                                                         wrote:
> >                                                                 Hi
> >                                                                 Matthew,
> >
> >                                                                 I have
> >                                                                 also
> >                                                                 been
> >                                                                 tracking
> this bug.  I believe there are other bugs in the unit tests failing
> indirectly because of this.  I also have a patch that will fix it, but was
> sitting on it until the other one landed.  These bugs do not show up on OSX
> since the inferiors are launched separately then attached to.
> >
> >                                                                 The
> >                                                                 first
> >                                                                 odd
> >                                                                 thing
> >                                                                 the
> >
>  launching code does is push an IOHandler when it sees the state transition
> to 'launching'. This is odd because I believe the launching program will
> always come up in a stopped state which will immediately pop the IOHandler.
> >
> >                                                                 At
> >                                                                 launch,
> the process comes up in the stopped state.  The launch code manually calls
> HandlePrivateEvent() with the stop event, which then broadcasts the Event.
> When HandleProcessEvent gets the public stop, it dumps out the current
> thread state just as if an executing inferior hit a breakpoint and stopped.
> >
> >                                                                 One
> >                                                                 way to
> >                                                                 fix
> >                                                                 this
> >                                                                 would
> >                                                                 be:
> >
> >                                                                 1.
> >                                                                 Don't
> >                                                                 push
> >                                                                 io
> >                                                                 handler
> when state is 'launching'
> >                                                                 2.
> >                                                                 Instead
> of manually calling HandlePrivateEvent, call SetPublicState().
> >
> >
>  Alternately, we could try and debug why ShouldBroadcast() returns true,
> but that appears to be by design since it is expecting the public stop
> event to pop the IOHandler that had been pushed when launching.
> >
> >                                                                 I have
> >                                                                 attached
> a patch demonstrating this.  In conjunction with the other patch for
> IOHandler race condition, it will fix a bunch of this kind of behaviour.
> >
> >                                                                 Shawn.
> >
> >                                                                 On
> >                                                                 8/5/2014
> 6:59 AM, Matthew Gardiner wrote:
> >
>  Jim,
> >
> >
>  I've been trying to debug an issue (I see it on 64-bit linux) where, I do
> "target create" and "process launch" and despite not requesting *stop at
> entry*, the first stop (which I believe is just the initial ptrace attach
> stop) is reported to the lldb command line. I added some fprintf to
> Process::HandlePrivateEvent, which counts the number of eStoppedState
> events seen and whether ShouldBroadcastEvent returns true for this event.
> Here's the output from my program with diagnostic:
> >
> >
>  (lldb) target create ~/me/i64-hello.elf
> >
>  Current executable set to '~/me/i64-hello.elf' (x86_64).
> >
>  (lldb) process launch
> >
>  MG Process::HandlePrivateEvent launching stopped_count 0 should_broadcast 1
> >
>  Process 31393 launching
> >
>  MG Process::HandlePrivateEvent stopped stopped_count 1 should_broadcast 1
> >
>  MG Process::HandlePrivateEvent running stopped_count 1 should_broadcast 1
> >
>  Process 31393 launched: 'i64-hello.elf' (x86_64)
> >
>  Process 31393 stopped
> >
>  * thread #1: tid = 31393, 0x0000003675a011f0, name = 'i64-hello.elf', stop
> reason = trace
> >
> >
>  frame #0: 0x0000003675a011f0
> >
>  -> 0x3675a011f0:  movq   %rsp, %rdi
> >
>   0x3675a011f3:  callq  0x3675a046e0
> >
>   0x3675a011f8:  movq   %rax, %r12
> >
>   0x3675a011fb:  movl   0x21eb97(%rip), %eax
> >
>  (lldb) MG Process::HandlePrivateEvent stopped stopped_count 2
> should_broadcast 0
> >
>  MG Process::HandlePrivateEvent running stopped_count 2 should_broadcast 0
> >
>  MG Process::HandlePrivateEvent stopped stopped_count 3 should_broadcast 0
> >
>  MG Process::HandlePrivateEvent running stopped_count 3 should_broadcast 0
> >
> >
>  In summary, lldb reports the inferior to be stopped (even though
> /proc/pid/status and lldb "target list" say it is running). Clearly this is
> wrong (hence my earlier post).
> >
> >
>  Am I correct in assuming that when  ShouldBroadcastEvent returns true this
> means that lldb should show this event to the debug user? (And thus hide
> other events where ShouldBroadcastEvent=false).
> >
> >
>  What puzzled me was why ShouldBroadcastEvent return true for this very
> first stop. Is this a bug?
> >
> >
>  I also spent sometime at ShouldBroadcastEvent and saw that this:
> >
> >
> >
>  case eStateStopped:
> >
> >
>  case eStateCrashed:
> >
> >
>  case eStateSuspended:
> >
> >                                                                         {
> >
>         ....
> >
> >
>  if (was_restarted || should_resume || m_resume_requested)
> >
> >                                                                         {
> >
> >
>  evaluates as false, and hence the PrivateResume code is not called... does
> this seem buggy to you for this very first stop?
> >
> >
>  I thought I'd try asking you, since in a previous mail from Greg, he cited
> you as being a thread-plan expert. (Hope that's ok!). I'd really appreciate
> your help in clarifying the above questions for me, and if you have time,
> giving me some ideas as to how to trace this one further e.g. how
> m_thread_list.ShouldStop and m_thread_list.ShouldReportStop should behave,
> etc.
> >
> >
>  thanks for your help
> >
>  Matt
> >
> >
>  Matthew Gardiner wrote:
> >
>        Folks,
> >
> >
>        In addition to the overlapping prompt race Shawn Best and myself are
> looking at, I'm seeing another issue where if I launch a process, I get a
> stop (presumably the in) being reported to the UI.
> >
> >
>        (lldb) target create ~/mydir/i64-hello.elf
> >
>        Current executable set to '~/mydir/i64-hello.elf' (x86_64).
> >
>        (lldb) process launch
> >
>        Process 27238 launching
> >
>        Process 27238 launched: '64-hello.elf' (x86_64)
> >
>        Process 27238 stopped
> >
>        * thread #1: tid = 27238, 0x0000003675a011f0, name = 'i64-hello.elf'
> >
> >
>        frame #0:
> >
>        (lldb) target list
> >
>        Current targets:
> >
>        * target #0: i64-hello.elf ( arch=x86_64-unknown-linux,
> platform=host, pid=27238, state=running )
> >
>        (lldb)
> >
> >
>        As you can see the "target list" reflects that the process is
> running. Which I confirmed by looking at /proc/27238/status.
> >
> >
>        Anyone else seeing this?
> >
> >
>        thanks
> >
>        Matt
> >
> >
> >
> >
>        Member of the CSR plc group of companies. CSR plc registered in
> England and Wales, registered number 4187346, registered office Churchill
> House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United
> Kingdom
> >
>        More information can be found at www.csr.com <http://www.csr.com>.
> Keep up to date with CSR on our technical blog, www.csr.com/blog <
> http://www.csr.com/blog>, CSR people blog, www.csr.com/people <
> http://www.csr.com/people>, YouTube, www.youtube.com/user/CSRplc <
> http://www.youtube.com/user/CSRplc>, Facebook,
> www.facebook.com/pages/CSR/191038434253534 <
> http://www.facebook.com/pages/CSR/191038434253534>, or follow us on
> Twitter at www.twitter.com/CSR_plc <http://www.twitter.com/CSR_plc>.
> >
>        New for 2014, you can now access the wide range of products powered
> by aptX at www.aptx.com <http://www.aptx.com>.
> >
>        _______________________________________________
> >
>        lldb-dev mailing list
> >
>        lldb-dev at cs.uiuc.edu <mailto:lldb-dev at cs.uiuc.edu>
> >
>        http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
>        To report this email as spam click
> https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj!GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==
> <
> https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj%21GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==>
> .
> >
>  _______________________________________________
> >
>  lldb-dev mailing list
> >
> lldb-dev at cs.uiuc.edu <mailto:lldb-dev at cs.uiuc.edu>
> >
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >                         _______________________________________________
> >                         lldb-dev mailing list
> >                         lldb-dev at cs.uiuc.edu
> >
> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> >
>
>
>


-- 
Todd Fiala | Software Engineer | tfiala at google.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20141001/ed05b3b4/attachment.html>


More information about the lldb-dev mailing list