[lldb-dev] More linux process control and IOHandler races

Matthew Gardiner mg11 at csr.com
Tue Sep 30 23:43:58 PDT 2014


Hi Shawn,

It would be great for you to fix this and all the other lldb prompt
bugs! Yes, lldb's rotten prompt it bothers me as much as you.

My main problem with this issue, is that I am consuming all my time with
trying to get CSRs kalimba chips debuggable by lldb. Particularly
troublesome being those variants with non-8-bit bytes... I wish I could
spend more this bug, but at the moment it is a peripheral concern for
me.

I have no pride associated with any previous fixes/bodges
(sleeps/m_first_stop etc.) that I've done to "address" this issue. So
please feel free to solve this exactly as you feel fit.

I'm guessing that Todd is in a much closer time-zone to you, so it may
be better if he applies any of your patches for you. Sorry Todd!

thanks for the update
Matt

On Mon, 2014-09-29 at 15:36 -0700, Shawn Best wrote:
> Hi All,
> 
> 
> I have spent some time digging into this again and understand it a
> little better now.
> 
> 
> Matt, the reason your patch hangs on TestHello world, is because the
> test script is launching the inferior program first then has lldb do
> an attach.  This is a different code path than a regular launch. The
> one-shot you put in there for 'm_first_stop' to prevent it from
> broadcasting the first stop, causes it to never get restarted.
> 
> 
> On OSX, the application is launched first and then lldb attaches to
> it.  There is also a private stop message generated.  The reason it is
> not broadcast, is because of a special class member
> 'm_resume_requested' and some special machinery ( class
> AttachCompletionHandler : public NextEventAction ) that is used in the
> case of launching to set it.
> 
> 
> I could create a similar class "LaunchCompletionHandler" and get that
> working to prevent the stop message from being broadcast.
> 
> 
> The simplest solution is still my original 2 line fix.
> 
> 
> I would like to get this fixed as it bothers me every time I see the
> extra spew when debugging.  There were also some unit tests failing on
> linux that this patch fixed.
> 
> 
> Shawn.
> 
> 
> 
> 
> On Mon, Aug 18, 2014 at 3:08 AM, Matthew Gardiner <mg11 at csr.com>
> wrote:
>         Hi Shawn,
>         
>         Have you spent anymore time looking at this prompt issue
>         (where you see the (lldb) prompt even when you use "process
>         launch")? I tried fixing it with the attached patch. This
>         fixes the issue fine at the lldb command line, but results in
>         the python TestHelloWorld test hanging the call to
>         Process::WaitForProcessToStop. I wondered if you had spent
>         anymore time on this lately and knew of a complete/better
>         fix...
>         
>         thanks
>         Matt
>         
>         
>         Greg Clayton wrote:
>                 Jim and I have discussed the right thing to do which
>                 we plan to implement as soon as we get the chance.
>                 
>                 Current we have two process states: public and
>                 private. The private state is always up to date and is
>                 used to track what the process is currently doing. The
>                 public state gets set when we believe the end user
>                 should see an event that will update the GUI or TTY
>                 with the current state if needed (stopped or exited).
>                 There is a lot of tricky code that uses the private
>                 process state to manage things like the thread plans.
>                 Things like source level single step might involve
>                 starting and stopping and hitting breakpoints and
>                 single instruction stepping, but the user only wants
>                 to see the final eStateStopped when the source level
>                 step is done. Expressions are also tricky, the public
>                 state of the process is stopped, but when we run an
>                 expression, we might resume the process many times and
>                 the user will always see that the public state if
>                 stopped. Internally though the thread plans are doing
>                 a whole bunch of stuff to the process and the user
>                 never hears anything about these (no events go
>                 public).
>                 
>                 So the current public and private stuff uses
>                 broadcaster hijacking and all sorts of other tricks to
>                 avoid letting the end user know about the starts and
>                 stops that they shouldn't know about. We eventually
>                 want to have a new class, lets call in
>                 lldb_private::ProcessState for now, that manages the
>                 current process state and always receives the process
>                 events. There would be no more public and private
>                 events. We would have one version of the ProcessState
>                 subclass that would cause the GUI/TTY to update
>                 (replacing the public events) and one ProcessState
>                 subclass that would manage running thread plans. There
>                 would be stack of these and this will help us avoid
>                 the whole public/private/hijacking stuff we have now.
>                 
>                 Launching a process would end up pushing a new
>                 ProcessState class that could handle all the necessary
>                 starts and stop we need to get the process to a
>                 quiescent state that is presentable to the user. The
>                 ProcessState subclass could eat all the events it
>                 needs to while launching and then propagate the
>                 stopped or running event when it gets popped off the
>                 stack.
>                 
>                 So we should make some simple fixes for now to work
>                 around the issue and get things working, but it would
>                 also be great to get the new system up and running so
>                 we can make this easier in the future for people that
>                 want to make changes.
>                 
>                 Greg
>                 
>                 
>                         On Aug 8, 2014, at 1:08 AM, Matthew Gardiner
>                         <mg11 at csr.com> wrote:
>                         
>                         Folks,
>                         
>                         Regarding "launching using the shell",  I
>                         don't think applies in the buggy case that
>                         myself and Shawn are looking at.
>                         
>                         I do:
>                         
>                         (lldb) target create ./test
>                         ...
>                         (lldb) process launch
>                         ...
>                         
>                         and when I inspected the call to exec, I see
>                         that my exe name is the program passed.
>                         
>                         Thanks for the insight into the broadcasting
>                         of stop events. That explains why I see in the
>                         ShouldBroadcastEvent, the ShouldStop and
>                         ShouldReportStop calls.
>                         
>                         However, it would be nice to know if the first
>                         stop event should be broadcast for "process
>                         launch".  I think it's an implementation
>                         detail, and therefore should not. That would
>                         help to fix this issue.
>                         
>                         However, Shawn's original suggestion to fix
>                         this issue circumvents the above debate, by
>                         replacing the call of HandlePrivateEvent with
>                         SetPublicState. So which fix is best? Calling
>                         SetPublicState rather than HandlePrivateEvent
>                         is certainly more expedient, and avoids this
>                         debate... but is it more
>                         portable/future-proof?
>                         
>                         Matt
>                         
>                         
>                         jingham at apple.com wrote:
>                                 If you are launching using the shell,
>                                 you'll see more stops before you get
>                                 to the executable you are actually
>                                 trying to launch.  In that case,
>                                 instead of just running the binary
>                                 directly you effectively do:
>                                 
>                                 /bin/bash exec <binary> arg1 arg2
>                                 
>                                 so that you can get bash (or whatever
>                                 is set in $SHELL) to do the argument
>                                 expansion for you.
>                                 GetResumeCountForLaunchInfo calculates
>                                 this, then it is stuffed into the
>                                 ProcessLaunchInfo (SetResumeCount).
>                                 On Mac OS X we always let the Platform
>                                 launch, then attach, so in that case
>                                 the AttachCompletionHandler does the
>                                 extra resumes.  I'm not all that
>                                 familiar with how the Linux side work,
>                                 but it also seems to use the
>                                 ProcessLaunchInfo's resume count.
>                                 
>                                 Note that in general in lldb not all
>                                 publicly broadcast stop messages are
>                                 going to result in a stop.  For
>                                 instance, all the breakpoint command &
>                                 condition handling goes on as a result
>                                 of the broadcast of the public stop
>                                 event, but the process might just turn
>                                 around an continue based on that.
>                                 Whether to suppress the broadcast and
>                                 continue from the private state thread
>                                 or broadcast the event and let the
>                                 upper levels of lldb take care of what
>                                 happens from there on really depends
>                                 on where it makes sense to handle the
>                                 stop.  So for the case of breakpoint
>                                 stops (or stop hooks, another
>                                 example), those end up being
>                                 equivalent to user typed commands,
>                                 just done for the user automatically
>                                 by the system.  So having them happen
>                                 in a world where the public state
>                                 wasn't sync'ed up to the private state
>                                 ended up being very awkward.
>                                 
>                                 I haven't looked at the launching code
>                                 in detail recently so I am not sure
>                                 whether it makes sense for the first
>                                 stop to be handled as it is.
>                                 
>                                 Jim
>                                 
>                                 
>                                         On Aug 7, 2014, at 6:29 AM,
>                                         Matthew Gardiner
>                                         <mg11 at csr.com> wrote:
>                                         
>                                         Hi Shawn,
>                                         
>                                         I spent some time today
>                                         looking at how to arrange for
>                                         ShouldBroadcast to return
>                                         false for this first stop. I
>                                         managed to produce a quick
>                                         hack for this (i.e. just
>                                         counted the number of stops),
>                                         but due to other distractions
>                                         (from the rest of my job) I
>                                         didn't get that far into
>                                         discovering a nice way of
>                                         achieving this...
>                                         
>                                         What I did discover is that
>                                         with my build just doing
>                                         "process launch" results in 3
>                                         stops (and 3 private resumes).
>                                         That in itself I find
>                                         surprising, since I was under
>                                         the impression that I should
>                                         see the inferior stop just
>                                         once when exec is trapped by
>                                         PTRACE_ME.
>                                         
>                                         I then discovered that for
>                                         each of these 3 stops
>                                         ShouldBroadcast calls
>                                         Thread::ShouldStop, the
>                                         Thread::ShouldStop returns
>                                         true for the first stop and
>                                         false for the other 2. Looking
>                                         into these behaviour
>                                         differences I then found that
>                                         from within Thread::ShouldStop
>                                         we then call into the
>                                         following:
>                                         
>                                             StopInfoSP
>                                         private_stop_info
>                                         (GetPrivateStopInfo());
>                                             if (private_stop_info &&
>                                         private_stop_info->ShouldStopSynchronous(event_ptr) == false)
>                                             {
>                                         
>                                         and also
>                                         
>                                         bool over_ride_stop =
>                                         current_plan->ShouldAutoContinue(event_ptr);
>                                         
>                                         the results from either of
>                                         these, it seems providing the
>                                         reasoning behind the different
>                                         true/false returns. I'll
>                                         return to spend a bit more
>                                         time on this tomorrow. Let me
>                                         know if you get any further on
>                                         a similar vein!
>                                         
>                                         thanks
>                                         Matt
>                                         
>                                         
>                                         
>                                         Shawn Best wrote:
>                                                 Matt,
>                                                 
>                                                 I think you are
>                                                 probably right,
>                                                 although there are
>                                                 other places where it
>                                                 directly calls
>                                                 SetPublicState().   I
>                                                 was wondering about
>                                                 the possibility there
>                                                 could be other
>                                                 listeners waiting for
>                                                 a broadcast public
>                                                 Stop event.  Is that a
>                                                 possibility?
>                                                 
>                                                 Some others here were
>                                                 investigating some
>                                                 unit tests that were
>                                                 failing intermittently
>                                                 (StopHook).  Their
>                                                 description of the
>                                                 problem sounds
>                                                 unrelated to the
>                                                 launch code, but this
>                                                 patch also magically
>                                                 fixes that.
>                                                 
>                                                 Shawn.
>                                                 
>                                                 On 8/6/2014 6:26 AM,
>                                                 Matthew Gardiner
>                                                 wrote:
>                                                         Shawn,
>                                                         
>                                                         Like I said
>                                                         earlier your
>                                                         patch worked.
>                                                         However I
>                                                         think the
>                                                         right fix is
>                                                         to arrange
>                                                         that
>                                                         ShouldBroadcast returns false for this first stop. I believe this, because firstly no stops should be reported here since the user is only interested in launching a program, and additionally because it enables us to fix lldb without removing the call to HandlePrivateEvent. This, I think, is important to preserve as the central point for process state change handling.
>                                                         
>                                                         Matt
>                                                         
>                                                         
>                                                         
>                                                         Shawn Best
>                                                         wrote:
>                                                                 Hi
>                                                                 Matthew,
>                                                                 
>                                                                 I have
>                                                                 also
>                                                                 been
>                                                                 tracking this bug.  I believe there are other bugs in the unit tests failing indirectly because of this.  I also have a patch that will fix it, but was sitting on it until the other one landed.  These bugs do not show up on OSX since the inferiors are launched separately then attached to.
>                                                                 
>                                                                 The
>                                                                 first
>                                                                 odd
>                                                                 thing
>                                                                 the
>                                                                 launching code does is push an IOHandler when it sees the state transition to 'launching'. This is odd because I believe the launching program will always come up in a stopped state which will immediately pop the IOHandler.
>                                                                 
>                                                                 At
>                                                                 launch, the process comes up in the stopped state.  The launch code manually calls HandlePrivateEvent() with the stop event, which then broadcasts the Event.  When HandleProcessEvent gets the public stop, it dumps out the current thread state just as if an executing inferior hit a breakpoint and stopped.
>                                                                 
>                                                                 One
>                                                                 way to
>                                                                 fix
>                                                                 this
>                                                                 would
>                                                                 be:
>                                                                 
>                                                                 1.
>                                                                 Don't
>                                                                 push
>                                                                 io
>                                                                 handler when state is 'launching'
>                                                                 2.
>                                                                 Instead of manually calling HandlePrivateEvent, call SetPublicState().
>                                                                 
>                                                                 Alternately, we could try and debug why ShouldBroadcast() returns true, but that appears to be by design since it is expecting the public stop event to pop the IOHandler that had been pushed when launching.
>                                                                 
>                                                                 I have
>                                                                 attached a patch demonstrating this.  In conjunction with the other patch for IOHandler race condition, it will fix a bunch of this kind of behaviour.
>                                                                 
>                                                                 Shawn.
>                                                                 
>                                                                 On
>                                                                 8/5/2014 6:59 AM, Matthew Gardiner wrote:
>                                                                         Jim,
>                                                                         
>                                                                         I've been trying to debug an issue (I see it on 64-bit linux) where, I do "target create" and "process launch" and despite not requesting *stop at entry*, the first stop (which I believe is just the initial ptrace attach stop) is reported to the lldb command line. I added some fprintf to Process::HandlePrivateEvent, which counts the number of eStoppedState events seen and whether ShouldBroadcastEvent returns true for this event. Here's the output from my program with diagnostic:
>                                                                         
>                                                                         (lldb) target create ~/me/i64-hello.elf
>                                                                         Current executable set to '~/me/i64-hello.elf' (x86_64).
>                                                                         (lldb) process launch
>                                                                         MG Process::HandlePrivateEvent launching stopped_count 0 should_broadcast 1
>                                                                         Process 31393 launching
>                                                                         MG Process::HandlePrivateEvent stopped stopped_count 1 should_broadcast 1
>                                                                         MG Process::HandlePrivateEvent running stopped_count 1 should_broadcast 1
>                                                                         Process 31393 launched: 'i64-hello.elf' (x86_64)
>                                                                         Process 31393 stopped
>                                                                         * thread #1: tid = 31393, 0x0000003675a011f0, name = 'i64-hello.elf', stop reason = trace
>                                                                         
>                                                                         frame #0: 0x0000003675a011f0
>                                                                         -> 0x3675a011f0:  movq   %rsp, %rdi
>                                                                            0x3675a011f3:  callq  0x3675a046e0
>                                                                            0x3675a011f8:  movq   %rax, %r12
>                                                                            0x3675a011fb:  movl   0x21eb97(%rip), %eax
>                                                                         (lldb) MG Process::HandlePrivateEvent stopped stopped_count 2 should_broadcast 0
>                                                                         MG Process::HandlePrivateEvent running stopped_count 2 should_broadcast 0
>                                                                         MG Process::HandlePrivateEvent stopped stopped_count 3 should_broadcast 0
>                                                                         MG Process::HandlePrivateEvent running stopped_count 3 should_broadcast 0
>                                                                         
>                                                                         In summary, lldb reports the inferior to be stopped (even though /proc/pid/status and lldb "target list" say it is running). Clearly this is wrong (hence my earlier post).
>                                                                         
>                                                                         Am I correct in assuming that when  ShouldBroadcastEvent returns true this means that lldb should show this event to the debug user? (And thus hide other events where ShouldBroadcastEvent=false).
>                                                                         
>                                                                         What puzzled me was why ShouldBroadcastEvent return true for this very first stop. Is this a bug?
>                                                                         
>                                                                         I also spent sometime at ShouldBroadcastEvent and saw that this:
>                                                                         
>                                                                         
>                                                                         case eStateStopped:
>                                                                         
>                                                                         case eStateCrashed:
>                                                                         
>                                                                         case eStateSuspended:
>                                                                         
>                                                                         {
>                                                                                  ....
>                                                                         
>                                                                         if (was_restarted || should_resume || m_resume_requested)
>                                                                         
>                                                                         {
>                                                                         
>                                                                         evaluates as false, and hence the PrivateResume code is not called... does this seem buggy to you for this very first stop?
>                                                                         
>                                                                         I thought I'd try asking you, since in a previous mail from Greg, he cited you as being a thread-plan expert. (Hope that's ok!). I'd really appreciate your help in clarifying the above questions for me, and if you have time, giving me some ideas as to how to trace this one further e.g. how m_thread_list.ShouldStop and m_thread_list.ShouldReportStop should behave, etc.
>                                                                         
>                                                                         thanks for your help
>                                                                         Matt
>                                                                         
>                                                                         Matthew Gardiner wrote:
>                                                                                 Folks,
>                                                                                 
>                                                                                 In addition to the overlapping prompt race Shawn Best and myself are looking at, I'm seeing another issue where if I launch a process, I get a stop (presumably the in) being reported to the UI.
>                                                                                 
>                                                                                 (lldb) target create ~/mydir/i64-hello.elf
>                                                                                 Current executable set to '~/mydir/i64-hello.elf' (x86_64).
>                                                                                 (lldb) process launch
>                                                                                 Process 27238 launching
>                                                                                 Process 27238 launched: '64-hello.elf' (x86_64)
>                                                                                 Process 27238 stopped
>                                                                                 * thread #1: tid = 27238, 0x0000003675a011f0, name = 'i64-hello.elf'
>                                                                                 
>                                                                                 frame #0:
>                                                                                 (lldb) target list
>                                                                                 Current targets:
>                                                                                 * target #0: i64-hello.elf ( arch=x86_64-unknown-linux, platform=host, pid=27238, state=running )
>                                                                                 (lldb)
>                                                                                 
>                                                                                 As you can see the "target list" reflects that the process is running. Which I confirmed by looking at /proc/27238/status.
>                                                                                 
>                                                                                 Anyone else seeing this?
>                                                                                 
>                                                                                 thanks
>                                                                                 Matt
>                                                                                 
>                                                                                 
>                                                                                 
>                                                                                 Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
>                                                                                 More information can be found at www.csr.com <http://www.csr.com>. Keep up to date with CSR on our technical blog, www.csr.com/blog <http://www.csr.com/blog>, CSR people blog, www.csr.com/people <http://www.csr.com/people>, YouTube, www.youtube.com/user/CSRplc <http://www.youtube.com/user/CSRplc>, Facebook, www.facebook.com/pages/CSR/191038434253534 <http://www.facebook.com/pages/CSR/191038434253534>, or follow us on Twitter at www.twitter.com/CSR_plc <http://www.twitter.com/CSR_plc>.
>                                                                                 New for 2014, you can now access the wide range of products powered by aptX at www.aptx.com <http://www.aptx.com>.
>                                                                                 _______________________________________________
>                                                                                 lldb-dev mailing list
>                                                                                 lldb-dev at cs.uiuc.edu <mailto:lldb-dev at cs.uiuc.edu>
>                                                                                 http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>                                                                                 
>                                                                                 
>                                                                                 To report this email as spam click https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj!GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw== <https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj%21GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==> .
>                                                                         _______________________________________________
>                                                                         lldb-dev mailing list
>                                                                         lldb-dev at cs.uiuc.edu <mailto:lldb-dev at cs.uiuc.edu>
>                                                                         http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>                         _______________________________________________
>                         lldb-dev mailing list
>                         lldb-dev at cs.uiuc.edu
>                         http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>         
>         
> 
> 





More information about the lldb-dev mailing list