[lldb-dev] More linux process control and IOHandler races

Matthew Gardiner mg11 at csr.com
Wed Oct 1 22:12:32 PDT 2014


Yeah, cool. Getting these prompt issues sorted will put a smile on a lot
of faces :-)

On Wed, 2014-10-01 at 07:52 -0700, Todd Fiala wrote:
> Thanks, Matthew.
> 
> 
> I'll have a look at Shawn's changes and get them in after reviewing.
> 
> On Tue, Sep 30, 2014 at 11:43 PM, Matthew Gardiner <mg11 at csr.com>
> wrote:
>         Hi Shawn,
>         
>         It would be great for you to fix this and all the other lldb
>         prompt
>         bugs! Yes, lldb's rotten prompt it bothers me as much as you.
>         
>         My main problem with this issue, is that I am consuming all my
>         time with
>         trying to get CSRs kalimba chips debuggable by lldb.
>         Particularly
>         troublesome being those variants with non-8-bit bytes... I
>         wish I could
>         spend more this bug, but at the moment it is a peripheral
>         concern for
>         me.
>         
>         I have no pride associated with any previous fixes/bodges
>         (sleeps/m_first_stop etc.) that I've done to "address" this
>         issue. So
>         please feel free to solve this exactly as you feel fit.
>         
>         I'm guessing that Todd is in a much closer time-zone to you,
>         so it may
>         be better if he applies any of your patches for you. Sorry
>         Todd!
>         
>         thanks for the update
>         Matt
>         
>         On Mon, 2014-09-29 at 15:36 -0700, Shawn Best wrote:
>         > Hi All,
>         >
>         >
>         > I have spent some time digging into this again and
>         understand it a
>         > little better now.
>         >
>         >
>         > Matt, the reason your patch hangs on TestHello world, is
>         because the
>         > test script is launching the inferior program first then has
>         lldb do
>         > an attach.  This is a different code path than a regular
>         launch. The
>         > one-shot you put in there for 'm_first_stop' to prevent it
>         from
>         > broadcasting the first stop, causes it to never get
>         restarted.
>         >
>         >
>         > On OSX, the application is launched first and then lldb
>         attaches to
>         > it.  There is also a private stop message generated.  The
>         reason it is
>         > not broadcast, is because of a special class member
>         > 'm_resume_requested' and some special machinery ( class
>         > AttachCompletionHandler : public NextEventAction ) that is
>         used in the
>         > case of launching to set it.
>         >
>         >
>         > I could create a similar class "LaunchCompletionHandler" and
>         get that
>         > working to prevent the stop message from being broadcast.
>         >
>         >
>         > The simplest solution is still my original 2 line fix.
>         >
>         >
>         > I would like to get this fixed as it bothers me every time I
>         see the
>         > extra spew when debugging.  There were also some unit tests
>         failing on
>         > linux that this patch fixed.
>         >
>         >
>         > Shawn.
>         >
>         >
>         >
>         >
>         > On Mon, Aug 18, 2014 at 3:08 AM, Matthew Gardiner
>         <mg11 at csr.com>
>         > wrote:
>         >         Hi Shawn,
>         >
>         >         Have you spent anymore time looking at this prompt
>         issue
>         >         (where you see the (lldb) prompt even when you use
>         "process
>         >         launch")? I tried fixing it with the attached patch.
>         This
>         >         fixes the issue fine at the lldb command line, but
>         results in
>         >         the python TestHelloWorld test hanging the call to
>         >         Process::WaitForProcessToStop. I wondered if you had
>         spent
>         >         anymore time on this lately and knew of a
>         complete/better
>         >         fix...
>         >
>         >         thanks
>         >         Matt
>         >
>         >
>         >         Greg Clayton wrote:
>         >                 Jim and I have discussed the right thing to
>         do which
>         >                 we plan to implement as soon as we get the
>         chance.
>         >
>         >                 Current we have two process states: public
>         and
>         >                 private. The private state is always up to
>         date and is
>         >                 used to track what the process is currently
>         doing. The
>         >                 public state gets set when we believe the
>         end user
>         >                 should see an event that will update the GUI
>         or TTY
>         >                 with the current state if needed (stopped or
>         exited).
>         >                 There is a lot of tricky code that uses the
>         private
>         >                 process state to manage things like the
>         thread plans.
>         >                 Things like source level single step might
>         involve
>         >                 starting and stopping and hitting
>         breakpoints and
>         >                 single instruction stepping, but the user
>         only wants
>         >                 to see the final eStateStopped when the
>         source level
>         >                 step is done. Expressions are also tricky,
>         the public
>         >                 state of the process is stopped, but when we
>         run an
>         >                 expression, we might resume the process many
>         times and
>         >                 the user will always see that the public
>         state if
>         >                 stopped. Internally though the thread plans
>         are doing
>         >                 a whole bunch of stuff to the process and
>         the user
>         >                 never hears anything about these (no events
>         go
>         >                 public).
>         >
>         >                 So the current public and private stuff uses
>         >                 broadcaster hijacking and all sorts of other
>         tricks to
>         >                 avoid letting the end user know about the
>         starts and
>         >                 stops that they shouldn't know about. We
>         eventually
>         >                 want to have a new class, lets call in
>         >                 lldb_private::ProcessState for now, that
>         manages the
>         >                 current process state and always receives
>         the process
>         >                 events. There would be no more public and
>         private
>         >                 events. We would have one version of the
>         ProcessState
>         >                 subclass that would cause the GUI/TTY to
>         update
>         >                 (replacing the public events) and one
>         ProcessState
>         >                 subclass that would manage running thread
>         plans. There
>         >                 would be stack of these and this will help
>         us avoid
>         >                 the whole public/private/hijacking stuff we
>         have now.
>         >
>         >                 Launching a process would end up pushing a
>         new
>         >                 ProcessState class that could handle all the
>         necessary
>         >                 starts and stop we need to get the process
>         to a
>         >                 quiescent state that is presentable to the
>         user. The
>         >                 ProcessState subclass could eat all the
>         events it
>         >                 needs to while launching and then propagate
>         the
>         >                 stopped or running event when it gets popped
>         off the
>         >                 stack.
>         >
>         >                 So we should make some simple fixes for now
>         to work
>         >                 around the issue and get things working, but
>         it would
>         >                 also be great to get the new system up and
>         running so
>         >                 we can make this easier in the future for
>         people that
>         >                 want to make changes.
>         >
>         >                 Greg
>         >
>         >
>         >                         On Aug 8, 2014, at 1:08 AM, Matthew
>         Gardiner
>         >                         <mg11 at csr.com> wrote:
>         >
>         >                         Folks,
>         >
>         >                         Regarding "launching using the
>         shell",  I
>         >                         don't think applies in the buggy
>         case that
>         >                         myself and Shawn are looking at.
>         >
>         >                         I do:
>         >
>         >                         (lldb) target create ./test
>         >                         ...
>         >                         (lldb) process launch
>         >                         ...
>         >
>         >                         and when I inspected the call to
>         exec, I see
>         >                         that my exe name is the program
>         passed.
>         >
>         >                         Thanks for the insight into the
>         broadcasting
>         >                         of stop events. That explains why I
>         see in the
>         >                         ShouldBroadcastEvent, the ShouldStop
>         and
>         >                         ShouldReportStop calls.
>         >
>         >                         However, it would be nice to know if
>         the first
>         >                         stop event should be broadcast for
>         "process
>         >                         launch".  I think it's an
>         implementation
>         >                         detail, and therefore should not.
>         That would
>         >                         help to fix this issue.
>         >
>         >                         However, Shawn's original suggestion
>         to fix
>         >                         this issue circumvents the above
>         debate, by
>         >                         replacing the call of
>         HandlePrivateEvent with
>         >                         SetPublicState. So which fix is
>         best? Calling
>         >                         SetPublicState rather than
>         HandlePrivateEvent
>         >                         is certainly more expedient, and
>         avoids this
>         >                         debate... but is it more
>         >                         portable/future-proof?
>         >
>         >                         Matt
>         >
>         >
>         >                         jingham at apple.com wrote:
>         >                                 If you are launching using
>         the shell,
>         >                                 you'll see more stops before
>         you get
>         >                                 to the executable you are
>         actually
>         >                                 trying to launch.  In that
>         case,
>         >                                 instead of just running the
>         binary
>         >                                 directly you effectively do:
>         >
>         >                                 /bin/bash exec <binary> arg1
>         arg2
>         >
>         >                                 so that you can get bash (or
>         whatever
>         >                                 is set in $SHELL) to do the
>         argument
>         >                                 expansion for you.
>         >                                 GetResumeCountForLaunchInfo
>         calculates
>         >                                 this, then it is stuffed
>         into the
>         >                                 ProcessLaunchInfo
>         (SetResumeCount).
>         >                                 On Mac OS X we always let
>         the Platform
>         >                                 launch, then attach, so in
>         that case
>         >                                 the AttachCompletionHandler
>         does the
>         >                                 extra resumes.  I'm not all
>         that
>         >                                 familiar with how the Linux
>         side work,
>         >                                 but it also seems to use the
>         >                                 ProcessLaunchInfo's resume
>         count.
>         >
>         >                                 Note that in general in lldb
>         not all
>         >                                 publicly broadcast stop
>         messages are
>         >                                 going to result in a stop.
>         For
>         >                                 instance, all the breakpoint
>         command &
>         >                                 condition handling goes on
>         as a result
>         >                                 of the broadcast of the
>         public stop
>         >                                 event, but the process might
>         just turn
>         >                                 around an continue based on
>         that.
>         >                                 Whether to suppress the
>         broadcast and
>         >                                 continue from the private
>         state thread
>         >                                 or broadcast the event and
>         let the
>         >                                 upper levels of lldb take
>         care of what
>         >                                 happens from there on really
>         depends
>         >                                 on where it makes sense to
>         handle the
>         >                                 stop.  So for the case of
>         breakpoint
>         >                                 stops (or stop hooks,
>         another
>         >                                 example), those end up being
>         >                                 equivalent to user typed
>         commands,
>         >                                 just done for the user
>         automatically
>         >                                 by the system.  So having
>         them happen
>         >                                 in a world where the public
>         state
>         >                                 wasn't sync'ed up to the
>         private state
>         >                                 ended up being very awkward.
>         >
>         >                                 I haven't looked at the
>         launching code
>         >                                 in detail recently so I am
>         not sure
>         >                                 whether it makes sense for
>         the first
>         >                                 stop to be handled as it is.
>         >
>         >                                 Jim
>         >
>         >
>         >                                         On Aug 7, 2014, at
>         6:29 AM,
>         >                                         Matthew Gardiner
>         >                                         <mg11 at csr.com>
>         wrote:
>         >
>         >                                         Hi Shawn,
>         >
>         >                                         I spent some time
>         today
>         >                                         looking at how to
>         arrange for
>         >                                         ShouldBroadcast to
>         return
>         >                                         false for this first
>         stop. I
>         >                                         managed to produce a
>         quick
>         >                                         hack for this (i.e.
>         just
>         >                                         counted the number
>         of stops),
>         >                                         but due to other
>         distractions
>         >                                         (from the rest of my
>         job) I
>         >                                         didn't get that far
>         into
>         >                                         discovering a nice
>         way of
>         >                                         achieving this...
>         >
>         >                                         What I did discover
>         is that
>         >                                         with my build just
>         doing
>         >                                         "process launch"
>         results in 3
>         >                                         stops (and 3 private
>         resumes).
>         >                                         That in itself I
>         find
>         >                                         surprising, since I
>         was under
>         >                                         the impression that
>         I should
>         >                                         see the inferior
>         stop just
>         >                                         once when exec is
>         trapped by
>         >                                         PTRACE_ME.
>         >
>         >                                         I then discovered
>         that for
>         >                                         each of these 3
>         stops
>         >                                         ShouldBroadcast
>         calls
>         >                                         Thread::ShouldStop,
>         the
>         >                                         Thread::ShouldStop
>         returns
>         >                                         true for the first
>         stop and
>         >                                         false for the other
>         2. Looking
>         >                                         into these behaviour
>         >                                         differences I then
>         found that
>         >                                         from within
>         Thread::ShouldStop
>         >                                         we then call into
>         the
>         >                                         following:
>         >
>         >                                             StopInfoSP
>         >                                         private_stop_info
>         >
>          (GetPrivateStopInfo());
>         >                                             if
>         (private_stop_info &&
>         >
>          private_stop_info->ShouldStopSynchronous(event_ptr) == false)
>         >                                             {
>         >
>         >                                         and also
>         >
>         >                                         bool over_ride_stop
>         =
>         >
>          current_plan->ShouldAutoContinue(event_ptr);
>         >
>         >                                         the results from
>         either of
>         >                                         these, it seems
>         providing the
>         >                                         reasoning behind the
>         different
>         >                                         true/false returns.
>         I'll
>         >                                         return to spend a
>         bit more
>         >                                         time on this
>         tomorrow. Let me
>         >                                         know if you get any
>         further on
>         >                                         a similar vein!
>         >
>         >                                         thanks
>         >                                         Matt
>         >
>         >
>         >
>         >                                         Shawn Best wrote:
>         >                                                 Matt,
>         >
>         >                                                 I think you
>         are
>         >                                                 probably
>         right,
>         >                                                 although
>         there are
>         >                                                 other places
>         where it
>         >                                                 directly
>         calls
>         >
>          SetPublicState().   I
>         >                                                 was
>         wondering about
>         >                                                 the
>         possibility there
>         >                                                 could be
>         other
>         >                                                 listeners
>         waiting for
>         >                                                 a broadcast
>         public
>         >                                                 Stop event.
>         Is that a
>         >                                                 possibility?
>         >
>         >                                                 Some others
>         here were
>         >
>          investigating some
>         >                                                 unit tests
>         that were
>         >                                                 failing
>         intermittently
>         >                                                 (StopHook).
>         Their
>         >                                                 description
>         of the
>         >                                                 problem
>         sounds
>         >                                                 unrelated to
>         the
>         >                                                 launch code,
>         but this
>         >                                                 patch also
>         magically
>         >                                                 fixes that.
>         >
>         >                                                 Shawn.
>         >
>         >                                                 On 8/6/2014
>         6:26 AM,
>         >                                                 Matthew
>         Gardiner
>         >                                                 wrote:
>         >
>          Shawn,
>         >
>         >                                                         Like
>         I said
>         >
>          earlier your
>         >
>          patch worked.
>         >
>          However I
>         >
>          think the
>         >
>          right fix is
>         >                                                         to
>         arrange
>         >                                                         that
>         >
>          ShouldBroadcast returns false for this first stop. I believe
>         this, because firstly no stops should be reported here since
>         the user is only interested in launching a program, and
>         additionally because it enables us to fix lldb without
>         removing the call to HandlePrivateEvent. This, I think, is
>         important to preserve as the central point for process state
>         change handling.
>         >
>         >                                                         Matt
>         >
>         >
>         >
>         >
>          Shawn Best
>         >
>          wrote:
>         >
>              Hi
>         >
>              Matthew,
>         >
>         >
>              I have
>         >
>              also
>         >
>              been
>         >
>              tracking this bug.  I believe there are other bugs in the
>         unit tests failing indirectly because of this.  I also have a
>         patch that will fix it, but was sitting on it until the other
>         one landed.  These bugs do not show up on OSX since the
>         inferiors are launched separately then attached to.
>         >
>         >
>              The
>         >
>              first
>         >
>              odd
>         >
>              thing
>         >
>              the
>         >
>              launching code does is push an IOHandler when it sees the
>         state transition to 'launching'. This is odd because I believe
>         the launching program will always come up in a stopped state
>         which will immediately pop the IOHandler.
>         >
>         >
>              At
>         >
>              launch, the process comes up in the stopped state.  The
>         launch code manually calls HandlePrivateEvent() with the stop
>         event, which then broadcasts the Event.  When
>         HandleProcessEvent gets the public stop, it dumps out the
>         current thread state just as if an executing inferior hit a
>         breakpoint and stopped.
>         >
>         >
>              One
>         >
>              way to
>         >
>              fix
>         >
>              this
>         >
>              would
>         >
>              be:
>         >
>         >
>              1.
>         >
>              Don't
>         >
>              push
>         >
>              io
>         >
>              handler when state is 'launching'
>         >
>              2.
>         >
>              Instead of manually calling HandlePrivateEvent, call
>         SetPublicState().
>         >
>         >
>              Alternately, we could try and debug why ShouldBroadcast()
>         returns true, but that appears to be by design since it is
>         expecting the public stop event to pop the IOHandler that had
>         been pushed when launching.
>         >
>         >
>              I have
>         >
>              attached a patch demonstrating this.  In conjunction with
>         the other patch for IOHandler race condition, it will fix a
>         bunch of this kind of behaviour.
>         >
>         >
>              Shawn.
>         >
>         >
>              On
>         >
>              8/5/2014 6:59 AM, Matthew Gardiner wrote:
>         >
>                      Jim,
>         >
>         >
>                      I've been trying to debug an issue (I see it on
>         64-bit linux) where, I do "target create" and "process launch"
>         and despite not requesting *stop at entry*, the first stop
>         (which I believe is just the initial ptrace attach stop) is
>         reported to the lldb command line. I added some fprintf to
>         Process::HandlePrivateEvent, which counts the number of
>         eStoppedState events seen and whether ShouldBroadcastEvent
>         returns true for this event. Here's the output from my program
>         with diagnostic:
>         >
>         >
>                      (lldb) target create ~/me/i64-hello.elf
>         >
>                      Current executable set to
>         '~/me/i64-hello.elf' (x86_64).
>         >
>                      (lldb) process launch
>         >
>                      MG Process::HandlePrivateEvent launching
>         stopped_count 0 should_broadcast 1
>         >
>                      Process 31393 launching
>         >
>                      MG Process::HandlePrivateEvent stopped
>         stopped_count 1 should_broadcast 1
>         >
>                      MG Process::HandlePrivateEvent running
>         stopped_count 1 should_broadcast 1
>         >
>                      Process 31393 launched: 'i64-hello.elf' (x86_64)
>         >
>                      Process 31393 stopped
>         >
>                      * thread #1: tid = 31393, 0x0000003675a011f0,
>         name = 'i64-hello.elf', stop reason = trace
>         >
>         >
>                      frame #0: 0x0000003675a011f0
>         >
>                      -> 0x3675a011f0:  movq   %rsp, %rdi
>         >
>         0x3675a011f3:  callq  0x3675a046e0
>         >
>         0x3675a011f8:  movq   %rax, %r12
>         >
>         0x3675a011fb:  movl   0x21eb97(%rip), %eax
>         >
>                      (lldb) MG Process::HandlePrivateEvent stopped
>         stopped_count 2 should_broadcast 0
>         >
>                      MG Process::HandlePrivateEvent running
>         stopped_count 2 should_broadcast 0
>         >
>                      MG Process::HandlePrivateEvent stopped
>         stopped_count 3 should_broadcast 0
>         >
>                      MG Process::HandlePrivateEvent running
>         stopped_count 3 should_broadcast 0
>         >
>         >
>                      In summary, lldb reports the inferior to be
>         stopped (even though /proc/pid/status and lldb "target list"
>         say it is running). Clearly this is wrong (hence my earlier
>         post).
>         >
>         >
>                      Am I correct in assuming that when
>         ShouldBroadcastEvent returns true this means that lldb should
>         show this event to the debug user? (And thus hide other events
>         where ShouldBroadcastEvent=false).
>         >
>         >
>                      What puzzled me was why ShouldBroadcastEvent
>         return true for this very first stop. Is this a bug?
>         >
>         >
>                      I also spent sometime at ShouldBroadcastEvent and
>         saw that this:
>         >
>         >
>         >
>                      case eStateStopped:
>         >
>         >
>                      case eStateCrashed:
>         >
>         >
>                      case eStateSuspended:
>         >
>         >
>                      {
>         >
>                               ....
>         >
>         >
>                      if (was_restarted || should_resume ||
>         m_resume_requested)
>         >
>         >
>                      {
>         >
>         >
>                      evaluates as false, and hence the PrivateResume
>         code is not called... does this seem buggy to you for this
>         very first stop?
>         >
>         >
>                      I thought I'd try asking you, since in a previous
>         mail from Greg, he cited you as being a thread-plan expert.
>         (Hope that's ok!). I'd really appreciate your help in
>         clarifying the above questions for me, and if you have time,
>         giving me some ideas as to how to trace this one further e.g.
>         how m_thread_list.ShouldStop and
>         m_thread_list.ShouldReportStop should behave, etc.
>         >
>         >
>                      thanks for your help
>         >
>                      Matt
>         >
>         >
>                      Matthew Gardiner wrote:
>         >
>                              Folks,
>         >
>         >
>                              In addition to the overlapping prompt
>         race Shawn Best and myself are looking at, I'm seeing another
>         issue where if I launch a process, I get a stop (presumably
>         the in) being reported to the UI.
>         >
>         >
>                              (lldb) target create
>         ~/mydir/i64-hello.elf
>         >
>                              Current executable set to
>         '~/mydir/i64-hello.elf' (x86_64).
>         >
>                              (lldb) process launch
>         >
>                              Process 27238 launching
>         >
>                              Process 27238 launched:
>         '64-hello.elf' (x86_64)
>         >
>                              Process 27238 stopped
>         >
>                              * thread #1: tid = 27238,
>         0x0000003675a011f0, name = 'i64-hello.elf'
>         >
>         >
>                              frame #0:
>         >
>                              (lldb) target list
>         >
>                              Current targets:
>         >
>                              * target #0: i64-hello.elf
>         ( arch=x86_64-unknown-linux, platform=host, pid=27238,
>         state=running )
>         >
>                              (lldb)
>         >
>         >
>                              As you can see the "target list" reflects
>         that the process is running. Which I confirmed by looking
>         at /proc/27238/status.
>         >
>         >
>                              Anyone else seeing this?
>         >
>         >
>                              thanks
>         >
>                              Matt
>         >
>         >
>         >
>         >
>                              Member of the CSR plc group of companies.
>         CSR plc registered in England and Wales, registered number
>         4187346, registered office Churchill House, Cambridge Business
>         Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
>         >
>                              More information can be found at
>         www.csr.com <http://www.csr.com>. Keep up to date with CSR on
>         our technical blog, www.csr.com/blog
>         <http://www.csr.com/blog>, CSR people blog, www.csr.com/people
>         <http://www.csr.com/people>, YouTube,
>         www.youtube.com/user/CSRplc
>         <http://www.youtube.com/user/CSRplc>, Facebook,
>         www.facebook.com/pages/CSR/191038434253534
>         <http://www.facebook.com/pages/CSR/191038434253534>, or follow
>         us on Twitter at www.twitter.com/CSR_plc
>         <http://www.twitter.com/CSR_plc>.
>         >
>                              New for 2014, you can now access the wide
>         range of products powered by aptX at www.aptx.com
>         <http://www.aptx.com>.
>         >
>         
>          _______________________________________________
>         >
>                              lldb-dev mailing list
>         >
>                              lldb-dev at cs.uiuc.edu
>         <mailto:lldb-dev at cs.uiuc.edu>
>         >
>         
>          http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>         >
>         >
>         >
>                              To report this email as spam click
>         https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj!
>         GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw== <https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj%21GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==> .
>         >
>                      _______________________________________________
>         >
>                      lldb-dev mailing list
>         >
>                      lldb-dev at cs.uiuc.edu
>         <mailto:lldb-dev at cs.uiuc.edu>
>         >
>         
>          http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>         >
>          _______________________________________________
>         >                         lldb-dev mailing list
>         >                         lldb-dev at cs.uiuc.edu
>         >
>          http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>         >
>         >
>         >
>         >
>         
>         
>         
> 
> 
> 
> 
> -- 
> Todd Fiala |
>  Software
> Engineer |
>  tfiala at google.com
> 
> 
> 
> 





More information about the lldb-dev mailing list