[lldb-dev] Testing through api vs. commands

Wed Oct 7 11:40:04 PDT 2015

On Wed, Oct 7, 2015 at 11:26 AM Jim Ingham <jingham at apple.com> wrote:

>
> > On Oct 7, 2015, at 11:16 AM, Jim Ingham <jingham at apple.com> wrote:
> >
> >>
> >> On Oct 7, 2015, at 10:37 AM, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >>
> >>
> >>
> >> On Wed, Oct 7, 2015 at 10:17 AM Greg Clayton <gclayton at apple.com>
> wrote:
> >>
> >>
> >>> On Oct 7, 2015, at 10:05 AM, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >>>
> >>> Jim, Greg,
> >>>
> >>> Can I get some feedback on this?  I would like to start enforcing this
> moving forward.  I want to make sure we're in agreement.
> >>>
> >>> On Mon, Oct 5, 2015 at 12:30 PM Todd Fiala <todd.fiala at gmail.com>
> wrote:
> >>> IMHO that all sounds reasonable.
> >>>
> >>> FWIW - I wrote some tests for the test system changes I put in (for
> the pure-python impl of timeout support), and in the process, I discovered
> a race condition in using a python facility that there really is no way I
> would have found anywhere near as reasonably without having added the
> tests.  (For those of you who are test-centric, this is not a surprising
> outcome, but I'm adding this for those who may be inclined to think of it
> as an afterthought).
> >>>
> >>> -Todd
> >>>
> >>> On Mon, Oct 5, 2015 at 11:24 AM, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >>> On Fri, Sep 11, 2015 at 11:42 AM Jim Ingham <jingham at apple.com> wrote:
> >>> I have held from the beginning that the only tests that should be
> written using HandleCommand are those that explicitly test command
> behavior, and if it is possible to write a test using the SB API you should
> always do it that way for the very reasons you cite.  Not everybody agreed
> with me at first, so we ended up with a bunch of tests that do complex
> things using HandleCommand where they really ought not to.  I'm not sure it
> is worth the time to go rewrite all those tests, but we shouldn't write any
> new tests that way.
> >>>
> >>> I would like to revive this thread, because there doesn't seem to be
> consensus that this is the way to go.  I've suggested on a couple of
> reviews recently that people put new command api tests under a new
> top-level folder under tests, and so far the responses I've gotten have not
> indicated that people are willing to do this.
> >>>
> >>> Nobody chimed in on this thread with a disagreement, which indicates
> to me that we are ok with moving this forward.  So I'm reviving this in
> hopes that we can come to agreement.  With that in mind, my goal is:
> >>>
> >>> 1) Begin enforcing this on new CLs that go in.  We need to maintain a
> consistent message and direction for the project, and if this is a "good
> idea", then it should be applied and enforced consistently. Command api
> tests should be the exception, not the norm.
> >>
> >> You mean API tests should be the norm right? I don't want people
> submitting command line tests like "file a.out", "run", "step". I want the
> API to be used. Did you get this reversed?
> >> I didn't get it reversed, but I agree my wording wasn't clear.  By
> "command api", I meant HandleCommand / etc.  I *do* want the SB API to be
> used.
> >>
> >>>
> >>> 2) Begin rejecting or reverting changes that go in without tests.  I
> understand there are some situations where tests are difficult.  Core dumps
> and unwinding come to mind.  There are probably others.  But this is the
> exception, and not the norm.  Almost every change should go in with tests.
> >>
> >> As long as it can be tested reasonably I am fine with rejecting changes
> going in that don't have tests.
> >> One of the problem is that most changes go in without review.  I
> understand why this is, because Apple especially are code owners of more
> than 80% of LLDB, so people adhere to the post-commit review.  This is fine
> in principle, but if changes go in without tests and there was no
> corresponding code review, then my only option is to either keep pinging
> the commit thread in hopes I'll get a response (which I sometimes don't
> get), or revert the change.  Often though I get a response that says "Yea
> I'll get to adding tests eventually".  I especially want this last type of
> response to go the way of the dinosaur.  I don't know how to change
> peoples' habits, but if you could bring this up at you daily/weekly
> standups or somehow make sure everyone is on the same page, perhaps that
> would be a good start.  Reverting is the best way I know to handle this,
> because it forces a change.  But at the same time it's disruptive, so I
> really don't want to do it.
> >
> > I agree that reversion is aggressive and it would be better to have some
> nicer way to enforce this.  It is also a bit rigid and sometimes people's
> schedules make delaying the test-writing seem a very persuasive option.  We
> want to take that into account.  Maybe every time a change goes in that
> warrants a test but doesn't have one we file a bug against the author to
> write the tests, and mark it some way that is easy to find.  Then if you
> have more than N such bugs you can't make new checkins till you get them
> below N?  That way this work can be batch more easily to accommodate
> schedules but there are still consequences if you put it off too long.
>
That seems like a reasonable idea.  We'd need to decide what N was (Is 3
too low?), but in principle it sounds like a good idea.

> >
> >>
> >>>
> >>> 3) If a CL cannot be tested without a command api test due to
> limitations of the SB API, require new changes to go in with a
> corresponding SB API change.
> >>
> >> One issue here is I don't want stuff added to the SB API just so that
> it can be tested. The SB API must remain clean and consistent and remain an
> API that makes sense for debugging. I don't want internal goo being exposed
> just so we can test things. If we run into this a lot, we might need to
> make an alternate binary that can test internal unit tests. We could make a
> lldb_internal.so/lldb_internal.dylib/lldb_internal.dll that can be linked
> to by internal unit tests and then those unit tests can be run as part of
> the testing process. So lets keep the SB API clean and sensible with no
> extra fluff, and find a way to tests internal stuff in a different way.
> >> Agree that the SB API should be treated with care.  Unit tests are a
> good way around this as you mention.  We already have them running under
> the CMake build.  If you want to try it out, build with CMake and run
> "ninja check-lldb-unit".
> >
> > Another way to solve this (which we used in the Tcl/Tk project which has
> an extensive test suite written in Tcl) is to have a testing API that is
> for poking at internal bits explicitly for testing.  This is more stable
> than using the internal API's since presumably the testing API's are more
> surgical about what they need and so less exposed to changes.  It also
> means you can use the conveniences of working with Python where there are
> lots of examples and a more user-friendly environment when you have to do
> complex setup or test evaluation.  In the Tcl case it also would
> occasionally uncover functionality that was useful in the SB API's, since
> you might originally get a bunch of disparate bits of information in small
> special-purpose commands, but after a bit they will coalesce into a
> reasonable SB API.
>

The best place for using unit tests is in internal data structures that are
self-contained by their very nature.  For example, I have a CL that I'm
working on to get the python extension module working under Python 3.  I
had to refactor / rewrite some of the stuff in PythonDataObjects.  Then a
bunch of tests started failing.  It's really annoying to debug Python code
that calls through to native code.  If there were unit tests that tested
that all the different types of PythonObject worked correctly (string,
list, int, array, etc) I would have found the bugs immediately.

So I'm actually working on adding these unit tests right now before I
upload the CL for review.

Another example is back when I first started working on LLDB, there were a
bunch of windows bugs in the IRMemoryMap part of the expression evaluator.
But they were in really low level stuff like finding an item in a binary
tree, or allocating a block of memory.  It's way too low level to be
exposed through a language as high level as Python or Tcl, but at the same
time the tests are really useful because they make it really easy to find
bugs when you change something.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20151007/115471e6/attachment-0001.html>