[PATCH] D18345: Fix DenseMap::reserve(): the formula was wrong

Tue Mar 22 09:46:39 PDT 2016

On Tue, Mar 22, 2016 at 9:22 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Mar 22, 2016, at 8:59 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Tue, Mar 22, 2016 at 8:53 AM, Mehdi Amini via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>>
>> > On Mar 22, 2016, at 8:36 AM, David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> > dblaikie added inline comments.
>> >
>> > ================
>> > Comment at: unittests/ADT/DenseMapTest.cpp:367
>> > @@ +366,3 @@
>> > +// buckets to insert N items without increasing allocation size.
>> > +TEST(DenseMapCustomTest, InitialSizeTest) {
>> > +  for (unsigned Size = 1; Size <= 1024; ++Size) {
>> > ----------------
>> > If both the ctor and reserve now use common code, might not be
>> necessary to test all parts of that code through both access points.
>>
>> The point of unit testing is also to make sure no future change breaks a
>> user facing expectation, so it is not totally useless.
>> But the reserve case should also include a test for a map that would
>> already contains element, or a map initialized with some size, then
>> reserved with an extra size.
>>
>
> Sure - it seems like the reserve case would test a superset of the ctor
> case. I'd be inclined just to test one or two cases in the ctor, then
> implement it in such a way that it's mostly just "calls reserve".
>
>
> Note that the implementation of the ctor *does not* call reserve(), it
> calls init() instead which deals with initial setup and
> calls allocateBuckets(), while reserve() on the other side calls grow(). I
> refactored the code that compute the number of buckets for a desired number
> of entries reserved, but this is the only thing shared.
>
>
> We can't really test all possible implementations today to ensure that
> future changes don't break the implementation - I agree, there's certainly
> some balance (testing generic edge cases that are likely to be hiccups,
> even if they aren't issues for today's algorithm). But I'd like there to be
> some more specific justification than a fairly unbounded "who knows how it
> might work tomorrow".
>
>
> The original test was already looping over an arbitrary amount of sizes, I
> don't know where to draw the line and I don't have any rational to drive
> any decision here, what do you suggest?
> (I your concern is only runtime I can s/1024/128?)
>

The usual approach (unit testing textbooks, college courses, etc - which
for sure are an oversimplification at times, I realize) is to test the
boundary cases, and representative case from any categories the algorithm
implies.

I would tend to suggest doing just that - whatever the smallest example was
that wasn't covered by the original code is what I would test (& I'd
probably remove a bunch of the other scattershot tests). Sounds like, from
your later comments, that 48 would be the right value. Maybe there are a
couple of nearby values across the tipping points in the algorithm? But
with the large minimium allocation there aren't so many interesting cases
for that bound, I Think...

(speaking of - should the min of 512 apply when the user specifies a
smaller bound? Might be worth allowing the user to pull down the initial
allocation if they know they don't need much? Not sure)

To alleviate the concern that someone might refactor the code in such a way
to make the tests invalid, it might be worth adding negative tests - just
one test that demonstrates that default-constructing a DenseMap /will/
reallocate at a certain number of insertions? (default construct, insert
things, EXPECT no extra move ops, insert the thing that should be beyond
the initial allocation size, EXPECT all the extra move ops)

Do you think that'd be good?

>
>
> (for the same reason we don't test, say, reserve after adding an element
> in each possible way an element can be added - we assume the way elements
> are added (but not the fact that they were added) is orthogonal to reserve)
>
>
> "Orthogonality" seems... orthogonal :) to what we're discussing here.
>

Somewhat - but what I mean is two codepaths that lead to the same code
(insert+reserve or []+reserve for example) don't usually deserve two tests.

>
>
>
>
>>
>> >
>> > ================
>> > Comment at: unittests/ADT/DenseMapTest.cpp:368
>> > @@ +367,3 @@
>> > +TEST(DenseMapCustomTest, InitialSizeTest) {
>> > +  for (unsigned Size = 1; Size <= 1024; ++Size) {
>> > +    DenseMap<unsigned, CountCopyAndMove> Map(Size);
>> > ----------------
>> > Do we need to test it for all these values? Or could we be a bit more
>> specific? (I worry about shotgun testing like this - hard to maintain
>> (because it's not clear what cases it's testing) & can add time to our test
>> runs, make them harder to debug (because they're doing so much
>> unintended/unnecessary work), etc)
>>
>> Runtime is 5ms for this test.
>> Yeah that's annoying, we could do a bit more "white box testing", but it
>> is not clear how robust it would be against future change of the internals
>> of the map. What are you suggesting (beside adding a comment)?
>>
>
> Well, what motivated you to choose 1024?
>
>
> I flipped a coin ;)
>
> Why not 470 (is a power of two significant)?
>
>
> No good reason to choose a power of two.
>
> or 5 (is such a large number significant)?
>
>
> Size is significant (see below).
>
> (sorry, I don't mean for that to sound facetious/confrontational, but I'd
> like to have some justification/discussion when adding test cases (in both
> directions, to be sure - to make sure we have enough testing, but also not
> too much))
>
>
> That's fine! And I don't try to be difficult, I'm really not having any
> good justification or idea on how to "prune" smartly such a test.
> What I can say is that the bug I fixed showed up only:
>
> - showed up only for specific values (i.e. 4 * nEntries == 3 * nBuckets).
> - the number of buckets had to be more than 64 (min initial alloc).
>
> I didn't go with the minimum (48) because someone might figure that a
> minimum initialization of 512 for the map would be nice, I figure that 1024
> was somehow reasonable, not that I was totally happy with it (and it was
> late last night...).
>

No worries - I might not be entirely awake either :)

>
> --
> Mehdi
>
>
>
>
> - Dave
>
>
>>
>> Thanks,
>>
>> Mehdi
>>
>>
>>
>> >
>> > ================
>> > Comment at: unittests/ADT/DenseMapTest.cpp:377
>> > @@ +376,3 @@
>> > +    EXPECT_EQ(Size * 2, CountCopyAndMove::Move);
>> > +    EXPECT_EQ((unsigned)0, CountCopyAndMove::Copy);
>> > +  }
>> > ----------------
>> > You could use '0u' for the unsigned 0 literal rather than a cast
>> >
>> > ================
>> > Comment at: unittests/ADT/DenseMapTest.cpp:384
>> > @@ +383,3 @@
>> > +TEST(DenseMapCustomTest, ReserveTest) {
>> > +  for (unsigned Size = 1; Size <= 1024; ++Size) {
>> > +    DenseMap<unsigned, CountCopyAndMove> Map;
>> > ----------------
>> > Same question here
>> >
>> >
>> > http://reviews.llvm.org/D18345
>> >
>> >
>> >
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160322/3f0db687/attachment.html>