[PATCH] Add standard insert overloads to StringMap

David Blaikie dblaikie at gmail.com
Mon Jun 16 16:01:19 PDT 2014


On Mon, Jun 16, 2014 at 3:36 PM, Agustín Bergé <kaballo86 at hotmail.com> wrote:
>>> I would like to add a test case that triggers a table rehash but I'm not sure how to do that, suggestions?
>>
>> Not sure, actually. It looks like the growth function is the usual sort of thing, so probably just inserting a second element should cause a rehash - no?
>
> I did not see this in my previous tests using up to 10 values, because the hash table starts with 16 buckets. I see that I can specify how many buckets to use on construction, so creating a map with only 1 bucket and adding two elements to it would trigger a rehash. Then I'd have to make sure that the rehash actually moves the bucket around, which is the tricky part.

Yep, you should be able to do that, as you say, by starting with a
bucket count of 1 then if you want to make sure it actually caused a
re-hash, you could just assert that the address of some element is
different after the operation you expect to be a rehash.

>
> ================
> Comment at: include/llvm/ADT/StringMap.h:338
> @@ +337,3 @@
> +
> +    MapEntryTy *NewItem =
> +        MapEntryTy::Create(KV.first, Allocator, std::move(KV.second));
> ----------------
> David Blaikie wrote:
>> Do you need the "NewItem" variable? Could you just assign straight to Bucket?
>>
>> (you could do the Bucket == getTombstoneVal() before the MapEntryTy::Create? Or does Create need to worry about the old tombstone count?)
> I followed the implementation of `GetOrCreateValue` here, but I don't see any obvious reason why I couldn't just do the assignment directly.
>
> ================
> Comment at: lib/Support/StringMap.cpp:214
> @@ -213,3 +211,3 @@
>        unsigned FullHash = HashTable[I];
> -      unsigned NewBucket = FullHash & (NewSize-1);
>        if (!NewTableArray[NewBucket]) {
> ----------------
> David Blaikie wrote:
>> Please remove formatting changes to lines unrelated to your change.
>>
>> If you're using clang-format to reformat, you can use the git/diff clang format scripts to just reformat the lines your code changes.
>>
> Thanks for that tip! I was wondering how people were able to use clang-format in situations like these.
>
> ================
> Comment at: unittests/ADT/StringMapTest.cpp:215
> @@ +214,3 @@
> +  EXPECT_EQ(testValue, testMap[testKeyFirst]);
> +  EXPECT_EQ(NewIt->first(), testKeyFirst);
> +  EXPECT_EQ(NewIt->second, testValue);
> ----------------
> David Blaikie wrote:
>> That's an annoyingly creepy difference (first being a function, second being a value) - probably be good if first was a StringRef member to begin with. Again, not your problem though.
> I had initially added the `insert` overload doing batch insertion, which is an interesting one to have as it could avoid extra table rehashing. I did not include it in this patch because of the weirdness of `StringMapIterator`, which isn't even an _Iterator_.
>
> This would be a big breaking change, but if that is something acceptable I could look into making `StringMap` more like an `std::map`, or at least more like `DenseMap`. This would be of course not part of this patch.

Just making the iterator more like an iterator's probably a fine
separate thing to do - we do have a variety of non-iterator iterators
in LLVM and I'm personally all in favor of making them more like real
iterators.

>
> ================
> Comment at: unittests/ADT/StringMapTest.cpp:221
> @@ +220,3 @@
> +  std::tie(ExistingIt, Inserted) =
> +      testMap.insert(std::make_pair(testKeyFirst, testValue + 1));
> +  EXPECT_EQ(1u, testMap.size());
> ----------------
> David Blaikie wrote:
>> While it's true this (testValue  + 1) is an xvalue, it's not really an interesting xvalue - a test case with a move-only value type might be more interesting and fully exercise the std::move stuff in insert.
> My intention here is actually to test that the value is not updated when the pair already exists in the map. Since `insert` is taking the argument by value, I did not considered adding a special test case for moves. In order for such test to be meaningful, it would have to keep track of the number of copies, and `StringMap` doesn't fully work with movable-only types (`StringMapEntry::set_value` makes copies).

If you check SmallVector's tests you'll find a Copyabel type that
keeps counts of different operations - you could move that to a common
utility file and reuse it here if you like - not a necessity for
commit though.

It's OK if a container has some operations that are not move-only
compatible (if there's a reason for those operations to be not
move-only compatible), but if the particular problem you mentioned
makes it impossible to do anything with a StringMap of move-only
values, so be it. Again, might be nice possible cleanup to see if that
requirement could be removed.

- David




More information about the llvm-commits mailing list