[llvm-commits] PATCH: Implement a SmallDenseMap container

Benjamin Kramer benny.kra at googlemail.com
Sat Jun 16 11:53:30 PDT 2012


On 16.06.2012, at 16:05, Chandler Carruth wrote:

> And here we are!
> 
> Functional highlights:
> 
> - Inline buffer will have proper alignment and size constraints, with no aliasing issues AFAICT.
> - Inline buffer re-uses the space typically occupied by the buffer pointer and buffer size, allowing for example SmallDenseMap<unsigned, unsigned, 2> to be the same size as DenseMap<unsigned, unsigned>.
> - Supports any types DenseMap supports, including those with non-trivial constructors, destructors, copy constructors, etc.
> - Significant attempts to minimize copying, but there is more to do here (see below).
> - Tests are automatically extended to cover the new implementation.
> 
> All that said, this patch is still quite rough I'm afraid. There is quite a bit of duplicated and/or fiddly code that I would like to pull out. It also needs lots of commenting.
> 
> It is surprisingly hard to implement the semantics of a hash-based map with the small-buffer optimization. This works, but it is far from clean. I hope it's still acceptable, as I suspect it is worth it in many situations.
> 
> I can either try to hack on this patch until it's a bit cleaner and then submit, or I can submit, and then clean it up incrementally as I go. I only have a mild preference for the latter approach.
> 
> I *think* I have all of the construction / destruction semantics correct, and I have most of the move semantics correct but not all of them. Hardening this and adding some tests that exercise at least correct invocation of constructors and destructors will be important follow-ups to this.
> 
> Anyways, comments welcome!

This looks very good already, though I'm not entirely sure the downsides of this design are worth it.

Is merging the pointer for the big case and the data for the small case really worth adding a branch to each lookup of the bucket number/bucket table? Since you are using the same lookup functions for both the big and the small case you could just have a pointer in the base class pointing to the small array or a malloc'd array. That would simplify the code a lot, getBuckets() could go away. For checking whether the map is small you'd just compare the pointer in the base class with the inline array.

It's of course a size vs. branches tradeoff, but I think the code simplification alone would make it a better solution.

- Ben



More information about the llvm-commits mailing list