[LLVMdev] std::string

Mon Jan 21 08:31:57 PST 2013

On Jan 19, 2013, at 10:04 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:

> Were the "small n" characteristics the main motivation?  Memory-wise, STL classes allow user-defined allocators, so use of things like memory pools should be relatively straightforward.  Just wondering...  :)

Just fyi:

http://home.roadrunner.com/~hinnant/stack_alloc.html

This may or may not be appropriate for llvm.  But it is an example of a fully C++11 conforming way to make a SmallContainer using a custom allocator and any C++11 std::container, or any container that mimics the std container requirements.

One thing to keep in mind when designing SmallContainer (either from scratch or using an allocator) is that there is a tension between internal buffer size and C++11 move semantics.  If the buffer is internal to the container (as is typical for std::string these days), the bigger the buffer the more expensive the container move members are.  If the container never needs to be moved, this isn't a concern of course.  But otherwise watch out for it.  Typically the cost of a container move operation is proportional to sizeof(container).  It was this very issue that drove the design of libc++'s std::string:  make the internal buffer as large as possible without increasing sizeof(string).

On Jan 20, 2013, at 11:39 AM, Benjamin Kramer <benny.kra at gmail.com> wrote:

> For example DenseMap is a hash table that embeds both keys and values into a big table of memory. This gives nice cache behavior for simple pointer-to-pointer mappings which are common throughout LLVM while std::map wastes memory and has worse cache behavior. For big keys DenseMap wastes a ton of memory and becomes slow, std::map is better for those cases.

Agreed.  Also comparing to std::unordered_map:  llvm::DenseMap (http://en.wikipedia.org/wiki/Hash_map#Open_addressing) is a completely different data structure than std::unordered_map (http://en.wikipedia.org/wiki/Hash_map#Separate_chaining).  Neither is best in all use cases.  llvm::DenseMap tends to run at very low load factors compared to std::unordered_map (i.e. carries around more empty buckets).  And std::unordered_map has a higher per-entry overhead which is significant when the entry is one or two words.

Howard