[llvm-dev] RFC: Should SmallVectors be smaller?
Dean Michael Berris via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 21 19:01:45 PDT 2018
> On 22 Jun 2018, at 02:52, Duncan P. N. Exon Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I've been curious for a while whether SmallVectors have the right speed/memory tradeoff. It would be straightforward to shave off a couple of pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could afford to test for small-mode vs. large-mode.
>
> The current scheme works out to something like this:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector {
> T *BeginX, *EndX, *CapacityX;
> T Small[SmallCapacity];
>
> bool isSmall() const { return BeginX == Small; }
> T *begin() { return BeginX; }
> T *end() { return EndX; }
> size_t size() const { return EndX - BeginX; }
> size_t capacity() const { return CapacityX - BeginX; }
> };
> ```
>
> In the past I used something more like:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector2 {
> unsigned Size;
> unsigned Capacity;
> union {
> T Small[SmallCapacity];
> T *Large;
> };
>
> bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved off of Capacity.
> T *begin() { return isSmall() ? Small : Large; }
> T *end() { return begin() + Size; }
> size_t size() const { return Size; }
> size_t capacity() const { return Capacity; }
> };
> ```
>
> I'm curious whether this scheme would be really be slower in practice (as a complete replacement for `SmallVector` in ADT). I wonder, has anyone profiled something like this before? If so, in what context? on what workloads?
>
Doesn’t this scheme have a problem with undefined behaviour, since you may be changing the active member of the union when capacity grows larger than SmallCapacity?
-- Dean
More information about the llvm-dev
mailing list