[PATCH] D38433: Introduce a specialized data structure to be used in a subsequent change

Chandler Carruth via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 10 13:20:14 PDT 2017


chandlerc added a comment.

In https://reviews.llvm.org/D38433#893658, @sanjoy wrote:

> (Haven't addressed the code comments yet since the design isn't settled)
>
> In https://reviews.llvm.org/D38433#893426, @chandlerc wrote:
>
> > Have you considered building a `ChunkedVector` instead of a `ChunkedList`? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.
>
>
> With a vector-of-buffers implementation, I'm a bit worried about the space overhead on the smaller cases.  For instance, this is the histogram of how this data structure is populated from a clang-bootstrap (also in https://reviews.llvm.org/D38434):
>
>        Count: 731310
>          Min: 1
>         Mean: 8.555150
>   50th %tile: 4
>   95th %tile: 25
>   99th %tile: 53
>          Max: 433
>
>
> If I used a vector-of-buffers, I would either have to recompute the capacity and end (of the last buffer) on every insert (which would require an additional deref and some computation) or have to keep two extra words in the data structure on top of the three that `SmallVector` keeps anyway. This adds a lot of relative overhead in the median case (4 elements). In fact, the current situation of two extra words also qualifies as a "lot" of relative overhead IMO, and I want to think of an SSO to improve the situation.


Hold on, the objects here are just pointers? Then none of this really makes sense to me...

Chunked data structures seem to make the most sense if moving the objects is really expensive and/or the objects are really large.

For pointers, why not just a vector?
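
For reference, the single-index trick behind the `ChunkedVector` suggestion quoted above can be sketched roughly as follows. This is only an illustration and not code from the patch: the class name `ChunkedVectorSketch`, the `ChunkBits` parameter, and the `numChunks()` helper are all made up here, and the sketch assumes `T` is default-constructible and copy-assignable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch, not LLVM code: fixed-size chunks addressed by a single
// index. ChunkBits is an assumed tuning parameter (chunk size = 2^ChunkBits).
template <typename T, unsigned ChunkBits = 3>
class ChunkedVectorSketch {
  static constexpr std::size_t ChunkSize = std::size_t(1) << ChunkBits;
  static constexpr std::size_t ChunkMask = ChunkSize - 1;

  // The high bits of an index select an entry in this vector of chunk
  // pointers; the low bits select a slot inside that chunk.
  std::vector<std::unique_ptr<T[]>> Chunks;
  std::size_t Size = 0;

public:
  void push_back(const T &V) {
    // A new chunk is needed whenever the low bits of Size wrap to zero.
    if ((Size & ChunkMask) == 0)
      Chunks.push_back(std::make_unique<T[]>(ChunkSize));
    Chunks[Size >> ChunkBits][Size & ChunkMask] = V;
    ++Size;
  }

  T &operator[](std::size_t I) {
    assert(I < Size && "index out of range");
    return Chunks[I >> ChunkBits][I & ChunkMask];
  }

  std::size_t size() const { return Size; }
  std::size_t numChunks() const { return Chunks.size(); }
};
```

Growing this structure only reallocates the vector of chunk pointers, so existing elements never move. That is the property that pays off for large or expensive-to-move objects, and it buys very little when the elements are plain pointers.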


https://reviews.llvm.org/D38433




