[PATCH] D14690: Don't force std::set for SmallSet

Duncan P. N. Exon Smith via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 16 16:40:14 PST 2015


> On 2015-Nov-16, at 15:22, Rafael EspĂ­ndola <rafael.espindola at gmail.com> wrote:
> 
>> I just counted (manually) with libc++ and it looks like 5 pointers
>> down to 3 pointers, so also 16B savings.
>> 
>> However, `DenseSet` does immediately call malloc for 256B at
>> construction time (forgot this until I looked at the code).  So
>> you're not actually getting the benefit of small storage.
>> 
>> I think for this to make sense you'd want to use a
>> `SmallDenseSet<T, 1>`.  SmallDenseSet doesn't exist yet though.
> 
> Well, I wrote the patch because our Map and Set are fairly inconsistent:
> 
> MapVector uses DenseMap.
> SetVector uses SmallSet which uses std::set.
> 
> So they have very different performance characteristics. I could also
> just change SetVector directly to use DenseSet. Would that be better?

That depends on whether someone is relying on the small-storage
optimization (by default) for SetVector :(.

> I went with modifying SmallSet in the hope of one day removing
> SmallPtrSet.

Sure, makes sense.  But if it's supposed to use the
small-storage optimization, it can't do such a big malloc by
default.

Is `SmallDenseSet<>` hard to write?  I figure you just need
to do something like:
--
template <class ..., class DenseMapTy> struct DenseSetBase {};
template <...> struct DenseSet
   : DenseSetBase<..., DenseMap<...>> {};
template <...> struct SmallDenseSet
   : DenseSetBase<..., SmallDenseMap<...>> {};
--

(Although if you're going there, it sure would be nice to pivot
the relationship between DenseSet and DenseMap.  It's really
awkward that the former inherits from the latter.  If anything
it should be the reverse, but they should probably have a
shared base instead.

I'd also appreciate having these three things separated out
somehow:
1. Is this a set or a map?
2. Does it use small storage (SmallDenseMap/SmallPtrSet)?
3. Does it use linear searches (SmallPtrSet) or quadratic
probing (SmallDenseMap) for the small storage?

I usually want linear searches for my small storage, since
quadratic probing jumps to big storage at 75% full.

(I don't really have a vision here, so maybe I'm just ranting.))


More information about the llvm-commits mailing list