<div class="gmail_extra"><div class="gmail_quote">On Tue, Aug 21, 2012 at 9:54 AM, Sean Silva <span dir="ltr">&lt;<a href="mailto:silvas@purdue.edu" target="_blank" class="cremed">silvas@purdue.edu</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">> DenseSet is implemented as a DenseMap&lt;foo, char&gt;; if you stick a pointer in<br>

> there every entry is sizeof(void *) + 1 + padding = 2 * sizeof(void*), a<br>

> significant memory overhead.<br>

<br>

</div>Currently DenseMap uses just a regular pair&lt;K,V&gt; as the bucket type.<br>

If this were replaced with something like libc++'s __compressed_pair,<br>

then you could just use an empty struct as the mapped_type and this<br>

overhead would go away. Does LLVM have something like<br>

__compressed_pair already? I can't seem to find one, but I'd like a<br>

second opinion.<br>

<br>

If it doesn't, then I'll put together a patch adding a compressed<br>

pair, and then fix up DenseMap to use it. It seems like almost every<br>

use of DenseSet in the LLVM/Clang codebase would benefit from this<br>

(bucket array would become half as big!).<br></blockquote><div><br></div><div>I'm not at all sure this complexity is worth it. Are you sure that the DenseSet instances in the codebase are really suffering? I think we use SmallPtrSet (which doesn't have this problem IIRC) much more commonly.</div>
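<div><br></div><div>For reference, the size arithmetic under discussion can be checked with a minimal sketch like the one below. It assumes a typical ABI where a pointer's alignment equals its size. <code>Empty</code> and <code>CompressedPair</code> are hypothetical names for illustration; they are not existing LLVM types and not libc++'s __compressed_pair. The sketch also shows that an empty mapped type in a plain pair still occupies a padded slot, which is why the proposal needs a compressed pair at all.</div>

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical empty mapped type, standing in for the "empty struct" idea.
struct Empty {};

// Hypothetical compressed pair (an illustration, not libc++'s
// __compressed_pair). General case: store both members.
template <typename K, typename V, bool = std::is_empty<V>::value>
class CompressedPair {
  K Key;
  V Val;

public:
  K &first() { return Key; }
  V &second() { return Val; }
};

// Empty V: derive from it so the empty base optimization can drop its storage.
template <typename K, typename V>
class CompressedPair<K, V, true> : private V {
  K Key;

public:
  K &first() { return Key; }
  V &second() { return *this; }
};

// Bucket sizes for a pointer key under each layout.
constexpr std::size_t PlainCharBucket = sizeof(std::pair<void *, char>);
constexpr std::size_t PlainEmptyBucket = sizeof(std::pair<void *, Empty>);
constexpr std::size_t CompressedBucket = sizeof(CompressedPair<void *, Empty>);
```

<div>On a typical x86_64 ABI the two plain-pair buckets come out at 2 * sizeof(void*) and the compressed one at sizeof(void*), which is the halving of the bucket array mentioned above.</div>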

<div><br></div><div>If you have some compelling reason to invest so much energy here, I would rather see a direct set implementation, with a map implementation layered on top of it, than compressed pair hacking to make a map actually look more like a set.</div>
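<div><br></div><div>To make the layering concrete, here is a rough sketch of one possible reading of "set first, map on top": a single table that stores bare elements, with the map expressed as a table of pairs keyed on <code>first</code>. Every name here (<code>SketchTable</code>, <code>SketchSet</code>, <code>SketchMap</code>, <code>IdentityKey</code>, <code>FirstKey</code>) is hypothetical and none of this is existing LLVM code. For brevity it assumes default-constructible elements, tracks occupancy in a side bit vector rather than reserving empty and tombstone keys the way DenseMap does, and omits erase.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Traits that tell the table how to find the lookup key inside an element.
template <typename T> struct IdentityKey {
  using KeyT = T;
  static const T &key(const T &Elt) { return Elt; }
};
template <typename K, typename V> struct FirstKey {
  using KeyT = K;
  static const K &key(const std::pair<K, V> &Elt) { return Elt.first; }
};

// Open-addressing table with linear probing; each bucket holds one element.
template <typename EltT, typename KeyOf> class SketchTable {
  using KeyT = typename KeyOf::KeyT;
  std::vector<EltT> Buckets; // size is always a power of two
  std::vector<bool> Used;    // occupancy kept outside the buckets
  std::size_t NumElts = 0;

  // Index of the bucket holding K, or of the first free bucket on its path.
  std::size_t probe(const KeyT &K) const {
    std::size_t Mask = Buckets.size() - 1;
    std::size_t I = std::hash<KeyT>()(K) & Mask;
    while (Used[I] && !(KeyOf::key(Buckets[I]) == K))
      I = (I + 1) & Mask;
    return I;
  }

  // Double the bucket array and reinsert every live element.
  void grow() {
    std::vector<EltT> OldBuckets(Buckets.size() * 2);
    std::vector<bool> OldUsed(OldBuckets.size(), false);
    OldBuckets.swap(Buckets);
    OldUsed.swap(Used);
    for (std::size_t I = 0; I != OldBuckets.size(); ++I) {
      if (!OldUsed[I])
        continue;
      std::size_t J = probe(KeyOf::key(OldBuckets[I]));
      Buckets[J] = OldBuckets[I];
      Used[J] = true;
    }
  }

public:
  SketchTable() : Buckets(8), Used(8, false) {}

  // Returns the stored element and whether it was newly inserted.
  std::pair<EltT *, bool> insert(const EltT &Elt) {
    if ((NumElts + 1) * 4 > Buckets.size() * 3) // keep load factor <= 3/4
      grow();
    std::size_t I = probe(KeyOf::key(Elt));
    if (Used[I])
      return {&Buckets[I], false};
    Buckets[I] = Elt;
    Used[I] = true;
    ++NumElts;
    return {&Buckets[I], true};
  }

  // Returns the stored element for K, or nullptr if absent.
  EltT *find(const KeyT &K) {
    std::size_t I = probe(K);
    return Used[I] ? &Buckets[I] : nullptr;
  }

  std::size_t size() const { return NumElts; }
};

// The set stores bare keys; the map is the same table over key/value pairs.
template <typename T> using SketchSet = SketchTable<T, IdentityKey<T>>;
template <typename K, typename V>
using SketchMap = SketchTable<std::pair<K, V>, FirstKey<K, V>>;
```

<div>With this shape a <code>SketchSet&lt;void *&gt;</code> bucket is just one pointer wide, and the map pays for a value slot only because its element type is a pair.</div>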

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class="HOEnZb"><font color="#888888"><br>

--Sean Silva<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

On Tue, Aug 21, 2012 at 8:23 AM, Benjamin Kramer &lt;<a href="mailto:benny.kra@gmail.com" class="cremed">benny.kra@gmail.com</a>&gt; wrote:<br>

><br>

> On 21.08.2012, at 05:19, Sean Silva &lt;<a href="mailto:silvas@purdue.edu" class="cremed">silvas@purdue.edu</a>&gt; wrote:<br>

><br>

>> Would SmallDenseMap be more efficient here since SmallPtrSet does a<br>

>> linear scan? There's also a SmallPtrSet of size 64 in the DAG combiner<br>

>> that I think would significantly benefit from becoming a SmallDenseMap<br>

>> (64 means that when it is pretty large but still in its small storage<br>

>> it does a linear scan over up to 8 cache lines).<br>

>><br>

>> More generally, in what situations is a SmallPtrSet going to be<br>

>> preferable to a SmallDenseMap?<br>

><br>

> For only 8 entries the entire linearly scanned buffer fits into 8*8=64 bytes on<br>

> x86_64, which should be really fast even with a linear scan, maybe even faster<br>

> than doing a hash lookup.<br>

><br>

> For 64*8 bytes the situation is different; that SmallPtrSet should be either<br>

> shrunk or replaced with a SmallDenseMap (which carries its own overhead because<br>

> it stores keys and values).<br>

><br>

> DenseSet is implemented as a DenseMap&lt;foo, char&gt;; if you stick a pointer in<br>

> there every entry is sizeof(void *) + 1 + padding = 2 * sizeof(void*), a<br>

> significant memory overhead.<br>

><br>

> Of course, if malloc thrashing in turn dwarfs this overhead, SmallDenseMap<br>

> would be a good idea. Real numbers would be nice but I guess it's hard to<br>

> find a test case where llvm spends a lot of time in badly configured<br>

> SmallPtrSets.<br>

><br>

> - Ben<br>

><br>


</div></div></blockquote></div><br></div>