<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Mar 15, 2016, at 4:09 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" class="">listmail@philipreames.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<br class="">
<br class="">
<div class="moz-cite-prefix">On 03/15/2016 03:07 PM, via llvm-dev
wrote:<br class="">
</div>
<blockquote cite="mid:B463FEBC-9600-4BFA-8451-F02A648E6274@apple.com" type="cite" class="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" class="">
There’s a few passes in LLVM that make heavy use of a big
DenseMap, one that potentially gets filled with up to 1 entry for
each instruction in the function. EarlyCSE is the best example,
but Reassociate and MachineCSE have this to some degree as well
(there might be others?). To put it simply: at least in my
profile, EarlyCSE spends ~1/5 of its time growing DenseMaps. This
is kind of… bad.
<div class=""><br class="">
</div>
<div class="">grow() is inherently slow because it needs to rehash
and reinsert everything. This means growing a DenseMap costs
much, much more than growing, for example, a vector. I talked
about this with a few people and here are some possibilities
we’ve come up with to improve this (some of which probably
aren’t what we want):</div>
<div class=""><br class="">
</div>
<div class="">1. Use a map that doesn’t require rehashing and
reinsertion to grow. Chaining lets you do this, but
std::unordered_map is probably so much slower than DenseMap we’d
lose more than we gain.</div>
<div class="">2. Include the hash code in the map so that we don’t
have to rehash. 32 bits more per entry (or whatever), and it
might not help that much, since we still have to do the whole
reinsertion routine.</div>
<div class="">3. Pre-calculate an estimate as to the map size we
need. For example, in EarlyCSE, this is possibly gross
overestimate of size needed:</div>
<div class=""><br class="">
</div>
<div class="">
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">unsigned</span> InstCount = <span style="font-variant-ligatures: no-common-ligatures; color:
#272ad8" class="">0</span>;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">unsigned</span> LoadCount = <span style="font-variant-ligatures: no-common-ligatures; color:
#272ad8" class="">0</span>;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">unsigned</span> CallCount = <span style="font-variant-ligatures: no-common-ligatures; color:
#272ad8" class="">0</span>;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">for</span> (<span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">inst_iterator</span> FI = <span style="font-variant-ligatures: no-common-ligatures; color:
#31595d" class="">inst_begin</span>(<span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">F</span>), FE = <span style="font-variant-ligatures: no-common-ligatures; color:
#31595d" class="">inst_end</span>(<span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">F</span>); FI != FE; ++FI) {</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo; color: rgb(49, 89, 93);" class=""><span style="" class=""> </span><span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">if</span><span style="" class=""> (FI-></span>mayReadOrWriteMemory<span style="" class="">())</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> ++LoadCount;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">else</span> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">if</span> (<span style="font-variant-ligatures: no-common-ligatures; color:
#31595d" class="">isa</span><<span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">CallInst</span>>(*FI))</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> ++CallCount;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#bb2ca2" class="">else</span></div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> ++InstCount;</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> }</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">AvailableValues</span>.<span style="font-variant-ligatures: no-common-ligatures; color:
#3d1d81" class="">resize</span>(InstCount);</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">AvailableLoads</span>.<span style="font-variant-ligatures: no-common-ligatures; color:
#3d1d81" class="">resize</span>(LoadCount);</div>
<div style="margin: 0px; font-size: 11px; line-height: normal;
font-family: Menlo;" class=""> <span style="font-variant-ligatures: no-common-ligatures; color:
#4f8187" class="">AvailableCalls</span>.<span style="font-variant-ligatures: no-common-ligatures; color:
#3d1d81" class="">resize</span>(CallCount);</div>
</div>
<div class=""><br class="">
</div>
<div class="">But it does the job, and saves ~20% of time in
EarlyCSE on my profiles. Yes, iterating over the entire function
is way cheaper than grow(). Downsides are that while it’s still
bounded by function size, it could end up allocating a good bit
more depending on — in EarlyCSE’s case — the control
flow/dominator structure.</div>
<div class=""><br class="">
</div>
<div class="">Any thoughts on this, or other less ugly
alternatives? I estimate that, at least in our pass pipeline,
we’re losing at least ~1% of total time to avoidable
DenseMap::grow() operations, which feels a little bit…
unnecessary.</div>
</blockquote>
<br class="">
Slightly OT, but it looks like the LoadValue (value type of the
AvailableLoads structure) is relatively memory inefficient. At
minimum, we could get rid of the IsAtomic space by using a tagged
pointer. That would at least bring us down to a 128 bits (a nice
power of two). That might help speed up some of the copying. </div></div></blockquote><br class=""></div><div>Just to note, it looks like I was testing on a somewhat older LLVM version that didn’t have LoadValue at all, so my guess is this means it’s even -worse- in trunk.</div><div><br class=""></div><div>—escha</div></body></html>