<div dir="ltr">Ah, awesome - looks great. Thanks for the reference!</div><br><div class="gmail_quote"><div dir="ltr">On Mon, Nov 28, 2016 at 9:38 AM Rui Ueyama <<a href="mailto:ruiu@google.com">ruiu@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">Sorry, I mean <a href="https://reviews.llvm.org/D27155" class="gmail_msg" target="_blank">https://reviews.llvm.org/D27155</a><br class="gmail_msg"></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Mon, Nov 28, 2016 at 9:37 AM, Rui Ueyama <span dir="ltr" class="gmail_msg"><<a href="mailto:ruiu@google.com" class="gmail_msg" target="_blank">ruiu@google.com</a>></span> wrote:<br class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">I eventually wrote three patches for this, and <a href="https://reviews.llvm.org/D27146" class="gmail_msg" target="_blank">https://reviews.llvm.org/D27146</a> is most promising. 
(If you are not aware of that / haven't reached the top of your mail inbox yet.)<div class="gmail_msg"><div class="m_-8952235112187918002h5 gmail_msg"><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Mon, Nov 28, 2016 at 9:33 AM, David Blaikie <span dir="ltr" class="gmail_msg"><<a href="mailto:dblaikie@gmail.com" class="gmail_msg" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">(not much to add except that I kind of love this - really neat idea/direction to pursue/play with possibilities)<br class="gmail_msg"><br class="gmail_msg">As for making this stable though probabilistic: any chance of seeding the RNG with a known value to get stability? (possibly using some of the input contents as the seed, if that's helpful) - still risks pathological cases, I suppose, but should be OK?<br class="gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg"><div class="gmail_msg"><div class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956h5 gmail_msg"><div dir="ltr" class="gmail_msg">On Sat, Nov 26, 2016 at 9:11 PM Rui Ueyama via Phabricator via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" class="gmail_msg" target="_blank">llvm-commits@lists.llvm.org</a>> wrote:<br class="gmail_msg"></div></div></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_msg"><div class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956h5 gmail_msg">ruiu created this revision.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
ruiu added reviewers: rafael, silvas.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
ruiu added a subscriber: llvm-commits.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
I'm sending this patch to get feedback. I haven't convinced even myself<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
that this is the right thing to do. But this should be interesting<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
to those who want to see what we can do to improve the linker's latency.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
String merging is one of the slowest passes in LLD because of the<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
sheer number of mergeable strings. For example, Clang with debug info<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
contains 30 million mergeable strings (average length is about 50<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
bytes). They need to be uniquified, and uniquified strings need to<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
get consecutive offsets in the resulting string table.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
Currently, we are using a (single-threaded, regular) dense map for<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
string uniquification. Merging the 30 million strings takes about 2<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
seconds on my machine.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
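A minimal sketch of that baseline, for illustration only: it is not LLD's code.<br>
StringTable and add() are hypothetical names, and std::unordered_map stands in for the dense map.<br>
Each new string gets the next free offset; a repeated string reuses the offset it already has.<br>

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper, not LLD's actual implementation: uniquify strings and
// give each unique string a consecutive offset in a NUL-terminated table.
struct StringTable {
  std::string Data;                                // concatenated unique strings
  std::unordered_map<std::string, size_t> Offsets; // string -> offset in Data

  // Returns the offset of S in the table, appending S only on first sight.
  size_t add(const std::string &S) {
    auto [It, Inserted] = Offsets.try_emplace(S, Data.size());
    if (Inserted) {
      Data += S;
      Data.push_back('\0');
    }
    return It->second;
  }
};
```

Every lookup and insertion goes through the one map, which is why this loop is hard to spread across threads as written.<br>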
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
This patch implements one of my ideas about how to reduce latency by<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
parallelizing it. This algorithm is probabilistic, meaning that<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
even though duplicated strings are likely to be merged, that's not<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
guaranteed. As a result, it produces a larger string table, but quickly.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
(If you need to optimize in size, you could still pass -O2 which<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
does tail-merging.)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
Here's how it works.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
In the first step, we take 10% of the input string set to create a small<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
string table. The resulting string table is very unlikely to contain<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
all strings of the entire set, but it is likely to contain most of<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
the duplicated strings, because duplicated strings are repeated many times.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
The second step processes the remaining 90% in parallel. In this step,<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
we do not merge strings. So, if a string is not in the small string<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
table we created in the first step, it will just be appended to the end<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
of the string table. This step completes the string table.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
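The two steps can be sketched roughly as follows. This is not the code in D27146.<br>
assignOffsets, the every-10th-string sampling, and the thread count are assumptions made for illustration.<br>
The sketch only computes offsets; copying the bytes into the table is left out.<br>

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical sketch. Returns one table offset per input string and stores
// the final table size in TableSize.
std::vector<size_t> assignOffsets(const std::vector<std::string> &Strings,
                                  size_t &TableSize, unsigned NumThreads = 4) {
  if (NumThreads == 0)
    NumThreads = 1;
  std::vector<size_t> Offsets(Strings.size());
  std::unordered_map<std::string, size_t> Sample;
  size_t End = 0;

  // Step 1 (single-threaded): uniquify every 10th string (assumed sampling).
  for (size_t I = 0; I < Strings.size(); I += 10) {
    auto [It, Inserted] = Sample.try_emplace(Strings[I], End);
    if (Inserted)
      End += Strings[I].size() + 1; // +1 for the NUL terminator
    Offsets[I] = It->second;
  }

  // Step 2 (parallel): the map is read-only from here on. A hit reuses the
  // sampled offset; a miss reserves fresh space at the end, without merging.
  std::atomic<size_t> Tail{End};
  auto Worker = [&](size_t Begin, size_t Stop) {
    for (size_t I = Begin; I < Stop; ++I) {
      if (I % 10 == 0)
        continue; // already handled in step 1
      auto It = Sample.find(Strings[I]);
      Offsets[I] = It != Sample.end() ? It->second
                                      : Tail.fetch_add(Strings[I].size() + 1);
    }
  };

  std::vector<std::thread> Threads;
  size_t Chunk = (Strings.size() + NumThreads - 1) / NumThreads;
  for (unsigned T = 0; T < NumThreads; ++T) {
    size_t Begin = std::min(Strings.size(), T * Chunk);
    size_t Stop = std::min(Strings.size(), Begin + Chunk);
    Threads.emplace_back(Worker, Begin, Stop);
  }
  for (std::thread &T : Threads)
    T.join();

  TableSize = Tail.load();
  return Offsets;
}
```

Because step 2 never writes to the map, threads only contend on the atomic tail counter.<br>
The cost is that two equal strings that both miss the sample each get their own copy.<br>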
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
Here are some numbers for the resulting clang executables:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Size of .debug_str section:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Current            108,049,822   (+0%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Probabilistic      154,089,550   (+42.6%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  No string merging  1,591,388,940 (+1372.8%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Size of resulting file:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Current            1,440,453,528 (+0%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Probabilistic      1,490,597,448 (+3.5%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  No string merging  2,945,020,808 (+104.5%)<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
The probabilistic algorithm produces a larger string table, but it is<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
much smaller than the one without string merging. Compared to the entire<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
executable size, the loss is only 3.5%.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
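For anyone who wants to recheck the percentages, the relative growth can be computed from the byte counts.<br>
overheadPercent is a hypothetical helper name, not something from the patch.<br>

```cpp
#include <cstdint>

// Relative growth of New over Base, in percent.
double overheadPercent(std::uint64_t Base, std::uint64_t New) {
  return (static_cast<double>(New) - static_cast<double>(Base)) * 100.0 /
         static_cast<double>(Base);
}
```

With the byte counts above, the probabilistic algorithm comes out at about 42.6% for .debug_str and about 3.5% for the whole file.<br>
The same calculation for no string merging gives about 1372.8% and about 104.5%.<br>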
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
Here is the speedup in latency:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  Before:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
     36098.025468 task-clock (msec)         #    5.256 CPUs utilized            ( +-  0.95% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
          190,770 context-switches          #    0.005 M/sec                    ( +-  0.25% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
            7,609 cpu-migrations            #    0.211 K/sec                    ( +- 11.40% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
        2,378,416 page-faults               #    0.066 M/sec                    ( +-  0.07% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
   99,645,202,279 cycles                    #    2.760 GHz                      ( +-  0.94% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
   81,128,226,367 stalled-cycles-frontend   #   81.42% frontend cycles idle     ( +-  1.10% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  <not supported> stalled-cycles-backend<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
   45,662,681,567 instructions              #    0.46  insns per cycle<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
                                            #    1.78  stalled cycles per insn  ( +-  0.14% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
    8,864,616,311 branches                  #  245.571 M/sec                    ( +-  0.22% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
      146,360,227 branch-misses             #    1.65% of all branches          ( +-  0.06% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
      6.868559257 seconds time elapsed                                          ( +-  0.50% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  After:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
     36905.733802 task-clock (msec)         #    7.061 CPUs utilized            ( +-  0.84% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
          159,813 context-switches          #    0.004 M/sec                    ( +-  0.24% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
            8,079 cpu-migrations            #    0.219 K/sec                    ( +- 12.67% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
        2,296,298 page-faults               #    0.062 M/sec                    ( +-  0.21% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  102,178,380,224 cycles                    #    2.769 GHz                      ( +-  0.83% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
   83,846,653,367 stalled-cycles-frontend   #   82.06% frontend cycles idle     ( +-  0.96% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  <not supported> stalled-cycles-backend<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
   46,138,345,206 instructions              #    0.45  insns per cycle<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
                                            #    1.82  stalled cycles per insn  ( +-  0.15% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
    8,824,763,690 branches                  #  239.116 M/sec                    ( +-  0.24% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
      142,482,338 branch-misses             #    1.61% of all branches          ( +-  0.05% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
      5.227024403 seconds time elapsed                                          ( +-  0.43% )<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
In terms of latency, this algorithm is a clear win.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
With these results, I have a feeling that this algorithm could be<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
a reasonable addition to LLD. For only a few percent of loss in size,<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
it reduces latency by about 24%, so it might be a good option for<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
daily edit-build-test cycles (on the other hand, disabling string<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
merging with -O0 creates 2x larger executables, which is sometimes<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
inconvenient even for a daily development cycle). You can still pass<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
-O2 to produce production binaries.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
I have another idea to reduce string merging latency, so I'll<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
implement that later for comparison.<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<a href="https://reviews.llvm.org/D27146" rel="noreferrer" class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg" target="_blank">https://reviews.llvm.org/D27146</a><br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
Files:<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  ELF/InputSection.h<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  ELF/OutputSections.cpp<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
  ELF/OutputSections.h<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg"></div></div>
_______________________________________________<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
llvm-commits mailing list<br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<a href="mailto:llvm-commits@lists.llvm.org" class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg" target="_blank">llvm-commits@lists.llvm.org</a><br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br class="m_-8952235112187918002m_6399262071785862524m_-3933300644678683956m_1448663242386012035gmail_msg gmail_msg">
</blockquote></div></div>
</blockquote></div><br class="gmail_msg"></div></div></div></div>
</blockquote></div><br class="gmail_msg"></div>
</blockquote></div>