<div dir="ltr"><div>I reformatted your results here. As you can see, the S/N is too low. We probably cannot say anything from only four data points.</div><div><br></div><div>LLD with patch</div><div>4.16user 0.80system 0:03.06elapsed 162%CPU (0avgtext+0avgdata 7174160maxresident)k</div><div>3.94user 0.86system 0:02.93elapsed 163%CPU (0avgtext+0avgdata 7175808maxresident)k</div><div>4.36user 1.05system 0:03.08elapsed 175%CPU (0avgtext+0avgdata 7176320maxresident)k</div><div>4.17user 0.72system 0:02.93elapsed 166%CPU (0avgtext+0avgdata 7175120maxresident)k</div><div><br></div><div>LLD without patch</div><div>4.49user 0.92system 0:03.32elapsed 162%CPU (0avgtext+0avgdata 7179984maxresident)k</div><div>4.12user 0.83system 0:03.22elapsed 154%CPU (0avgtext+0avgdata 7172704maxresident)k</div><div>4.38user 0.90system 0:03.14elapsed 168%CPU (0avgtext+0avgdata 7175600maxresident)k</div><div>4.20user 0.79system 0:03.08elapsed 161%CPU (0avgtext+0avgdata 7174864maxresident)k</div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 17, 2015 at 2:57 PM, Shankar Easwaran <span dir="ltr"><<a href="mailto:shankare@codeaurora.org" target="_blank">shankare@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div><br>
      I measured this again with four tries, just to make sure, and I
      see a few results identical to what I measured before:<br>
      <br>
      <u><b>Raw data below :-</b></u><br>
      <br>
      LLD Try With Patch #1<br>
      4.16user 0.80system 0:03.06elapsed 162%CPU (0avgtext+0avgdata
      7174160maxresident)k<br>
      LLD Try Without Patch #1<br>
      4.49user 0.92system 0:03.32elapsed 162%CPU (0avgtext+0avgdata
      7179984maxresident)k<br>
      BFD Try #1<br>
      7.81user 0.68system 0:08.53elapsed 99%CPU (0avgtext+0avgdata
      3230416maxresident)k<br>
      LLD Try With Patch #2<br>
      3.94user 0.86system 0:02.93elapsed 163%CPU (0avgtext+0avgdata
      7175808maxresident)k<br>
      LLD Try Without Patch #2<br>
      4.12user 0.83system 0:03.22elapsed 154%CPU (0avgtext+0avgdata
      7172704maxresident)k<br>
      BFD Try #2<br>
      7.78user 0.75system 0:08.57elapsed 99%CPU (0avgtext+0avgdata
      3230416maxresident)k<br>
      LLD Try With Patch #3<br>
      4.36user 1.05system 0:03.08elapsed 175%CPU (0avgtext+0avgdata
      7176320maxresident)k<br>
      LLD Try Without Patch #3<br>
      4.38user 0.90system 0:03.14elapsed 168%CPU (0avgtext+0avgdata
      7175600maxresident)k<br>
      BFD Try #3<br>
      7.78user 0.64system 0:08.46elapsed 99%CPU (0avgtext+0avgdata
      3230416maxresident)k<br>
      LLD Try With Patch #4<br>
      4.17user 0.72system 0:02.93elapsed 166%CPU (0avgtext+0avgdata
      7175120maxresident)k<br>
      LLD Try Without Patch #4<br>
      4.20user 0.79system 0:03.08elapsed 161%CPU (0avgtext+0avgdata
      7174864maxresident)k<br>
      BFD Try #4<br>
      7.77user 0.66system 0:08.46elapsed 99%CPU (0avgtext+0avgdata
      3230416maxresident)k<br>
      <br>
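      A minimal sketch, assuming Python 3 and using only the elapsed and
      user values from the LLD runs listed above, of one way to compare
      the with-patch and without-patch means against the run-to-run
      spread. The helper name summarize is hypothetical, and taking the
      larger sample standard deviation as the "noise" is only a crude
      estimate, not a proper significance test:<br>

```python
from statistics import mean, stdev

# Elapsed and user times in seconds, copied from the four LLD runs above.
elapsed_with = [3.06, 2.93, 3.08, 2.93]
elapsed_without = [3.32, 3.22, 3.14, 3.08]
user_with = [4.16, 3.94, 4.36, 4.17]
user_without = [4.49, 4.12, 4.38, 4.20]


def summarize(label, with_patch, without_patch):
    """Print and return (mean difference, crude noise estimate)."""
    diff = mean(without_patch) - mean(with_patch)
    # Use the larger of the two sample standard deviations as the noise level.
    noise = max(stdev(with_patch), stdev(without_patch))
    print(f"{label}: with={mean(with_patch):.3f}s "
          f"without={mean(without_patch):.3f}s "
          f"diff={diff:.3f}s noise~{noise:.3f}s")
    return diff, noise


elapsed_diff, elapsed_noise = summarize("elapsed", elapsed_with, elapsed_without)
user_diff, user_noise = summarize("user", user_with, user_without)
```

      With these four runs, the user-time difference (about 0.14 s) is
      smaller than the spread between runs (about 0.17 s), while the
      elapsed-time difference (about 0.19 s) is less than twice its
      spread (about 0.10 s). More runs would be needed to say anything
      firm.<br>
      <br>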
      <u><b>Questions :-</b></u><br>
      <br>
      As Rui mentions, I don't know why the user time is higher without
      the patch. Are there any methods to verify this?<br>
      Could this be because of user threads instead of kernel threads?
      <b><br><span class="HOEnZb"><font color="#888888">
      </font></span></b><span class="HOEnZb"><font color="#888888"><br>
      Shankar Easwaran</font></span><div><div class="h5"><br>
      <br>
      On 3/17/2015 3:35 PM, Shankar Easwaran wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">Yes,
      this is true. There were several logs of runs in the same file
      that I read into the commit message, and manually removing them
      left two "user" lines.
      <br>
      <br>
      But the result is, for all practical purposes, true. I can
      re-measure the time taken though.
      <br>
      <br>
      Shankar Easwaran
      <br>
      <br>
      On 3/17/2015 2:30 PM, Rui Ueyama wrote:
      <br>
      <blockquote type="cite">On Mon, Mar 16, 2015 at 8:29 PM, Shankar
        Easwaran <a href="mailto:shankare@codeaurora.org" target="_blank"><shankare@codeaurora.org></a>
        <br>
        wrote:
        <br>
        <br>
        <blockquote type="cite">Author: shankare
          <br>
          Date: Mon Mar 16 22:29:32 2015
          <br>
          New Revision: 232460
          <br>
          <br>
          URL:
          <a href="http://llvm.org/viewvc/llvm-project?rev=232460&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=232460&view=rev</a>
          <br>
          Log:
          <br>
          [ELF] Use parallel_for_each for writing.
          <br>
          <br>
          This change improves the performance of lld, when self-hosting
          lld, compared with the bfd linker. The BFD linker takes 8
          seconds of elapsed time on average.
          <br>
          lld takes 3 seconds of elapsed time on average. Without this
          change, lld takes ~5 seconds on average. The runtime
          comparisons were done on a release build and
          <br>
          measured by running the link thrice.
          <br>
          <br>
          lld self-host without the change
          <br>
          ----------------------------------
          <br>
          real    0m3.196s
          <br>
          user    0m4.580s
          <br>
          sys     0m0.832s
          <br>
          <br>
          lld self-host with lld
          <br>
          -----------------------
          <br>
          user    0m3.024s
          <br>
          user    0m3.252s
          <br>
          sys     0m0.796s
          <br>
          <br>
        </blockquote>
        The above results don't look like real output of the "time" command.
        <br>
        <br>
        If they are real, they are too good to be true, assuming the first line
        of the
        <br>
        second result is "real" instead of "user".
        <br>
        <br>
        "real" is wall clock time from process start to process exit.
        "user" is CPU
        <br>
        time consumed by the process in user mode (if a process is
        multi-threaded,
        <br>
        it can be larger than real).
        <br>
        <br>
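        A minimal sketch, assuming Python 3, of how the relationship
        between these numbers can be checked on one of the GNU time
        lines from this thread. The function names parse_time_line and
        cpu_utilization are hypothetical, and the pattern only handles
        the m:ss.ss elapsed form seen here:<br>

```python
import re

# Matches the GNU time default format seen in this thread, e.g.
# "4.16user 0.80system 0:03.06elapsed 162%CPU".
# Assumption: elapsed is always minutes:seconds (no hours field).
TIME_RE = re.compile(r"([\d.]+)user\s+([\d.]+)system\s+(\d+):([\d.]+)elapsed")


def parse_time_line(line):
    """Return (user, system, elapsed) in seconds from one time line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("not a recognized time line: " + repr(line))
    user = float(m.group(1))
    system = float(m.group(2))
    elapsed = int(m.group(3)) * 60 + float(m.group(4))
    return user, system, elapsed


def cpu_utilization(user, system, elapsed):
    """CPU time consumed per second of wall-clock time (1.0 = one busy core)."""
    return (user + system) / elapsed


line = "4.16user 0.80system 0:03.06elapsed 162%CPU"
user, system, elapsed = parse_time_line(line)
util = cpu_utilization(user, system, elapsed)
print(f"user={user}s system={system}s elapsed={elapsed}s "
      f"utilization={util:.2f} cores")
```

        Here user time (4.16 s) exceeds elapsed time (3.06 s), which is
        only possible because the work is spread across more than one
        thread. Running the same work in parallel should lower elapsed
        time, not the total of user and system time.<br>
        <br>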
        Your result shows a significant improvement in user time, which
        means you
        <br>
        have significantly reduced the amount of processing time needed to do
        the same
        <br>
        thing compared to before. However, because this change didn't
        change
        <br>
        the algorithm, but just executes it in parallel, that couldn't
        happen.
        <br>
        <br>
        Something's not correct.
        <br>
        <br>
        I appreciate your effort to make LLD faster, but we need to be
        careful
        <br>
        about benchmark results. If we don't measure improvements
        accurately, it's
        <br>
        easy to make an "optimization" that makes things slower.
        <br>
        <br>
        Another important thing is to be skeptical of your own work when you
        optimize
        <br>
        something and measure its effect. It sometimes happens that I
        am
        <br>
        100% sure something is going to improve performance, but it
        actually
        <br>
        doesn't.
        <br>
        <br>
        time taken to build lld with bfd
        <br>
        <blockquote type="cite">--------------------------------
          <br>
          real    0m8.419s
          <br>
          user    0m7.748s
          <br>
          sys     0m0.632s
          <br>
          <br>
          Modified:
          <br>
               lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h
          <br>
               lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h
          <br>
          <br>
          Modified: lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h
          <br>
          URL:
          <br>
<a href="http://llvm.org/viewvc/llvm-project/lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h?rev=232460&r1=232459&r2=232460&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h?rev=232460&r1=232459&r2=232460&view=diff</a>
          <br>
          <br>
==============================================================================
          <br>
          --- lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h
          (original)
          <br>
          +++ lld/trunk/lib/ReaderWriter/ELF/OutputELFWriter.h Mon Mar
          16 22:29:32
          <br>
          2015
          <br>
          @@ -586,8 +586,10 @@ std::error_code
          OutputELFWriter<ELFT>::w
          <br>
              _elfHeader->write(this, _layout, *buffer);
          <br>
              _programHeader->write(this, _layout, *buffer);
          <br>
          <br>
          -  for (auto section : _layout.sections())
          <br>
          -    section->write(this, _layout, *buffer);
          <br>
          +  auto sections = _layout.sections();
          <br>
          +  parallel_for_each(
          <br>
          +      sections.begin(), sections.end(),
          <br>
          +      [&](Chunk<ELFT> *section) {
          section->write(this, _layout, *buffer);
          <br>
          });
          <br>
              writeTask.end();
          <br>
          <br>
              ScopedTask commitTask(getDefaultDomain(), "ELF Writer
          commit to disk");
          <br>
          <br>
          Modified: lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h
          <br>
          URL:
          <br>
<a href="http://llvm.org/viewvc/llvm-project/lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h?rev=232460&r1=232459&r2=232460&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h?rev=232460&r1=232459&r2=232460&view=diff</a>
          <br>
          <br>
==============================================================================
          <br>
          --- lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h (original)
          <br>
          +++ lld/trunk/lib/ReaderWriter/ELF/SectionChunks.h Mon Mar 16
          22:29:32 2015
          <br>
          @@ -234,17 +234,17 @@ public:
          <br>
              /// routine gets called after the linker fixes up the
          virtual address
          <br>
              /// of the section
          <br>
              virtual void assignVirtualAddress(uint64_t addr) override
          {
          <br>
          -    for (auto &ai : _atoms) {
          <br>
          +    parallel_for_each(_atoms.begin(), _atoms.end(),
          [&](AtomLayout *ai) {
          <br>
                  ai->_virtualAddr = addr + ai->_fileOffset;
          <br>
          -    }
          <br>
          +    });
          <br>
              }
          <br>
          <br>
              /// \brief Set the file offset of each Atom in the
          section. This routine
          <br>
              /// gets called after the linker fixes up the section
          offset
          <br>
              void assignFileOffsets(uint64_t offset) override {
          <br>
          -    for (auto &ai : _atoms) {
          <br>
          +    parallel_for_each(_atoms.begin(), _atoms.end(),
          [&](AtomLayout *ai) {
          <br>
                  ai->_fileOffset = offset + ai->_fileOffset;
          <br>
          -    }
          <br>
          +    });
          <br>
              }
          <br>
          <br>
              /// \brief Find the Atom address given a name, this is
          needed to
          <br>
          properly
          <br>
          <br>
          <br>
          <br>
        </blockquote>
      </blockquote>
      <br>
      <br>
    </blockquote>
    <br>
    <br>
    <pre cols="72">-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation</pre>
  </div></div></div>

</blockquote></div><br></div></div>