[lld] r232460 - [ELF] Use parallel_for_each for writing.

Sean Silva chisophugis at gmail.com
Wed Mar 18 17:35:42 PDT 2015


Actually I'm wondering if we're doing *anything* in parallel, since perf is
reporting "0.999 CPUs utilized".
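
One quick way to check (just a sketch, not lld's parallel_for_each itself;
noteCurrentThread is a hypothetical helper, and std::async below is only a
stand-in for whatever the loop runs on): record the thread id inside each
loop iteration and count how many distinct threads the iterations land on.

#include <cstdio>
#include <future>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

static std::set<std::thread::id> SeenThreads;
static std::mutex SeenMutex;

// Call this from inside each iteration of the loop in question.
static void noteCurrentThread() {
  std::lock_guard<std::mutex> Lock(SeenMutex);
  SeenThreads.insert(std::this_thread::get_id());
}

int main() {
  // Stand-in for the parallel loop under suspicion.
  std::vector<std::future<void>> Tasks;
  for (int I = 0; I < 16; ++I)
    Tasks.push_back(std::async(std::launch::async, noteCurrentThread));
  for (auto &T : Tasks)
    T.get();
  // A loop that silently degraded to serial execution prints exactly 1.
  std::printf("iterations ran on %zu distinct thread(s)\n", SeenThreads.size());
}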

-- Sean Silva

On Wed, Mar 18, 2015 at 9:38 AM, Rui Ueyama <ruiu at google.com> wrote:

> It's not strange. Making something parallel doesn't always make it run
> faster; oftentimes it makes things even slower. That's exactly why I
> emphasized the importance of accurate benchmarking. (Note that this is a
> result of linking Clang. You might see different results with other
> programs.)
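>
> To make that concrete, here is a minimal sketch (not the actual lld
> writer; the Section struct and the sizes in main are made up for
> illustration). Each output section is copied into a disjoint range of the
> output buffer, so the copies need no locking, but the whole job is a
> memcpy and thus bound by memory bandwidth rather than CPU, which is why
> extra threads may not help:
>
> #include <cstddef>
> #include <cstdint>
> #include <cstring>
> #include <thread>
> #include <vector>
>
> struct Section {
>   const uint8_t *Data;   // section contents
>   std::size_t Size;      // number of bytes to write
>   std::size_t FileOffset; // where the section lands in the output image
> };
>
> // One thread per section; the file offsets are disjoint, so the copies
> // are trivially race-free.
> void writeSections(uint8_t *Out, const std::vector<Section> &Sections) {
>   std::vector<std::thread> Workers;
>   Workers.reserve(Sections.size());
>   for (const Section &S : Sections)
>     Workers.emplace_back([Out, S] {
>       std::memcpy(Out + S.FileOffset, S.Data, S.Size);
>     });
>   for (std::thread &T : Workers)
>     T.join();
> }
>
> int main() {
>   static const uint8_t Text[] = {0x90, 0x90};
>   static const uint8_t Data[] = {1, 2, 3, 4};
>   std::vector<Section> Sections = {{Text, sizeof(Text), 0},
>                                    {Data, sizeof(Data), 2}};
>   std::vector<uint8_t> Out(6);
>   writeSections(Out.data(), Sections);
> }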
>
> Rafael, it's the ELF writer. Unless you cross link ELF executables on
> Windows, this piece of code is not executed on Windows.
>
> On Wed, Mar 18, 2015 at 9:32 AM, Rafael Espíndola <
> rafael.espindola at gmail.com> wrote:
>
>> As with anything threading related, it might also be worth
>> benchmarking it on Windows.
>>
>> On 18 March 2015 at 12:31, Shankar Easwaran <shankare at codeaurora.org>
>> wrote:
>> > It looks like these are the right numbers. Strange, I don't see a big
>> > advantage from the patch that tries to write the output sections in
>> > parallel.
>> >
>> >
>> >> On 3/18/2015 11:23 AM, Rafael Espíndola wrote:
>> >>
>> >> On 18 March 2015 at 12:14, Shankar Easwaran <shankare at codeaurora.org>
>> >> wrote:
>> >>>
>> >>> Does this repeat with the same numbers across similar tries ?
>> >>
>> >> The "-r 20" tells perf to do 20 runs. Repeating the entire thing as a
>> >> sanity check, I got:
>> >>
>> >>
>> >> master:
>> >>         1850.315854      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.20% )
>> >>               1,246      context-switches          #    0.673 K/sec
>> >>                   0      cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
>> >>             191,223      page-faults               #    0.103 M/sec                    ( +-  0.00% )
>> >>       5,570,279,746      cycles                    #    3.010 GHz                      ( +-  0.08% )
>> >>       3,076,652,220      stalled-cycles-frontend   #   55.23% frontend cycles idle     ( +-  0.15% )
>> >>     <not supported>      stalled-cycles-backend
>> >>       6,061,467,442      instructions              #    1.09  insns per cycle
>> >>                                                    #    0.51  stalled cycles per insn  ( +-  0.00% )
>> >>       1,262,014,047      branches                  #  682.053 M/sec                    ( +-  0.00% )
>> >>          26,526,169      branch-misses             #    2.10% of all branches          ( +-  0.00% )
>> >>
>> >>         1.852094924 seconds time elapsed                                               ( +-  0.20% )
>> >>
>> >> master minus your patch:
>> >>
>> >>         1837.986418      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.01% )
>> >>               1,170      context-switches          #    0.637 K/sec
>> >>                   0      cpu-migrations            #    0.000 K/sec
>> >>             191,225      page-faults               #    0.104 M/sec                    ( +-  0.00% )
>> >>       5,517,484,340      cycles                    #    3.002 GHz                      ( +-  0.01% )
>> >>       3,036,583,530      stalled-cycles-frontend   #   55.04% frontend cycles idle     ( +-  0.02% )
>> >>     <not supported>      stalled-cycles-backend
>> >>       6,004,436,870      instructions              #    1.09  insns per cycle
>> >>                                                    #    0.51  stalled cycles per insn  ( +-  0.00% )
>> >>       1,250,685,716      branches                  #  680.465 M/sec                    ( +-  0.00% )
>> >>          26,539,486      branch-misses             #    2.12% of all branches          ( +-  0.00% )
>> >>
>> >>         1.839759787 seconds time elapsed                                               ( +-  0.01% )
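>> >>
>> >> The "( +- x% )" figures are the run-to-run variation perf measured
>> >> across those 20 runs, roughly the relative standard error of the mean.
>> >> If you want to reproduce the methodology by hand, a minimal sketch
>> >> (the workload() below is a stand-in, not the actual link step):
>> >>
>> >> #include <chrono>
>> >> #include <cmath>
>> >> #include <cstdio>
>> >> #include <vector>
>> >>
>> >> static void workload() {
>> >>   volatile unsigned Sink = 0; // placeholder for the command under test
>> >>   for (unsigned I = 0; I < 50000000; ++I)
>> >>     Sink += I;
>> >> }
>> >>
>> >> int main() {
>> >>   const int Runs = 20;
>> >>   std::vector<double> Secs;
>> >>   for (int I = 0; I < Runs; ++I) {
>> >>     auto Start = std::chrono::steady_clock::now();
>> >>     workload();
>> >>     auto End = std::chrono::steady_clock::now();
>> >>     Secs.push_back(std::chrono::duration<double>(End - Start).count());
>> >>   }
>> >>   double Mean = 0;
>> >>   for (double S : Secs)
>> >>     Mean += S;
>> >>   Mean /= Runs;
>> >>   double Var = 0;
>> >>   for (double S : Secs)
>> >>     Var += (S - Mean) * (S - Mean);
>> >>   // Report mean elapsed time with its relative standard error,
>> >>   // mirroring perf's "seconds time elapsed ( +- x% )" line.
>> >>   double StdErr = std::sqrt(Var / (Runs - 1)) / std::sqrt((double)Runs);
>> >>   std::printf("%.9f seconds time elapsed ( +- %.2f%% )\n",
>> >>               Mean, 100.0 * StdErr / Mean);
>> >> }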
>> >>
>> >>
>> >> Cheers,
>> >> Rafael
>> >>
>> >
>> >
>> > --
>> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> > hosted by the Linux Foundation
>> >
>>
>
>