[llvm-dev] llvm and clang are getting slower

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 9 12:38:45 PST 2016


The LTO time could be explained by a second-order effect: increased
dcache/dTLB pressure caused by a larger memory footprint and poor locality.

David

On Tue, Mar 8, 2016 at 5:47 PM, Sean Silva via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
>
> On Tue, Mar 8, 2016 at 2:25 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>>
>> On Mar 8, 2016, at 1:09 PM, Sean Silva via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>
>>
>> On Tue, Mar 8, 2016 at 10:42 AM, Richard Smith via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> On Tue, Mar 8, 2016 at 8:13 AM, Rafael Espíndola
>>> <llvm-dev at lists.llvm.org> wrote:
>>> > I have just benchmarked building trunk llvm and clang in Debug,
>>> > Release and LTO modes (see the attached script for the cmake lines).
>>> >
>>> > The compilers used were clang 3.5, 3.6, 3.7, 3.8 and trunk. In all
>>> > cases I used the system libgcc and libstdc++.
>>> >
>>> > For release builds there is a monotonic increase in each version. From
>>> > 163 minutes with 3.5 to 212 minutes with trunk. For comparison, gcc
>>> > 5.3.2 takes 205 minutes.
>>> >
>>> > Debug and LTO show an improvement in 3.7, but have regressed again in
>>> > 3.8.
>>>
>>> I'm curious how these times divide across Clang and various parts of
>>> LLVM; rerunning with -ftime-report and summing the numbers across all
>>> compiles could be interesting.
>>>
>>
>> Based on the results I posted upthread about the relative time spent in
>> the backend for debug vs release, we can estimate this.
>> To summarize:
>> 10% of time spent in LLVM for Debug
>> 33% of time spent in LLVM for Release
>> (I'll abbreviate "in LLVM" as just "backend"; this is "backend" from
>> clang's perspective)
>>
>> Let's look at the difference between 3.5 and trunk.
>>
>> For debug, the user time jumps from 174m50.251s to 197m9.932s.
>> That's {10490.3, 11829.9} seconds, respectively.
>> For release, the corresponding numbers are:
>> {9826.71, 12714.3} seconds.
>>
>> debug35 = 10490.251
>> debugTrunk = 11829.932
>>
>> debugTrunk/debug35 == 1.12771
>> debugRatio = 1.12771
>>
>> release35 = 9826.705
>> releaseTrunk = 12714.288
>>
>> releaseTrunk/release35 == 1.29385
>> releaseRatio = 1.29385
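
The conversion and ratios quoted above can be reproduced with a short
Python sketch. The helper name `to_seconds` is made up for illustration;
the only inputs are the user times quoted in the thread (the release
figures are given there in seconds only, so they are entered directly).

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '174m50.251s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Debug build user times quoted in the thread (clang 3.5 vs trunk).
debug35 = to_seconds("174m50.251s")
debug_trunk = to_seconds("197m9.932s")

# Release build user times, already in seconds in the thread.
release35 = 9826.705
release_trunk = 12714.288

debug_ratio = debug_trunk / debug35
release_ratio = release_trunk / release35

print(f"debugRatio   = {debug_ratio:.5f}")
print(f"releaseRatio = {release_ratio:.5f}")
```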
>>
>> For simplicity, let's use a simple linear model for the distribution of
>> slowdown between the frontend and backend: a constant factor slowdown for
>> the backend, and an independent constant factor slowdown for the frontend.
>> This gives the following linear system:
>> debugRatio = .1 * backendRatio + (1 - .1) * frontendRatio
>> releaseRatio = .33 * backendRatio + (1 - .33) * frontendRatio
>>
>> Solving this linear system we find that under this simple model, the
>> expected slowdown factors are:
>> backendRatio = 1.77783
>> frontendRatio = 1.05547
>>
>> Intuitively, backendRatio comes out larger in this comparison because we
>> see the biggest slowdown during release (1.29 vs 1.12), and during release
>> we are spending a larger fraction of time in the backend (33% vs 10%).
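
As a rough check of the two-equation model above, here is a minimal Python
sketch that solves it with Cramer's rule. The function name
`solve_slowdown` and its parameter names are illustrative, not anything
from the thread; the 10% and 33% backend shares and the timings are the
ones quoted above.

```python
def solve_slowdown(debug_ratio, release_ratio,
                   backend_share_debug=0.10, backend_share_release=0.33):
    """Solve the 2x2 system
         debug_ratio   = a1 * backend + (1 - a1) * frontend
         release_ratio = a2 * backend + (1 - a2) * frontend
    for (backend, frontend) using Cramer's rule."""
    a1, a2 = backend_share_debug, backend_share_release
    det = a1 * (1 - a2) - a2 * (1 - a1)  # simplifies to a1 - a2
    if det == 0:
        raise ValueError("backend shares must differ between the two builds")
    backend = (debug_ratio * (1 - a2) - release_ratio * (1 - a1)) / det
    frontend = (a1 * release_ratio - a2 * debug_ratio) / det
    return backend, frontend

# Ratios for 3.5 -> trunk, computed from the user times quoted in the thread.
debug_ratio = 11829.932 / 10490.251
release_ratio = 12714.288 / 9826.705

backend_ratio, frontend_ratio = solve_slowdown(debug_ratio, release_ratio)
print(f"backendRatio  = {backend_ratio:.5f}")
print(f"frontendRatio = {frontend_ratio:.5f}")
```

The solution only exists because the backend share differs between Debug
and Release; the closer the two shares are, the more sensitive the result
becomes to noise in the measured ratios.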
>>
>> Applying this same model across Rafael's data, we find the following
>> (numbers have been rounded for clarity):
>>
>> transition       backendRatio   frontendRatio
>> 3.5->3.6         1.08           1.03
>> 3.6->3.7         1.30           0.95
>> 3.7->3.8         1.34           1.07
>> 3.8->trunk       0.98           1.02
>>
>> Note that in Rafael's measurements LTO is pretty similar to Release from
>> a CPU time (user time) standpoint. While the final LTO link takes a large
>> amount of real time, it is single threaded. Based on the real time numbers
>> the LTO link was only spending about 20 minutes single-threaded (i.e. about
>> 20 minutes CPU time), which is pretty small compared to the 300-400 minutes
>> of total CPU time. It would be interesting to see the numbers for -O0 or
>> -O1 per-TU together with LTO.
>>
>>
>>
>> Just a note about LTO being sequential: Rafael mentioned he was "building
>> trunk llvm and clang". By default I believe it is ~56 link targets that can
>> be run in parallel (provided you have enough RAM to avoid swapping).
>>
>
> D'oh! I was looking at the data wrong since I broke my Fundamental Rule of
> Looking At Data, namely: don't look at raw numbers in a table since you are
> likely to look at things wrong or form biases based on the order in which
> you look at the data points; *always* visualize. There is a significant
> difference between release and LTO. About 2x consistently.
>
> [image: Inline image 3]
>
> This is actually curious because during the release build, we were
> spending 33% of CPU time in the backend (as clang sees it; i.e. mid-level
> optimizer and codegen). This data is inconsistent with LTO simply being
> another run through the backend (which would be just +33% CPU time at
> worst). There seems to be something nonlinear happening.
> To make it worse, the LTO build has approximately a full Release
> optimization running per-TU, so the actual LTO step should be seeing
> inlined/"cleaned up" IR which should be much smaller than what the per-TU
> optimizer is seeing, so naively it should take *even less* than "another
> 33% CPU time" chunk.
> Yet we see 1.5x-2x difference:
>
> [image: Inline image 4]
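
The "+33% at worst" argument above can be put in numbers with a small
Python sketch. It assumes the 33% backend share from earlier in the thread
and takes the 1.5x and 2x factors as the rough LTO-vs-Release figures
mentioned above; the actual LTO timings are not in this message, and the
name `naive_lto_bound` is hypothetical.

```python
BACKEND_SHARE_RELEASE = 0.33  # share of Release CPU time spent in LLVM, per the thread

def naive_lto_bound(release_cpu_seconds: float) -> float:
    """Upper bound on LTO CPU time if LTO were just one extra backend run."""
    return release_cpu_seconds * (1 + BACKEND_SHARE_RELEASE)

release_cpu = 12714.288  # trunk Release user time in seconds, from the thread
bound = naive_lto_bound(release_cpu)

# Rough observed LTO/Release factors mentioned above (assumed, not measured here).
for observed_factor in (1.5, 2.0):
    observed = release_cpu * observed_factor
    print(f"observed {observed_factor:.1f}x Release is "
          f"{observed / bound:.2f}x the naive bound")
```

Any result above 1.00 means the observed LTO time exceeds what a single
additional backend pass would explain under this model.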
>
> -- Sean Silva
>
>
>>
>> --
>> Mehdi
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2016-03-08 at 5.45.54 PM.png
Type: image/png
Size: 39766 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160309/36bfa9e8/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2016-03-08 at 5.29.21 PM.png
Type: image/png
Size: 36008 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160309/36bfa9e8/attachment-0003.png>