[cfe-dev] Massive performance regression in clang 3.9

Richard Smith via cfe-dev cfe-dev at lists.llvm.org
Wed Oct 19 15:39:30 PDT 2016


On Wed, Oct 19, 2016 at 9:08 AM, Richard Smith <richard at metafoo.co.uk>
wrote:

> On 19 Oct 2016 2:44 am, "Andy Gibbs" <andyg1001 at hotmail.co.uk> wrote:
>
> Hi,
>
> I have observed a massive performance regression when upgrading from clang
> 3.7 to 3.9.  The current tip-of-trunk appears to have the same regression.
>
> I have managed to isolate the patch at which the regression appeared as
> r248431 on 23rd Sept 2015.
>
> I have attached a simple test-case that demonstrates the problem rather
> effectively.
>
> Prior to r248431, this code would compile in under 20 seconds, consuming
> approx. 400 MB of memory.  GCC, for comparison, takes longer at around 50
> seconds on my machine but consumes only ~250 MB.
>
> Following patch r248431 and all the way up to tip-of-trunk, clang is
> terminated by the OOM handler because its memory usage exceeds 7
> gigabytes!
>
> This seems a little unreasonable -- the test-code in question is a little
> pathological maybe but is designed to exploit template memoisation, so the
> excessive memory usage would imply to me that this memoisation is no longer
> working.
>
>
> That's not the problem. The issue is that this:
>
> (sizeof...(Is) + Is)...
>
> substitutes into the sizeof... subexpression once for each element of Is,
> which massively amplifies any change in the behaviour of sizeof...
>
> r248431 changes our behaviour such that sizeof... now scans its
> parameter's expansion looking for unexpanded packs (which can appear if
> it's used inside an alias template), making the instantiation of that
> fragment quadratic in the size of the pack Is.
>
> I'll take a look and see if we can handle this a bit better.
>
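
For readers without the attachment, the pattern under discussion can be sketched as follows. This is a minimal illustration, not the original test case: VariadicIndices is assumed to be a plain index-list template, and DoubleWithSizeof is a hypothetical name chosen here. The sizeof...(Is) operand sits inside the expanded pattern, so it is substituted once per element of Is.

```cpp
#include <type_traits>

// Assumed definition: a plain compile-time list of indices.
template <unsigned... Is>
struct VariadicIndices {};

// Hypothetical helper: doubles an index list using sizeof... inside the
// pack expansion, which is the construct discussed above.
template <typename List>
struct DoubleWithSizeof;

template <unsigned... Is>
struct DoubleWithSizeof<VariadicIndices<Is...>> {
  // The pattern (sizeof...(Is) + Is) is instantiated once per element of Is.
  using Type = VariadicIndices<Is..., (sizeof...(Is) + Is)...>;
};

static_assert(
    std::is_same<DoubleWithSizeof<VariadicIndices<0, 1, 2>>::Type,
                 VariadicIndices<0, 1, 2, 3, 4, 5>>::value,
    "doubling <0,1,2> should yield <0,...,5>");
```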

As of r284653, the performance and memory usage for simple sizeof...
expansions should be a lot better. But your approach is still O(N^2 log N)
when compiled with Clang. You can remove a factor of N by passing around
the size of the pack instead of recomputing it:

template <unsigned Size, typename List, bool Odd>
struct PartialVariadicIndicesSequence;

template <unsigned Size, unsigned... Is>
struct PartialVariadicIndicesSequence<Size, VariadicIndices<Is...>, false> {
  using Type = VariadicIndices<Is..., Size + Is ...>;
};

template <unsigned Size, unsigned... Is>
struct PartialVariadicIndicesSequence<Size, VariadicIndices<Is...>, true> {
  using Type = VariadicIndices<Is..., Size + Is ..., Size * 2>;
};

template <unsigned N>
struct MakeVariadicIndicesImpl : PartialVariadicIndicesSequence<
    N / 2,
    typename MakeVariadicIndicesImpl<N / 2>::Type,
    ((N % 2) != 0)
  > { };
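
The fragment above relies on pieces from the attached test case that are not shown here. A self-contained sketch, assuming VariadicIndices is a plain index-list template and assuming a base case for N == 0 (both are guesses at what the attachment contains), could look like this:

```cpp
#include <type_traits>

// Assumed definition: a plain compile-time list of indices.
template <unsigned... Is>
struct VariadicIndices {};

template <unsigned Size, typename List, bool Odd>
struct PartialVariadicIndicesSequence;

// Even case: append a copy of the list shifted by Size.
template <unsigned Size, unsigned... Is>
struct PartialVariadicIndicesSequence<Size, VariadicIndices<Is...>, false> {
  using Type = VariadicIndices<Is..., (Size + Is)...>;
};

// Odd case: as above, plus one trailing index.
template <unsigned Size, unsigned... Is>
struct PartialVariadicIndicesSequence<Size, VariadicIndices<Is...>, true> {
  using Type = VariadicIndices<Is..., (Size + Is)..., Size * 2>;
};

// Size is passed down as N / 2, so no sizeof... appears in the expansions.
template <unsigned N>
struct MakeVariadicIndicesImpl
    : PartialVariadicIndicesSequence<
          N / 2, typename MakeVariadicIndicesImpl<N / 2>::Type,
          ((N % 2) != 0)> {};

// Assumed base case that terminates the recursion.
template <>
struct MakeVariadicIndicesImpl<0> {
  using Type = VariadicIndices<>;
};

static_assert(std::is_same<MakeVariadicIndicesImpl<5>::Type,
                           VariadicIndices<0, 1, 2, 3, 4>>::value,
              "N = 5 should yield indices 0 through 4");
```

Each level of the recursion halves N, so building a list of N indices instantiates roughly log2(N) levels of MakeVariadicIndicesImpl.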

> Is anyone else able to confirm my findings?  If so, I will raise this as a
> bug report.  Richard, I have copied you as the author of the patch in
> question (hope you don't mind!).
>
> Cheers,
> Andy
>
>
>