[cfe-commits] [Patch 4 of 5] Proposed fix for PR12850

Richard Smith richard at metafoo.co.uk
Fri Aug 24 18:51:16 PDT 2012


Hi Andy,

Editorializing somewhat, I think it's unfortunate that people expect
constexpr evaluation to be asymptotically more efficient than the runtime
evaluation of the same functions, and I'm a little concerned that this
caching will lead people to write code which has surprising poor
performance in the cases where a constexpr function gets invoked
dynamically. But... g++ already implements a similar optimization, so we'll
probably need this to handle code written for it.

Have you measured the impact of this cache on "well-written" constexpr
functions (those not requiring memoization for efficiency)? What's the
memory usage and performance impact on,
say, test/SemaCXX/constexpr-turing.cpp, with a more complex Turing machine?
Can you provide some figures for typical and worst-case slowdowns?

There is a nasty corner case where the value of a global changes between
successive calls to a constexpr function (this happens if a variable is
read after it's zero-initialized but before it's dynamically initialized,
from a constexpr function called within its own initializer). Your patch
doesn't seem to handle that (though perhaps I just missed it).

Have you considered using llvm::FoldingSet for the cache? That would be
particularly useful if we ever want to allow arbitrary literal types as
non-type template arguments, or other such cases which might require
reducing an APValue to a FoldingSetNodeID.

There seems to be a fair amount of APValue copying going on in this change.
For large APValues, that's not cheap. Perhaps you could allocate the cache
entry up-front, and evaluate the function arguments and return value
directly into its APValue slots, or swap them in?

Does this have an impact on the constexpr recursion limit? For instance, if
f(500) recurses to depth 500, and g(500) recurses to depth 500 then calls
f(500), will the result of calling g(500) depend on whether I've called
f(500) already in this translation unit? That seems like a highly
unfortunate property, but it's consistent with how template instantiations
behave at least.

Thanks for looking into this!
Richard

On Wed, Aug 22, 2012 at 9:29 AM, Andy Gibbs <andyg1001 at hotmail.co.uk> wrote:

> Hi,
>
> Attached here is the fourth of five patches for PR12850.
>
> This patch makes some alterations to VerifyDiagnosticConsumer (it seems I
> cannot leave it alone!) which are there to enable test-cases to check on
> the performance of the cache.  An additional -verify directive, namely
>
>   verify-fncache-stats
>
> is supported which generates a diagnostic note to the user or test-case
> which details the number of cache hits, the number of cached entries and
> the memory consumption (approximate).  This can be matched then with an
> appropriate "expected-note" in a test-case.
>
> I don't know whether people think this is a good way to achieve this,
> but I was looking for a way to particularly have the hits/count as part
> of test-cases to ensure no unintended performance regressions, and this
> seemed the best approach.
>
> I would welcome alternatives if this is not the way to do it.
>
> Cheers
> Andy
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120824/ef679193/attachment.html>


More information about the cfe-commits mailing list