[cfe-dev] Limits on AST depth and stack usage?

Sam McCall via cfe-dev cfe-dev at lists.llvm.org
Tue Sep 1 03:12:25 PDT 2020


Thanks very much both, the data recursion was the biggest thing I was
missing, but all the extra details are very helpful.
(Richard: I might try paraphrase the "3 techniques" section of your email
into some documentation if that's OK)

It sounds like the low-hanging fruit here is:
A) make BracketDepth limit foldexprs (to 256 by default)
B) have ASTMatchFinder use explicit iteration rather than recursion when
walking back up the tree (similar to data recursion)
C) heap-allocate TemplateArgumentLocInfo::Template (introducing a
discriminator enum in the low bits)

Haojian and I will take these on, I think we understand well enough what
needs to be done.

Any one of these would fix our clang-tidy problems. Side benefits: A should
make tools safer by limiting ways to create huge ASTs, and C will mean
DynTypedNote is 40 bytes rather than 56 now and 32 before, reduce overall
memory usage, and let us add a few member-of-union asserts while there.

> As for limiting the size of packs in general: I think it might make sense
to impose a limit -- not to avoid excessive recursion; we should have other
mechanisms to deal with that -- but to avoid huge memory usage during
compilation due to forming overly-large packs.

This makes sense, I might not pick this up though. I don't have much idea
sense of the programs likely to be affected (large packs that are neither
recursively expanded nor unfolded...)

On Tue, Sep 1, 2020 at 5:46 AM Richard Smith <richard at metafoo.co.uk> wrote:

> Since you asked about more general issues:
>
> In general terms, it would be nice if we could accept arbitrarily-deep
> ASTs. We make some effort to do so, with three major efforts (in addition
> to expecting 8MB of stack space in each thread that might run Clang):
>
> 1) Data recursion. RecursiveASTVisitor does this by default, and the
> constant evaluator does it when evaluating integer-valued expressions. But
> this is limited: you only get it where you apply it, and RAV's automatic
> data recursion turns itself off when it thinks the difference might be
> observable in subclasses.
> 2) Minimizing stack space during recursion. For example, during "regular"
> function template instantiation at the end of the translation unit, we try
> to keep the size of the recursive frame as small as possible.
> 3) The "runWithSufficientStackSpace" mechanism that tries to check that
> "enough" stack space is available and pops out a diagnostic (and as a
> slow-but-correct fallback, allocates a new stack) if we're getting close to
> running out.
>
> Applying these techniques (largely in the above order) should be
> encouraged, and if we can make memoizedMatchesAncestorOfRecursively do so,
> that seems like a good directed approach for the local problem. But it's
> unlikely that we'll ever cover enough ground here to consistently prevent
> ever running out of stack in all cases -- not unless we fully move away
> from recursion as an implementation technique, which seems implausible, or
> add "runWithSufficientStackSpace" calls to every call graph SCC, which
> seems expensive. So this is currently done pragmatically: we fix stack
> usage issues as we see them become a problem. And in the remaining cases,
> we crash :-( Not the end of the world for a compiler, but not great for
> more or less any other user of Clang as a library.
>
> For standards-conformance, our backstop here are so-called "implementation
> limits" -- we're permitted to bail out if things get too complex, and we
> largely get to define what that means. C has some minimal requirements here
> -- for example, that we can support at least one program with 63 levels of
> nested parentheses; C++ has only suggestions and no hard requirements. But
> in either case, if our limits are violated, there are no constraints on our
> behavior; we're allowed to crash or whatever. So that puts us in
> quality-of-implementation land.
>
> Now, we do have some defined implementation limits (and there has been
> some work done towards improving and formalizing them:
> https://reviews.llvm.org/D72053). In the specific case of expanding a
> fold-expression, the relevant implementation limit is
> LangOptions::BracketDepth, which limits the total number of parentheses,
> braces, or square brackets that can be open at once, but that limit is
> currently only applied when parsing, not during template instantiation. So
> in this specific case, as a bare minimum, I think we're missing a check for
> that language option during fold-expression expansion (each level of a
> fold-expression introduces a set of parentheses, so even a pedantic
> interpretation of the rules has us covered reusing the same limit here).
>
> As for limiting the size of packs in general: I think it might make sense
> to impose a limit -- not to avoid excessive recursion; we should have other
> mechanisms to deal with that -- but to avoid huge memory usage during
> compilation due to forming overly-large packs. Eg, passing a very large
> integer to the __make_integer_seq builtin can easily cause Clang to fall
> into a hole from which it will never escape, assuming you don't run out of
> address space and just crash.
>
> Setting an implementation limit on the overall depth of the AST could be
> interesting, if there's a way we can practically do it. This seems like a
> hard problem -- there are just too many different places and ways in which
> AST nodes are created, many of them aren't equipped to cope with the
> operation failing in some kind of gracefully-recoverable way, and just
> keeping track of the AST depth without paying a code complexity and memory
> usage cost seems like a challenge -- and we'd need to be sure the cost of
> solving it is worth the benefit.
>
> On Mon, 31 Aug 2020 at 16:35, David Rector <davrecthreads at gmail.com>
> wrote:
>
>> RecursiveASTVisitor::TraverseStmt has a `DataRecursionQueue *Queue` in
>> its signature to prevent stack overflow by default (as Richard once
>> helpfully explained to me).  ASTMatchers uses the DataRecursionQueue, so
>> that searching for matches in Stmt children should never overflow the
>> stack, I believe (though it does confuse users who expect a more orderly
>> traversal…but that’s another matter).
>>
>> But `memoizedMatchesAncestorOfRecursively` does *not* use data recursion,
>> as you’ve noted.  So in other words it is using data recursion when
>> traversing descendants of deeply nested Exprs, but not when traversing the
>> ancestors of those at the bottom.
>>
>> So, I think it could just be a limited problem with a simple solution:
>> there should be a non-recursive, loop-based version of
>> memoizedMatchesAncestorOfRecursively specific to traversing the ancestors
>> of Stmts/Exprs, mirroring what RecursiveASTVisitor::TraverseStmt does.
>>
>> Whether there are more general issues with e.g. evaluating extremely deep
>> ASTs, is beyond my own depth, though...
>>
>> Dave
>>
>>
>> On Aug 31, 2020, at 11:33 AM, Sam McCall via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>> (doh, obviously that first "constexpr" shouldn't be there)
>>
>> On Mon, Aug 31, 2020 at 5:22 PM Sam McCall <sammccall at google.com> wrote:
>>
>>> This program creates an AST of depth 10000:
>>>
>>> constexpr
>>> template <class T, T...V> struct seq {
>>>   constexpr bool zero() { return (true && ... && (V == 0)); };
>>> };
>>> constexpr unsigned N = 10000;
>>> auto x = __make_integer_seq<seq, int, N>{};
>>> static_assert(!x.zero(), "");
>>>
>>> On my machine, clang-10 parses this successfully (slowly!) but crashes
>>> if I increase to 100000.
>>> I suppose -ftemplate-depth doesn't apply as there's no actual recursion
>>> here (just fold-expressions), but I feel like some implementation limit
>>> *should* apply: clang relies heavily on being able to traverse the AST
>>> recursively, and will inevitably crash if the tree is too tall.
>>>
>>> Should there be a limit here? It seems tempting to say one of:
>>>  -ftemplate-depth should constrain either the number of arguments a pack
>>> can represent
>>>  -ftemplate-depth should constrain the size of pack that can be unfolded
>>>  -some limit should constrain the overall AST height
>>>
>>> ---
>>>
>>> The immediate problem we're dealing with is that
>>> https://reviews.llvm.org/D85621 increased the size of DynTypedNode from
>>> 32->56 and we're seeing clang-tidy crashes due to stack overflow in the
>>> recursive traversal MatchASTVisitor::memoizedMatchesAncestorOfRecursively.
>>>
>>> However this is on a unit test very similar to the above (with N=2000)
>>> so before trying to squeeze stack usage I'd like to understand what we
>>> should be aiming to support.
>>>
>>> (Tangentially related: Richard, if you think there are good ways to
>>> squeeze TemplateArgumentLoc, that'd be interesting to know. Currently it's
>>> by-value and holds TemplateArgument and TemplateArgumentLocInfo which are
>>> 24 bytes each)
>>>
>> I think it should be possible to squeeze TemplateArgument down to 16
> bytes, but it'll be a little fiddly. All representations are already 16
> bytes (including the kind field) except for the Declaration representation,
> and that one is only two pointers plus a kind. We have too many kinds to
> fit into the bottom bits of a pointer (we have 9 kinds currently), but we
> can handle that: all the other representations have a full 4 bytes for the
> "kind" field, so we can use (for example) bottom bit == 0 means it's the
> Declaration kind, and bottom bit == 1 means the remaining 31 bits are the
> actual kind. (Or we can get down to 8 kinds by representing "null" as --
> for example -- a null type template argument, and then pack the kind in the
> first 3 bits.)
>
> The lower-hanging fruit is TemplateArgumentLocInfo, for which we could
> heap-allocate the information for the template name case and reduce the
> class to 8 bytes. Given that template template arguments are the rarest
> kind, that's very likely to be a net memory usage reduction.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200901/fb6d9f22/attachment.html>


More information about the cfe-dev mailing list