[cfe-dev] Limits on AST depth and stack usage?

Mon Aug 31 16:35:49 PDT 2020

RecursiveASTVisitor::TraverseStmt has a `DataRecursionQueue *Queue` in its signature to prevent stack overflow by default (as Richard once helpfully explained to me).  ASTMatchers uses the DataRecursionQueue, so that searching for matches in Stmt children should never overflow the stack, I believe (though it does confuse users who expect a more orderly traversal…but that’s another matter).

But `memoizedMatchesAncestorOfRecursively` does *not* use data recursion, as you’ve noted.  So in other words it is using data recursion when traversing descendants of deeply nested Exprs, but not when traversing the ancestors of those at the bottom.  

So, I think it could just be a limited problem with a simple solution: there should be a non-recursive, loop-based version of memoizedMatchesAncestorOfRecursively specific to traversing the ancestors of Stmts/Exprs, mirroring what RecursiveASTVisitor::TraverseStmt does.

Whether there are more general issues with e.g. evaluating extremely deep ASTs, is beyond my own depth, though...

Dave

> On Aug 31, 2020, at 11:33 AM, Sam McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> (doh, obviously that first "constexpr" shouldn't be there)
> 
> On Mon, Aug 31, 2020 at 5:22 PM Sam McCall <sammccall at google.com <mailto:sammccall at google.com>> wrote:
> This program creates an AST of depth 10000:
> 
> constexpr 
> template <class T, T...V> struct seq {
>   constexpr bool zero() { return (true && ... && (V == 0)); };
> };
> constexpr unsigned N = 10000;
> auto x = __make_integer_seq<seq, int, N>{};
> static_assert(!x.zero(), "");
> 
> On my machine, clang-10 parses this successfully (slowly!) but crashes if I increase to 100000.
> I suppose -ftemplate-depth doesn't apply as there's no actual recursion here (just fold-expressions), but I feel like some implementation limit *should* apply: clang relies heavily on being able to traverse the AST recursively, and will inevitably crash if the tree is too tall.
> 
> Should there be a limit here? It seems tempting to say one of:
>  -ftemplate-depth should constrain either the number of arguments a pack can represent
>  -ftemplate-depth should constrain the size of pack that can be unfolded
>  -some limit should constrain the overall AST height
> 
> ---
> 
> The immediate problem we're dealing with is that https://reviews.llvm.org/D85621 <https://reviews.llvm.org/D85621> increased the size of DynTypedNode from 32->56 and we're seeing clang-tidy crashes due to stack overflow in the recursive traversal MatchASTVisitor::memoizedMatchesAncestorOfRecursively.
> 
> However this is on a unit test very similar to the above (with N=2000) so before trying to squeeze stack usage I'd like to understand what we should be aiming to support.
> 
> (Tangentially related: Richard, if you think there are good ways to squeeze TemplateArgumentLoc, that'd be interesting to know. Currently it's by-value and holds TemplateArgument and TemplateArgumentLocInfo which are 24 bytes each)
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200831/ebfe48eb/attachment.html>