[cfe-dev] Question about hashing AST nodes

Whisperity via cfe-dev cfe-dev at lists.llvm.org
Wed Jan 30 01:45:53 PST 2019


First, it depends on whether you want this storage to be persistent.
If you only need it during a single in-memory pass, the memory address
of the AST node should be a good enough key. To quote, "The abstract
syntax tree is not abstract, not syntax-only and not a tree": for every
usage of a variable, for example, you can (assuming of course that it
is found in the same TU, etc.) query the VarDecl (the variable's
declaration) that the DeclRefExpr (a use of that variable) points to.
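
To make the pointer-identity idea concrete, here is a minimal sketch.
The VarDecl and DeclRefExpr structs below are stand-ins I made up so
that the snippet compiles without the Clang headers; MetaData, MetaMap
and lookup are likewise hypothetical names. With the real classes you
would key on the pointer returned by DeclRefExpr::getDecl(), probably
after calling getCanonicalDecl() on it so that redeclarations map to
the same entry.

```cpp
#include <string>
#include <unordered_map>

// Stand-ins for clang::VarDecl and clang::DeclRefExpr so that this
// sketch compiles on its own. They are NOT the real Clang classes.
struct VarDecl {
  std::string Name;
};

struct DeclRefExpr {
  const VarDecl *Referenced;
  // Mirrors the role of the real DeclRefExpr::getDecl().
  const VarDecl *getDecl() const { return Referenced; }
};

// Hypothetical payload attached to each declaration.
struct MetaData {
  std::string Note;
};

// std::hash and operator== on pointers work on the address, which is
// node identity as long as the AST stays alive in memory.
using MetaMap = std::unordered_map<const VarDecl *, MetaData>;

// Find the meta-data recorded for the declaration that a use refers to.
// Returns nullptr if nothing was recorded for that declaration.
inline const MetaData *lookup(const MetaMap &Map, const DeclRefExpr &Use) {
  auto It = Map.find(Use.getDecl());
  return It == Map.end() ? nullptr : &It->second;
}
```

Note that two distinct declarations with the same name are different
keys here, which is usually what you want inside a single AST.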

I think ODRHash, like many of the other "node semi-equivalence"
checks, was invented only to support diagnostics, not to act as a true
key. One thing that comes to mind is Modules, where I think ODRHash is
used to diagnose certain things, but the last time I read about them
it was said that even ODRHash is lacking in places.

The Clang Static Analyzer has a hash utility for bug reports, used to
distinguish between bugs that share the same "file-line-col" triple.
But that may well be tailored to the "bug path" the SA emits.

For a persistent hash that can go into, say, a database, we came up
with our own in CodeCompass: basically an FNV hash over mangled names,
identifiers, locations, etc.
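
As a rough illustration of that kind of persistent key (this is not
the actual CodeCompass code), the sketch below runs 64-bit FNV-1a over
a mangled name plus a source location. The choice of the FNV-1a
variant, the NodeKey struct, its field names and the '\n' separator
are all my assumptions for the example.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a with the standard offset basis and prime.
inline std::uint64_t fnv1a(const std::string &Data) {
  std::uint64_t Hash = 0xcbf29ce484222325ULL;  // offset basis
  for (unsigned char C : Data) {
    Hash ^= C;                 // mix in one byte
    Hash *= 0x100000001b3ULL;  // multiply by the FNV prime
  }
  return Hash;
}

// Hypothetical description of a node that survives between runs:
// nothing in here depends on memory addresses.
struct NodeKey {
  std::string MangledName;
  std::string File;
  unsigned Line;
  unsigned Column;
};

// Serialise the fields with a separator so that adjacent fields cannot
// run into each other, then hash the resulting string.
inline std::uint64_t persistentHash(const NodeKey &K) {
  std::string Buf = K.MangledName;
  Buf += '\n';
  Buf += K.File;
  Buf += '\n';
  Buf += std::to_string(K.Line);
  Buf += ':';
  Buf += std::to_string(K.Column);
  return fnv1a(Buf);
}
```

The hash is only as stable as its inputs: if the file path is absolute
or the line numbers shift between builds, the key changes with them.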

If you only need keys for the current compilation / memory image,
you'll get no better answer than pointer identity and what Sema
creates for you in the AST.
If you want to store them, anyone can invent their own hash function.
The big question is how you will make sure that the hash stays stable
between invocations.

For ==, there are the "structural equivalence" checks, which are used
by Sema and ASTImporter / CrossTU.
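
If you do go for a structural == rather than pointer identity, the
hash has to agree with it: nodes that compare equal must hash equally.
The toy below shows only that contract for std::unordered_map. The
Node type and all function names are invented for the example and
have nothing to do with Clang's own structural equivalence code.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Toy tree node, a stand-in for a real AST node.
struct Node {
  std::string Kind;
  std::vector<Node> Children;
};

// Two nodes are equal if their kinds match and their children are
// pairwise equal, in order.
inline bool structurallyEqual(const Node &A, const Node &B) {
  if (A.Kind != B.Kind || A.Children.size() != B.Children.size())
    return false;
  for (std::size_t I = 0; I < A.Children.size(); ++I)
    if (!structurallyEqual(A.Children[I], B.Children[I]))
      return false;
  return true;
}

// Hash exactly the things the equality looks at, and nothing else, so
// that equal nodes always produce equal hashes.
inline std::size_t structuralHash(const Node &N) {
  std::size_t H = std::hash<std::string>{}(N.Kind);
  for (const Node &C : N.Children)
    H = H * 31 + structuralHash(C);
  return H;
}

struct NodeHash {
  std::size_t operator()(const Node *N) const { return structuralHash(*N); }
};

struct NodeEq {
  bool operator()(const Node *A, const Node *B) const {
    return structurallyEqual(*A, *B);
  }
};

// Keys are pointers, but lookups compare the pointed-to structure.
using StructuralMap = std::unordered_map<const Node *, int, NodeHash, NodeEq>;
```

With this map, looking up a different node object that has the same
shape finds the same entry, which is the opposite of the
pointer-identity behaviour.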

Kevin Sullivan via cfe-dev <cfe-dev at lists.llvm.org> wrote (on Wed,
Jan 30, 2019, 8:27):
>
> Dear Clang Community:
>
> A colleague and I need to use AST nodes as keys in a map. For example, having entered a few variable declaration nodes as keys, linking to associated meta-data, and now looking at a member function application node, with the variables as arguments, we'd like to be able to query the map to find the associated meta-data for those variable objects.
>
> To this end, we need to implement hash and == for AST nodes. That's what's needed, e.g., by the C++ unordered map class.
>
> There's little available information as far as we can tell on this topic. Would you be so kind as to let us know the best way to do this, in your view? Is ODRHash the key to unlocking a solution? Are there other approaches you'd recommend?
>
> Kevin Sullivan
> University of Virginia
