[cfe-dev] Clarification for term "AST"
pidgeot18 at gmail.com
Sat Feb 23 22:28:31 PST 2013
On 2/23/2013 11:56 AM, Larry Evans wrote:
> Now, where are those problems described and where is the rationale for
> choosing the clang data structures to solve those problems described?
> With that information, maybe Markus would have a better idea about why
> his suggestions wouldn't work or maybe steer him to an alternative
Let me preface by saying that most of my use of Clang is to do things
that are strictly orthogonal from actually compiling C++ code.
I can't say that I can think of a single time where I would ever want
the raw parsed AST with no semantic analysis done on it. The kinds of
queries you want from a C++ programs tend to start with things like
"find all variables" or "find all functions", yet faithfully following
the grammar (and not combining levels of the tree) would result in a lot
of boilerplate code to do unwrapping, especially when you consider the
contortions needed to represent something like "int (*foo(int))(double)"
as a proper tree. From my standpoint as a user, anytime I have to
duplicate Clang's reasoning of logic, it is a bug that Clang does not
give me an API to do that.
Theoretical purity isn't a particularly useful design goal, since true
purity is often (IMHO) rather pointless. C++11 defines 9 phases of
translations, of which the first 6 can be described as lexing, the 7th
"parse, semantic analysis, code generation, and optimization," the 8th
linking, and the final one the runtime loader. Yet there is no practical
reason to, for example, output the source program exactly as it is right
after folding lines that end in a backslash. Indeed, preserving the
ability to do this action would unnecessarily complicate later pipelines
(you would need to attach cumbersome metadata to source files prior to
actual tokenization, the first step of which happens in the subsequent
Another example I can give is that the grammar represents precedence by
explicitly doing the factoring itself, which means that you would need,
in order to achieve theoretical purity, to wrap every multiplicative
expression in a node which is an unary additive expression. Ignoring
this does fail to achieve theoretical purity, since you are representing
the AST in a manner not identical to the grammar; however, by
sacrificing purity, you are making everybody's lives very much easier.
News submodule owner
More information about the cfe-dev