[cfe-dev] Decls are not synonyms for the symbols they represent
Ted Kremenek
kremenek at apple.com
Tue Sep 16 14:41:19 PDT 2008
A few weeks ago I had a conversation with Daniel about the fact that
the ASTs (or other clang data structures) have no notion of the
"entity" (for lack of a better word) that a declaration represents.
Here are a couple of examples of what I mean:
(example 1)
extern double x;
extern double x;
Both of these are variable declarations that refer to the same
variable. There is no notion of the variable itself other than the
declarations, so the two concepts are conflated. This is particularly
awkward when there are multiple declarations, as in this case (i.e.,
there is no unique "entity" for the variable).
(incidentally, clang crashes on this input: http://llvm.org/bugs/show_bug.cgi?id=2760)
(example 2)
struct s;
struct s { int a; };
struct s;
Until a few weeks ago, these struct declarations were represented by a
single RecordDecl with a unique RecordType. Now they are represented
by three separate RecordDecls with a shared, unique RecordType.
With structures, the unique RecordType indeed can be treated as
representing the "struct" itself, which seems fine since the given
declarations are just type declarations. So in this case, we *do*
have a unique "entity" in the ASTs to represent what the declarations
refer to. There are still some issues with this representation, but I
will delay mentioning them until after the next example.
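To make the struct case concrete, here is a minimal sketch in plain C
of several declarations sharing one type object. It is a toy model
only: ToyRecordType, ToyRecordDecl, toy_same_record and
toy_find_definition are hypothetical names invented for illustration,
not clang's actual classes or API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the unique type: one object per struct. */
struct ToyRecordType {
    const char *tag;
};

/* Toy stand-in for a declaration: one object per textual declaration. */
struct ToyRecordDecl {
    const struct ToyRecordType *type; /* shared by all declarations of the struct */
    int is_definition;                /* nonzero for "struct s { ... };" */
};

/* Two declarations denote the same struct iff they share the type object. */
int toy_same_record(const struct ToyRecordDecl *a,
                    const struct ToyRecordDecl *b) {
    assert(a != NULL && b != NULL);
    return a->type == b->type;
}

/* Return the defining declaration for 'type' among 'decls', or NULL. */
const struct ToyRecordDecl *
toy_find_definition(const struct ToyRecordDecl *decls, size_t n,
                    const struct ToyRecordType *type) {
    for (size_t i = 0; i < n; ++i) {
        if (decls[i].type == type && decls[i].is_definition)
            return &decls[i];
    }
    return NULL;
}
```

In this model the type pointer plays the role of the "entity": a
client asks questions of the shared type rather than of any single
declaration.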
(example 3)
int f();
int f();
int f() { return 0; }
int g();
int g() { return 1; }
For this example, we have a separate FunctionDecl for each one of
these declarations. Here, all of the declarations of both 'f' and
'g' share the same type (note that this is different from the case
with structs). For the case of 'f', all of its FunctionDecls are
chained together, and the same goes for 'g'. There is, however, no
notion of an entity or concept in the ASTs or other clang data
structures that represents 'f' itself.
Here is an example of why not having an explicit concept for 'f', 'g',
or any symbol is problematic.
Consider:
extern int h(int* x) __attribute__((nonnull));
extern int h(int *x);
extern int h(int* x) __attribute__((noreturn));
This code is completely valid. In the ASTs we create three
FunctionDecls, the first having the attribute "nonnull" attached to it
(an object of type NonNullAttr) and the third having the attribute
"noreturn" attached to it (an object of type NoReturnAttr).
Suppose I had a client (e.g., code generation, static analysis) that
wanted to know all the attributes attached to a given function. How
would I go about doing this? Given one of these FunctionDecls, I
would have to iterate the chain of FunctionDecls and query each one
for its attributes. This seems a little cumbersome, and forces each
client to implement its own logic for querying information about
"symbols" in a translation unit. It also forces clients to think
about internal representations, such as the fact that FunctionDecls
are chained, something we may wish to change at any moment in the
future.
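As a rough illustration of the chain walk described above, here is a
minimal sketch in plain C. It is a toy model, not clang code:
ToyFunctionDecl, the TOY_ATTR_* flags and toy_merged_attrs are
hypothetical names, and the sketch assumes each declaration links only
to the previous declaration of the same symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute flags; not clang's real Attr classes. */
enum ToyAttr {
    TOY_ATTR_NONNULL  = 1 << 0,
    TOY_ATTR_NORETURN = 1 << 1
};

/* One node per textual declaration of a function. */
struct ToyFunctionDecl {
    const char *name;
    unsigned attrs;                     /* attributes written on this declaration only */
    const struct ToyFunctionDecl *prev; /* previous declaration of the symbol, or NULL */
};

/* Union of the attributes on 'd' and on every earlier declaration
 * reachable through the 'prev' links. */
unsigned toy_merged_attrs(const struct ToyFunctionDecl *d) {
    assert(d != NULL);
    unsigned merged = 0;
    for (; d != NULL; d = d->prev)
        merged |= d->attrs;
    return merged;
}
```

Under these assumptions the answer depends on which declaration the
client happens to hold: starting from the third declaration of 'h'
yields both attributes, while starting from the first yields only
"nonnull". A per-symbol entity would give every client one place to
ask.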
This email isn't really a proposal of a solution; I'm just raising an
issue to see if anyone has any comments. Over the last few weeks
I've been excited by our discussions of DeclGroups and TypeSpecifiers,
which will solve many of the remaining issues with faithfully
representing syntax in the ASTs. At the same time, I think we need to
pay a little more attention to the semantics, and to providing
infrastructure that would be useful for many clients.
Indeed, some of our changes to improve our capturing of syntax have
actually weakened some of our clients' reasoning about semantics. For
example, by splitting separate struct declarations into multiple
RecordDecls we actually (initially) broke CodeGen because the CodeGen
library assumed that there was a direct 1-1 mapping between a
RecordDecl and the concept it represented. That particular case was
easily resolved by using the RecordType instead of the RecordDecl to
represent the 'struct', but I'd be willing to wager that there are
other issues that haven't surfaced yet because RecordTypes are being
used in this way (by all clients).
Thoughts?