[cfe-dev] Decls are not synonyms for the symbols they represent

Ted Kremenek kremenek at apple.com
Tue Sep 16 14:41:19 PDT 2008


A few weeks ago I had a conversation with Daniel about the fact that  
the ASTs (or other clang data structures) have no notion of the  
"entity" (for lack of a better word) that a declaration represents.

Here are a couple examples of what I mean:

(example 1)

   extern double x;
   extern double x;

Both of these are variable declarations that refer to the same
variable.  There is no notion of the variable itself apart from its
declarations, so the two concepts are conflated.  This is particularly
awkward here, since we have multiple declarations (i.e., there is no
unique "entity" for the variable).

(incidentally, clang crashes on this input: http://llvm.org/bugs/show_bug.cgi?id=2760)


(example 2)

   struct s;
   struct s { int a; };
   struct s;

Until a few weeks ago, these struct declarations were represented by a  
single RecordDecl with a unique RecordType.  Now they are represented  
by three separate RecordDecls with a shared, unique RecordType.

With structures, the unique RecordType indeed can be treated as  
representing the "struct" itself, which seems fine since the given  
declarations are just type declarations.  So in this case, we *do*  
have a unique "entity" in the ASTs to represent what the declarations  
refer to.  There are still some issues with this representation, but I  
will delay mentioning them until after the next example.
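
To make the struct case concrete, here is a minimal sketch in C++.  The
names ToyRecordType, ToyRecordDecl and sameEntity are made up for
illustration and are not clang's actual classes; the only point is that
several declarations can point at one shared type object, which then
serves as the "entity".

```cpp
#include <string>

// Hypothetical stand-in for RecordType: one object per struct.
struct ToyRecordType {
    std::string tag;
};

// Hypothetical stand-in for RecordDecl: one object per declaration,
// each pointing at the shared type.
struct ToyRecordDecl {
    const ToyRecordType* type;
    bool hasBody;  // true for "struct s { ... };", false for "struct s;"
};

// Two declarations name the same struct exactly when they share a type object.
bool sameEntity(const ToyRecordDecl& a, const ToyRecordDecl& b) {
    return a.type == b.type;
}
```

With example 2 modeled this way, all three declarations compare equal
under sameEntity even though they are distinct declaration objects.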


(example 3)

   int f();
   int f();
   int f() { return 0; }

   int g();
   int g() { return 1; }

For this example, we have a separate FunctionDecl for each one of these
declarations.  All of the declarations of both 'f' and 'g' share the
same type (note that this is different from the case with structs).
For the case of 'f', all of its FunctionDecls are chained together,
and the same goes for 'g'.  There is, however, no notion of an entity
or concept in the ASTs or other clang data structures that represents
'f' itself.

Here is an example of why not having an explicit concept for 'f', 'g',  
or any symbol is problematic.

Consider:

   extern int h(int* x) __attribute__((nonnull));
   extern int h(int *x);
   extern int h(int* x) __attribute__((noreturn));

This code is completely valid.  In the ASTs we create three
FunctionDecls, the first having the attribute "nonnull" attached to it
(an object of type NonNullAttr) and the third having the attribute
"noreturn" attached to it (an object of type NoReturnAttr).

Suppose I had a client (e.g., code generation, static analysis) that
wanted to know all the attributes attached to a given function.  How
would I go about doing this?  Given one of these FunctionDecls, I
would have to iterate over the chain of FunctionDecls and query each
one for its attributes.  This seems a little cumbersome, and forces
each client to implement its own logic for querying information about
"symbols" in a translation unit.  It also forces clients to think
about internal representations, such as the fact that FunctionDecls
are chained, something we may wish to change at any moment in the
future.
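
The chain walk each client would need looks roughly like the sketch
below.  ToyFunctionDecl and collectAttrs are hypothetical, attributes
are modeled as plain strings, and the chain is assumed to be doubly
linked so that the walk works from any declaration; none of this is
clang's real API.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a FunctionDecl; not the real clang class.
struct ToyFunctionDecl {
    std::vector<std::string> attrs;    // attributes written on this declaration
    const ToyFunctionDecl* previous;   // earlier declaration, or nullptr
    const ToyFunctionDecl* next;       // later declaration, or nullptr
};

// Gather the attributes found on every declaration in the chain,
// starting from any one of them.
std::set<std::string> collectAttrs(const ToyFunctionDecl* decl) {
    std::set<std::string> all;
    if (decl == nullptr)
        return all;
    // Move forward to the most recent declaration...
    const ToyFunctionDecl* d = decl;
    while (d->next != nullptr)
        d = d->next;
    // ...then walk back to the first, merging attributes as we go.
    for (; d != nullptr; d = d->previous)
        all.insert(d->attrs.begin(), d->attrs.end());
    return all;
}
```

For the three declarations of 'h' above, this yields the set
{nonnull, noreturn} no matter which declaration the client starts
from.  Note that the code only works because it knows how the
declarations are linked, which is exactly the kind of detail clients
should not have to depend on.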

This email isn't really a proposal of a solution; I'm just raising an
issue to see if anyone has any comments.  Over the last few weeks
I've been excited by our discussions of DeclGroups and TypeSpecifiers,
which will solve many of the remaining issues with faithfully
representing syntax in the ASTs.  At the same time, I think we need to
pay a little more attention to the semantics, and to providing
infrastructure that would be useful for many clients.

Indeed, some of our changes to improve our capturing of syntax have
actually weakened some of our clients' reasoning about semantics.  For
example, by splitting separate struct declarations into multiple
RecordDecls we actually (initially) broke CodeGen because the CodeGen
library assumed that there was a direct 1-1 mapping between a
RecordDecl and the concept it represented.  That particular case was
easily resolved by using the RecordType instead of the RecordDecl to
represent the 'struct', but I'd be willing to wager that there are
other issues that haven't surfaced yet because RecordTypes are being
used in this way (by all clients).
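
As an illustration of that kind of fix (a sketch only, not the actual
CodeGen code), a client can key its per-struct state on the shared
type rather than on any one declaration.  RecTy, RecDecl and
LayoutCache are hypothetical names.

```cpp
#include <cstddef>
#include <unordered_map>

struct RecTy {};                        // hypothetical: one per struct
struct RecDecl { const RecTy* type; };  // hypothetical: one per declaration

// Hypothetical per-struct cache keyed on the type, so that every
// declaration of the same struct maps to the same entry.
class LayoutCache {
public:
    // Returns the id for the struct that `decl` declares, assigning a
    // fresh id the first time that struct is seen.
    std::size_t idFor(const RecDecl& decl) {
        auto it = ids_.find(decl.type);
        if (it != ids_.end())
            return it->second;
        std::size_t id = ids_.size();
        ids_.emplace(decl.type, id);
        return id;
    }

    std::size_t size() const { return ids_.size(); }

private:
    std::unordered_map<const RecTy*, std::size_t> ids_;
};
```

Had the map been keyed on the declaration instead, the three
declarations from example 2 would have produced three entries for a
single struct.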

Thoughts?