[cfe-dev] Decls are not synonyms for the symbols they represent

steve naroff snaroff at apple.com
Wed Sep 17 10:07:27 PDT 2008


On Sep 17, 2008, at 12:50 PM, Argiris Kirtzidis wrote:

> steve naroff wrote:
>>>
>>> (example 3)
>>>
>>> int f();
>>> int f();
>>> int f() { return 0; }
>>>
>>> int g();
>>> int g() { return 1; }
>>>
>>> For this example, we have separate FunctionDecls for each one of  
>>> these declarations.  In this example, all of the declarations of  
>>> both 'f' and 'g' share the same type (note that this is different  
>>> from the case with structs).  For the case of 'f', all of its  
>>> FunctionDecls are chained together, and the same goes for 'g'.   
>>> There is, however, no notion of an entity or concept in the ASTs  
>>> or other clang data structures that represents 'f' itself.
>>>
>>
>> I don't really understand what you mean by "f itself". In the  
>> example above, we have two identical function declarations and one  
>> function (for "f"). This accurately reflects the source code (which  
>> is our goal). I could imagine higher level convenience functions  
>> that might be useful for some clients, however I think the AST is  
>> fundamentally correct in this instance.
>
> This is what I think too. It's better if the AST is as simple as  
> possible; higher-level semantic information could be built on top  
> of the AST instead of being embedded in it.
> For example, there could be a framework like the analysis framework,  
> that gets "fed" ASTs from multiple translation units and can reason  
> about declarations as a whole.
> A client could query it to determine that the "f" function is  
> declared 'here' in this file and defined 'there' in that file.
>

Another simple/relevant example is a browser index/database (for an  
IDE, say)...where the notion of "project scope" comes into play. We  
need to resist the natural tendency to fold all associations/data into  
the AST. I'm not suggesting that's what Ted was implying...just  
re-making a point.
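
A rough sketch of what such a layer on top of the ASTs might look like, assuming a client feeds it one record per declaration as each translation unit is processed. Every name here (DeclRecord, SymbolIndex, definitionOf, ...) is hypothetical and not part of clang; the point is only that "declared here, defined there" can be answered from a side structure without adding anything to the AST nodes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one declaration as seen in one translation unit.
struct DeclRecord {
  std::string file;
  unsigned line;
  bool isDefinition;
};

// Hypothetical project-wide index, kept outside any single AST.
class SymbolIndex {
public:
  // Record one declaration of the named symbol.
  void add(const std::string &symbol, const DeclRecord &record) {
    assert(!symbol.empty() && "symbols are keyed by a non-empty name");
    records_[symbol].push_back(record);
  }

  // All recorded declarations of a symbol, in the order they were added.
  std::vector<DeclRecord> declarationsOf(const std::string &symbol) const {
    auto it = records_.find(symbol);
    return it == records_.end() ? std::vector<DeclRecord>() : it->second;
  }

  // The first recorded definition, or nullptr if none was recorded.
  const DeclRecord *definitionOf(const std::string &symbol) const {
    auto it = records_.find(symbol);
    if (it == records_.end())
      return nullptr;
    for (const DeclRecord &r : it->second)
      if (r.isDefinition)
        return &r;
    return nullptr;
  }

private:
  std::map<std::string, std::vector<DeclRecord>> records_;
};
```

Keying by a plain name is a simplification; a real index would need something that accounts for scope and linkage.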

snaroff

> -Argiris
>
>>
>>> Here is an example of why not having an explicit concept for 'f',  
>>> 'g', or any symbol is problematic.
>>>
>>> Consider:
>>>
>>>  extern int h(int* x) __attribute__((nonnull));
>>>  extern int h(int *x);
>>>  extern int h(int* x) __attribute__((noreturn));
>>>
>>> This code is completely valid.  In the ASTs we create three  
>>> FunctionDecls, the first having the attribute "nonnull" attached  
>>> to it (an object of type NonNullAttr) and the third having the  
>>> attribute "noreturn" attached to it (an object of type  
>>> NoReturnAttr).
>>>
>>> Suppose I had a client (e.g., code generation, static analysis)  
>>> that wanted to know all the attributes attached to a given  
>>> function.  How would I go about doing this?  Given one of these  
>>> FunctionDecls, I would have to iterate the chain of FunctionDecls  
>>> and query each one of its attributes.  This seems a little  
>>> cumbersome, and causes separate clients to have to implement their  
>>> own logic for querying information about "symbols" in a  
>>> translation unit.  It also causes clients to think about internal  
>>> representations such as the fact that FunctionDecls are chained,  
>>> something we may wish to change at any moment in the future.
>>>
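
The chain walk described above can be sketched with a toy model. ToyFunctionDecl and collectAttrs are invented for illustration and are not clang's classes; the sketch assumes each declaration points only at the one before it, so the caller has to start from the most recent declaration to see everything.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a FunctionDecl: one node per declaration in the
// source, linked to the declaration of the same function that preceded it.
struct ToyFunctionDecl {
  std::string name;
  std::vector<std::string> attrs;   // attributes written on this declaration
  const ToyFunctionDecl *previous;  // earlier declaration, or nullptr
};

// Walk the chain from the given declaration back to the first one and
// collect every attribute seen along the way.
std::set<std::string> collectAttrs(const ToyFunctionDecl *decl) {
  assert(decl && "expected a declaration");
  std::set<std::string> result;
  for (const ToyFunctionDecl *d = decl; d != nullptr; d = d->previous)
    result.insert(d->attrs.begin(), d->attrs.end());
  return result;
}
```

Calling collectAttrs on the third declaration of 'h' in this model yields both "nonnull" and "noreturn"; calling it on the first yields only "nonnull", which is the kind of representation detail every client would have to know about.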
>>
>> As far as the ASTs go, I really don't see the hardship here. The  
>> fact that the FunctionDecls are chained accurately reflects the  
>> source code...doesn't it? For me, the problem with the chain is  
>> memory efficiency (more than convenience). In C, it is fairly  
>> uncommon to have more than one function decl for the same name (yet  
>> every FunctionDecl has a chain!). Nevertheless, we already have  
>> some bloat in FunctionDecl...every prototype has a Body slot  
>> (ouch). Clearly room for improvement here.
>>
>> A related issue which I consider more problematic is the lack of  
>> *any* "chain" for VarDecls. Consider the following code:
>>
>> int i4;
>> int i4;
>> extern int i4;
>>
>> const int a [1] = {1};
>> extern const int a[];
>>
>> extern const int b[];
>> const int b [1] = {1};
>>
>>
>> At the moment, there is no way to get to the previous declaration!  
>> Since I've already whined about the memory inefficiency for  
>> FunctionDecl's, I certainly wouldn't recommend adding a chain for  
>> all VarDecls!
>>
>> I remember writing Sema::CheckForFileScopedRedefinitions() where I  
>> had to deal with this. Fortunately, Sema's "IdResolver" came to the  
>> rescue (thanks Argiris:-). That said, my gut says it might be worth  
>> using an IdResolver-like mechanism to solve this "navigation  
>> problem" for *both* VarDecls and FunctionDecls. Architecturally, it  
>> would make sense for this new API to be part of ASTContext.
>>
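
One possible shape for such a lookup, as a sketch only: a side table keyed by name that remembers declarations in source order, so the declaration nodes themselves carry no chain pointer. ToyVarDecl and DeclTable are hypothetical names; this is not how IdResolver or ASTContext is implemented, and whether it is cheaper overall depends on how the table is populated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a VarDecl; note there is no "previous" field.
struct ToyVarDecl {
  std::string name;
  bool isExtern;
};

// Hypothetical side table mapping a name to its declarations in source order.
class DeclTable {
public:
  // Record a declaration and return the one that preceded it for the same
  // name, or nullptr if this is the first.
  const ToyVarDecl *record(const ToyVarDecl *decl) {
    assert(decl && "expected a declaration");
    std::vector<const ToyVarDecl *> &decls = byName_[decl->name];
    const ToyVarDecl *previous = decls.empty() ? nullptr : decls.back();
    decls.push_back(decl);
    return previous;
  }

  // Look up the declaration recorded immediately before the given one.
  const ToyVarDecl *previousOf(const ToyVarDecl *decl) const {
    assert(decl && "expected a declaration");
    auto it = byName_.find(decl->name);
    if (it == byName_.end())
      return nullptr;
    const std::vector<const ToyVarDecl *> &decls = it->second;
    for (std::size_t i = 1; i < decls.size(); ++i)
      if (decls[i] == decl)
        return decls[i - 1];
    return nullptr;
  }

private:
  std::unordered_map<std::string, std::vector<const ToyVarDecl *>> byName_;
};
```

The same table would work unchanged for function declarations, which is the appeal of solving the navigation problem once for both.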
>> Thoughts?
>>
>> snaroff
>>
>>> This email isn't really a proposal of a solution; I'm just raising  
>>> an issue to see if anyone has any comments.  After the last few  
>>> weeks I've been excited by our discussions of DeclGroups and  
>>> TypeSpecifiers that will solve many of the remaining issues with  
>>> faithfully representing syntax in the ASTs.  At the same time, I  
>>> think we need to pay a little more attention to the semantics, and  
>>> providing infrastructure that would be useful for many clients.
>>>
>>> Indeed, some of our changes to improve our capturing of syntax  
>>> have actually weakened some of our clients' reasoning about  
>>> semantics.  For example, by splitting separate struct declarations  
>>> into multiple RecordDecls we actually (initially) broke CodeGen  
>>> because the CodeGen library assumed that there was a direct 1-1  
>>> mapping between a RecordDecl and the concept it represented.  That  
>>> particular case was easily resolved by using the RecordType  
>>> instead of the RecordDecl to represent the 'struct', but I'd be  
>>> willing to wager that there are other issues that haven't surfaced  
>>> yet because RecordTypes are being used in this way (by all clients).
>>>
>>> Thoughts?
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>



