[cfe-dev] Decls are not synonyms for the symbols they represent

Argiris Kirtzidis akyrtzi at gmail.com
Wed Sep 17 09:50:16 PDT 2008


steve naroff wrote:
>>
>> (example 3)
>>
>> int f();
>> int f();
>> int f() { return 0; }
>>
>> int g();
>> int g() { return 1; }
>>
>> For this example, we have separate FunctionDecls for each one of 
>> these declarations.  In this example, all of the declarations of both 
>> 'f' and 'g' share the same type (note that this is different from the 
>> case with structs).  For the case of 'f', all of its FunctionDecls 
>> are chained together, and the same goes for 'g'.  There is, however, 
>> no notion of an entity or concept in the ASTs or other clang data 
>> structures that represents 'f' itself.
>>
>
> I don't really understand what you mean by "f itself". In the example 
> above, we have two identical function declarations and one function 
> (for "f"). This accurately reflects the source code (which is our 
> goal). I could imagine higher level convenience functions that might 
> be useful for some clients, however I think the AST is fundamentally 
> correct in this instance.

This is what I think too. It's better if the AST is as simple as 
possible; higher-level semantic information could be built on top 
of the AST instead of being embedded in it.
For example, there could be a framework like the analysis framework, 
that gets "fed" ASTs from multiple translation units and can reason 
about declarations as a whole.
A client could query it to determine that the "f" function is declared 
'here' in this file and defined 'there' in that file.
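
A minimal sketch of what such a layer might look like, kept entirely 
outside the AST. Every name here (DeclSite, SymbolIndex, addDecl, 
declarationsOf, definitionOf) is hypothetical and only for illustration; 
none of them are existing clang APIs. It assumes something walks each 
translation unit's AST and feeds one record per declaration into the index.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one declaration seen in some translation unit.
struct DeclSite {
  std::string file;   // file the declaration appeared in
  unsigned line;      // line of the declaration
  bool isDefinition;  // true if this declaration has a body
};

// Hypothetical index layered on top of the ASTs (not a clang class).
class SymbolIndex {
  std::map<std::string, std::vector<DeclSite>> sites;

public:
  // Called once per declaration while walking each translation unit.
  void addDecl(const std::string &name, const DeclSite &site) {
    sites[name].push_back(site);
  }

  // Every recorded declaration of `name`, in the order it was fed in.
  std::vector<DeclSite> declarationsOf(const std::string &name) const {
    auto it = sites.find(name);
    return it == sites.end() ? std::vector<DeclSite>{} : it->second;
  }

  // The first recorded definition of `name`, or nullptr if none was seen.
  const DeclSite *definitionOf(const std::string &name) const {
    auto it = sites.find(name);
    if (it == sites.end())
      return nullptr;
    for (const DeclSite &s : it->second)
      if (s.isDefinition)
        return &s;
    return nullptr;
  }
};
```

A client would then ask declarationsOf("f") for the 'here' and 
definitionOf("f") for the 'there', without the AST itself having to 
carry a per-symbol entity.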

-Argiris

>
>> Here is an example of why not having an explicit concept for 'f', 
>> 'g', or any symbol is problematic.
>>
>> Consider:
>>
>>   extern int h(int* x) __attribute__((nonnull)); 
>>   extern int h(int *x);
>>   extern int h(int* x) __attribute__((noreturn)); 
>>
>> This code is completely valid.  In the ASTs we create three 
>> FunctionDecls, the first having the attribute "nonnull" attached to 
>> it (an object of type NonNullAttr) and the third having the 
>> attribute "noreturn" attached to it (an object of type NoReturnAttr).
>>
>> Suppose I had a client (e.g., code generation, static analysis) that 
>> wanted to know all the attributes attached to a given function.  How 
>> would I go about doing this?  Given one of these FunctionDecls, I 
>> would have to iterate the chain of FunctionDecls and query each one 
>> of its attributes.  This seems a little cumbersome, and causes 
>> separate clients to have to implement their own logic for querying 
>> information about "symbols" in a translation unit.  It also causes 
>> clients to think about internal representations such as the fact that 
>> FunctionDecls are chained, something we may wish to change at any 
>> moment in the future.
>>
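
For illustration, the walk described above can be sketched with a toy 
model. ToyFunctionDecl and collectAttrs are hypothetical stand-ins, not 
clang's FunctionDecl or any clang API, and attributes are modelled as 
plain strings. The sketch assumes the caller starts from the most recent 
declaration, so that following the `previous` links reaches every earlier one.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a function declaration (NOT clang's FunctionDecl).
struct ToyFunctionDecl {
  std::string name;
  std::vector<std::string> attrs;   // attributes written on this declaration
  const ToyFunctionDecl *previous;  // earlier declaration of the same function
};

// Walk the chain from `latest` backwards and gather every attribute
// that appears on any declaration of the function.
std::set<std::string> collectAttrs(const ToyFunctionDecl *latest) {
  std::set<std::string> all;
  for (const ToyFunctionDecl *d = latest; d != nullptr; d = d->previous)
    all.insert(d->attrs.begin(), d->attrs.end());
  return all;
}
```

With the three declarations of 'h' above modelled as a chain, starting 
from the third one yields both "nonnull" and "noreturn". This is the 
kind of loop each client would otherwise have to write for itself.
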
>
> As far as the ASTs go, I really don't see the hardship here. The fact 
> that the FunctionDecls are chained accurately reflects the source 
> code...doesn't it? For me, the problem with the chain is memory 
> efficiency (more than convenience). In C, it is fairly uncommon to 
> have more than one function decl for the same name (yet every 
> FunctionDecl has a chain!). Nevertheless, we already have some bloat 
> in FunctionDecl...every prototype has a Body slot (ouch). Clearly room 
> for improvement here.
>
> A related issue which I consider more problematic is the lack of *any* 
> "chain" for VarDecls. Consider the following code:
>
> int i4;
> int i4;
> extern int i4;
>
> const int a [1] = {1};
> extern const int a[];
>
> extern const int b[];
> const int b [1] = {1};
>
>
> At the moment, there is no way to get to the previous 
> declaration! Since I've already whined about the memory inefficiency 
> for FunctionDecl's, I certainly wouldn't recommend adding a chain for 
> all VarDecls!
>
> I remember writing Sema::CheckForFileScopedRedefinitions() where I had 
> to deal with this. Fortunately, Sema's "IdResolver" came to the rescue 
> (thanks Argiris:-). That said, my gut says it might be worth using an 
> IdResolver-like mechanism to solve this "navigation problem" for 
> *both* VarDecls and FunctionDecls. Architecturally, it would make 
> sense for this new API to be part of ASTContext.
>
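
One way to picture such a mechanism is a side table that records 
declarations by name, so that individual decls need no chain pointer of 
their own. The sketch below is only that: ToyDecl, RedeclTable, add and 
previousOf are hypothetical names, not part of Sema, IdResolver or 
ASTContext, and keying on the bare name is a simplification that ignores 
scopes (it only models file-scope declarations).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a VarDecl or FunctionDecl (NOT a clang class).
struct ToyDecl {
  std::string name;
  unsigned line;
};

// Hypothetical side table: declarations grouped by name, in source order.
class RedeclTable {
  std::unordered_map<std::string, std::vector<const ToyDecl *>> byName;

public:
  // Record `d` and return the declaration of the same name that
  // preceded it, or nullptr if `d` is the first one.
  const ToyDecl *add(const ToyDecl *d) {
    std::vector<const ToyDecl *> &chain = byName[d->name];
    const ToyDecl *prev = chain.empty() ? nullptr : chain.back();
    chain.push_back(d);
    return prev;
  }

  // Look up the declaration that preceded `d`, or nullptr if there is none
  // (or if `d` was never recorded).
  const ToyDecl *previousOf(const ToyDecl *d) const {
    auto it = byName.find(d->name);
    if (it == byName.end())
      return nullptr;
    const std::vector<const ToyDecl *> &chain = it->second;
    for (std::size_t i = 1; i < chain.size(); ++i)
      if (chain[i] == d)
        return chain[i - 1];
    return nullptr;
  }
};
```

Names that are declared only once cost a single table entry, and the 
same lookup serves both variables and functions.
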
> Thoughts?
>
> snaroff
>
>> This email isn't really a proposal of a solution; I'm just raising an 
>> issue to see if anyone has any comments.  After the last few weeks 
>> I've been excited by our discussions of DeclGroups and TypeSpecifiers 
>> that will solve many of the remaining issues with faithfully 
>> representing syntax in the ASTs.  At the same time, I think we need 
>> to pay a little more attention to the semantics, and providing 
>> infrastructure that would be useful for many clients.
>>
>> Indeed, some of our changes to improve our capturing of syntax have 
>> actually weakened some of our clients' reasoning about semantics.  For 
>> example, by splitting separate struct declarations into multiple 
>> RecordDecls we actually (initially) broke CodeGen because the CodeGen 
>> library assumed that there was a direct 1-1 mapping between a 
>> RecordDecl and the concept it represented.  That particular case was 
>> easily resolved by using the RecordType instead of the RecordDecl to 
>> represent the 'struct', but I'd be willing to wager that there are 
>> other issues that haven't surfaced yet because RecordTypes are being 
>> used in this way (by all clients).
>>
>> Thoughts?
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu <mailto:cfe-dev at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>


