[cfe-dev] extern "C" support

Sat Jan 5 15:14:55 PST 2008

Mike Stump wrote:
> On Jan 3, 2008, at 1:10 PM, Bjørn Roald wrote:
>>> -ast-print uses that information to print the ast. Absent that 
>>> information we can't faithfully reproduce the syntax tree.  Do we 
>>> care that the printed ast is wrong?  Do we care if the ast isn't 
>>> actually an ast?
>
>> Does this support nested linkage specifiers as well?
>
> I don't see why it would not, though, I didn't test it.  I did write 
> it to work in general, including that case.

Good, I was just wondering :-)
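
To be concrete about what "nested" means here, a minimal sketch follows. 
The function names are made up for illustration, and this is not one of 
the actual test cases.  When linkage specifications nest, the innermost 
one determines the linkage, so the functions in the inner block have 
C++ linkage and may be overloaded, while c_add has C linkage.

```cpp
extern "C" {
    // C linkage: the name is not mangled, so it cannot be overloaded.
    int c_add(int a, int b) { return a + b; }

    extern "C++" {
        // Nested specifier: the innermost one wins, so these two
        // functions have C++ linkage and overloading is allowed.
        int twice(int x) { return 2 * x; }
        long twice(long x) { return 2 * x; }
    }
}
```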

> [ testing ] Yup, works just fine.
Good.

I will see how much time I can spend on this, but for now I have 
written a few more test cases.  I am trying to work systematically 
through the cases in

http://david.tribble.com/text/cdiffs.htm

from the "Changes to C99 versus C++98" section onward, i.e.: 
http://david.tribble.com/text/cdiffs.htm#C90-vs-CPP98

I figure we might as well sort out these conflicts before we go on 
to the C++-specific stuff.  If a case is already handled, I just write 
the appropriate test cases and move on to the next issue.  This includes 
writing tests for C90 and C99 to verify that the differences are honored 
under -pedantic.  If I find cases that are not supported, I intend to 
post to the list before I consider starting work on them.  I am a 
little unclear on how aggressive (incremental) I should be in posting 
the test code I have added.  I do not have a good feel for how much 
attention the maintainers with SVN write access give to applying 
patches, and it does not make sense to post patches and additions to 
the list if nobody picks them up.  Would Bugzilla be a better way to 
keep things from getting lost?
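
For illustration only, here is the flavour of check I have in mind, 
using one well-known C/C++ difference: a character literal has type 
char in C++ but type int in C.  This is a sketch and not one of the 
tests I have written; the helper names are invented, and static_assert 
assumes a C++11 or later compiler.

```cpp
// In C++ a character literal has type char, so its size is 1.
// (In C the same literal has type int.)
static_assert(sizeof('a') == sizeof(char),
              "character literal should have type char in C++");

// Overload resolution shows the same thing at run time: the char
// overload is an exact match for 'a' in C++.
inline int kind(char) { return 1; }
inline int kind(int)  { return 2; }

inline int literal_kind() { return kind('a'); }
```

A matching C test would check the opposite, sizeof('a') == sizeof(int).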

>> Should we not let support for linkage be a backend issue (e.g. llvm 
>> CodeGen), not frontend, as it is now, the AST is unusable for 
>> backends supporting linking to other languages.
>
> There is an extra not above...

Well, yes - such negation is common in my native language to emphasise a 
point, funny thing really :-)  But I should probably avoid it in 
English, sorry.

>> This would mean a change of Sema::ActOnLinkageSpec to support more 
>> lang_xxx enums and/or  pass the language string to the AST consumer.
>
>> And let the backend codegen emit diagnostics, something similar to 
>> pseudocode:
>
>> any thoughts ?
>
> Sure, my thought would be, if one has a plug in for a code generator 
> that supports a new language, call it Pascal, if you want, it does 
> seem reasonable to have the supported by a previously compiled front 
> end.  Do we have such a clean architecture?  

I don't know if we do.  The obvious choice is to try to keep a 
complete list of relevant programming languages as enums.  That would 
add no cost, but carries the danger of missing a language or two.  That 
is probably not very likely if we can get a list of, let us say, 
the 20 most likely languages and language binding technologies.  One 
challenge, however, is that we also need the correct, authoritative 
language string.  The C++ standard offers some guidance here.

If we are to pass the strings or make them accessible to AST consumers, 
the following thoughts come to mind (before I have dug in):

- Pass the string in the linkage specifier AST node  --> very costly and 
probably inefficient.

- Let the AST consumer, if it needs to, deal with lang_other by 
accessing the source through the source location info in the AST  --> 
maybe not a bad idea in combination with a fairly complete list of enums.

- Use some efficient way of referring to language strings, as is done 
with types or identifiers.  Maybe there is something in the AST that can 
be reused more or less as is; I have not dug in to look.

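
A rough sketch of the third option, combined with the enum list.  All 
names here (LinkageLanguage, LanguageStringTable, LinkageSpecNode, 
makeLinkageSpec) are hypothetical and are not taken from the current 
AST.  The idea is only that the node stores a small index and the 
spelling is kept once in a shared table.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical enum; lang_other covers anything not listed.
enum LinkageLanguage { lang_c, lang_cxx, lang_other };

// Hypothetical table that stores each language spelling only once.
class LanguageStringTable {
    std::vector<std::string> strings_;
public:
    // Returns a stable index for the spelling, adding it if it is new.
    std::size_t intern(const std::string &s) {
        for (std::size_t i = 0; i < strings_.size(); ++i)
            if (strings_[i] == s) return i;
        strings_.push_back(s);
        return strings_.size() - 1;
    }
    const std::string &spelling(std::size_t id) const { return strings_[id]; }
};

// Hypothetical AST node: an enum plus an index, no string of its own.
struct LinkageSpecNode {
    LinkageLanguage lang;
    std::size_t spelling_id;
};

inline LinkageSpecNode makeLinkageSpec(LanguageStringTable &table,
                                       const std::string &s) {
    LinkageLanguage lang =
        s == "C" ? lang_c : (s == "C++" ? lang_cxx : lang_other);
    LinkageSpecNode node = { lang, table.intern(s) };
    return node;
}
```

With this, a consumer that sees lang_other can still ask the table for 
the original spelling, and the per-node cost stays at one index.
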
> There is a cost in that though, space, ownership of strings and the 
> like.  Pay now, or pay when someone wants the feature, I'll leave that 
> for others to chime in with.

Fair enough; my only concern is avoiding major changes to the AST later.

> Also, if you defer it, do you give up issuing an error when you say:
>
> extern "unknown" static int i;
> void foo() {
> }
>
> and i is deleted by the optimizer because it isn't used?  

I am probably missing something, but it sounds like you are saying we 
optimize in the Parser and Sema.  Are we?  I thought all of this would 
be in the AST no matter what.

> Do we want that or not?  

I guess that for building libs and apps it may not matter if this 
diagnostic gets lost.  For other uses of the AST, I guess it depends 
on the use case whether it should care.  Some use cases should probably 
never emit such diagnostics, as they have no concept of an unsupported 
language.  Say you do AST-based pretty printing, code refactoring, or 
edit/build-time introspection - what defines whether linkage to a 
language is legal?  But then it may be wrong for the AST information to 
be lost in the first place, diagnostics or not.
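
As a sketch of what "the use case decides" could look like, assuming 
the language string reaches the consumer at all: each consumer states 
which languages it accepts, and the check walks every linkage 
specification whether or not the declaration is later used.  The 
classes below are invented for illustration and do not correspond to 
any existing interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a linkage specification node.
struct LinkageSpec { std::string language; };

// Hypothetical consumer base class: each consumer decides for itself
// which linkage languages are acceptable.
class Consumer {
public:
    virtual ~Consumer() {}
    virtual bool acceptsLanguage(const std::string &lang) const = 0;

    // Visits every spec and collects one message per rejected language.
    std::vector<std::string> check(const std::vector<LinkageSpec> &specs) const {
        std::vector<std::string> diags;
        for (std::size_t i = 0; i < specs.size(); ++i)
            if (!acceptsLanguage(specs[i].language))
                diags.push_back("unsupported linkage language \"" +
                                specs[i].language + "\"");
        return diags;
    }
};

// A pretty printer has no notion of an unsupported language.
class PrettyPrinter : public Consumer {
public:
    bool acceptsLanguage(const std::string &) const { return true; }
};

// A code-generator-like consumer only knows C and C++.
class CodeGenLike : public Consumer {
public:
    bool acceptsLanguage(const std::string &lang) const {
        return lang == "C" || lang == "C++";
    }
};
```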

> Anyway, I'd rather get the entire rest of the core of the language 
> done than fiddle the fine details.  If written well, they are flexible 
> enough to change.
>
> I did postpone it out of the parser so that one could have a parser 
> parse the above, if they wanted.  :-)

Fair enough.  I don't intend to touch CodeGen for now, so I will most 
definitely leave it  ;-)

-- 
Bjørn