[cfe-dev] extern "C" support

Chris Lattner clattner at apple.com
Sun Jan 6 13:35:52 PST 2008


On Jan 2, 2008, at 2:22 PM, Mike Stump wrote:
> On Dec 22, 2007, at 5:28 PM, Chris Lattner wrote:
>> +private:
>> +  /// Language - The language for this linkage specification.
>> +  Language language;
>> +  bool inside_braces;
>> +  Decl *D;
>>
>> inside_braces is good for "remembering" the original source,
>
> I did that merely because we're calling them ASTs and because we  
> form ParenExprs.  If we didn't want ASTs, we could get rid of  
> ParenExprs?

I don't understand what you're saying here.  ASTs can take many forms.

>> but unless there is a client,
>
> -ast-print uses that information to print the ast. Absent that  
> information we can't faithfully reproduce the syntax tree.  Do we  
> care that the printed ast is wrong?  Do we care if the ast isn't  
> actually an ast?

You seem to be using an overly constrained definition of what an AST  
is :).  At this point, I'm happy with -ast-print emitting *valid*  
code, but it does not have to be exactly what the user wrote.  In this  
specific case, I'd be fine with -ast-print always emitting {}'s for  
example.

clang does not preserve absolutely everything that the user wrote, and  
I don't think we can achieve that without significant expense to other  
goals.  That said, in this specific case, there is no space overhead,  
so preserving the info is fine.

>> I'd suggest not including it.  How does your Decl structure handle  
>> stuff like:
>>
>> extern "C" {
>> int x;  int y;
>> }
>
> work in progress... but roughly, an AST should have one linkage spec  
> with two decls under it.

Ok.

>> Is there just one LinkageSpecDecl?
>
> Yes.

This makes sense.

>> +++ ./Parse/Parser.cpp	2007-12-21 13:58:52.000000000 -0800
>> @@ -378,6 +378,12 @@ Parser::DeclTy *Parser::ParseDeclaration
>>      return ParseObjCAtInterfaceDeclaration(AtLoc,
>> DS.getAttributes());
>>  }
>>
>> +  if (Tok.is(tok::string_literal)
>> +      && DS.getStorageClassSpec() == DeclSpec::SCS_extern)
>> +    {
>> +      return ParseLinkage(Declarator::FileContext);
>>
>> I'm not sure if this is sufficient:
>
> It wasn't, but that was a known issue.

Please add FIXME markers to everything you *know* is incomplete or  
incorrect.

>> does this allow 'extern int "C"'  or 'extern inline "C"'
>> The DeclSpec::getParsedSpecifiers() method can be used to make sure  
>> there is just a storage class specified.
>
> Ah, perfect, I've fixed it and tested extern inline "C"; that is now  
> rejected.

Nice.

>> +Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
>> +				     std::string Lang,
>> +
>>
>> Please don't pass std::strings by-value, pass by const reference,  
>> or better yet, pass in the string as a const char* or something  
>> like that.
>
> Ok, fixed.

Ok, some more thoughts:

Many of the changes to AST/Decl.cpp and the change to Type.h are  
unrelated to this patch.  It would be good to split them out as they  
aren't controversial at all :)

+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *B,  
LinkageSpecDecl *LS) {
+
+}

Please have this emit an unsupported warning or something (see  
CodeGenModule::WarnUnsupported) so that we know when this is  
happening.  Also, please add a FIXME.

+/// LinkageSpecDecl - This represents a linkage specification.
+class LinkageSpecDecl : public Decl {

As I requested previously, please add an example.


+  /// Language - The language for this linkage specification.
+  Language language;

"Language language" is strange.  I'd suggest renaming the enum to  
LanguageIDs or something.

+  bool inside_braces;

If you want to keep this, please make it a bitfield and pack it with  
the enum.  Also, please rename to IsInsideBraces or something for  
consistency with the rest of the api.

+  bool is_inside_braces() { return inside_braces; }

Please name this 'isInsideBraces()' for consistency with the other API  
and mark it const.

+  Decl *getD() { return D; }

How about getDecls() ?


+
+  virtual DeclTy *ActOnLinkageSpec(SourceLocation Loc, const char *,  
bool inside_braces,
+				   DeclTy *D) {
+    return 0;
+  }

Please stay in 80 columns and use mixed caps instead of _'s.

+++ ./Parse/Parser.cpp	2008-01-02 12:58:14.000000000 -0800
@@ -385,6 +385,13 @@ Parser::DeclTy *Parser::ParseDeclaration
      return ParseObjCAtInterfaceDeclaration(AtLoc, DS.getAttributes());
    }

+  if (Tok.is(tok::string_literal)
+      && DS.getStorageClassSpec() == DeclSpec::SCS_extern
+      && DS.getParsedSpecifiers () ==  
DeclSpec::PQ_StorageClassSpecifier)
+    {
+      return ParseLinkage(Declarator::FileContext);
+    }
+

Ok, but please format with the llvm style (see surrounding code) which  
has the &&'s on the previous line, no space before the (), and either  
drop the {} or make them K&R style.  Likewise in the implementation of  
ParseLinkage.


+// C++ 7.5p2
+Parser::DeclTy *Parser::ParseLinkage(unsigned Context) {
+  std::string lang = PP.getSpelling(Tok);

Please add an assertion that the Tok.is(stringliteral), and a comment  
above the method, like all other parser methods have.

Also, this is the expensive form of getSpelling, because it has to do  
heap traffic to allocate the string and a copy to move it over.   
Instead, please use something like this:

   llvm::SmallVector<char, 8> LangBuffer;
   LangBuffer.resize(Tok.getLength());    // LangBuffer is guaranteed  
to be big enough.
   const char *LangBufPtr = &LangBuffer[0];

   unsigned StrSize = PP.getSpelling(Tok, LangBufPtr);

   ... use LangBufPtr and StrSize ...

In the common case of no cleaning being needed, getSpelling returns a  
pointer into the input buffer to avoid a copy, and no heap allocation  
is required (even with cleaning) unless the language is greater than 8  
characters.

+  if (Tok.is(tok::l_brace))
+    {
+      SourceLocation LBrace = ConsumeBrace();
+      saw_braces = true;
+      while (Tok.isNot(tok::r_brace) && Tok.isNot(tok::eof)) {
+	// FIXME capture the decls.
+	D = ParseExternalDeclaration();
+      }
+
+      SourceLocation RBrace = MatchRHSPunctuation(tok::r_brace,  
LBrace);
+    }
+  else
+    D = ParseDeclaration(Context);

In addition to the formatting issues above, please rearrange this to  
make the small case come first.  This makes the code easier to read  
IMO.  Something like this:

+  if (Tok.isNot(tok::l_brace))
+    D = ParseDeclaration(Context);
+  else
+    {
+      SourceLocation LBrace = ConsumeBrace();
+      saw_braces = true;
+      while (Tok.isNot(tok::r_brace) && Tok.isNot(tok::eof)) {
+	// FIXME capture the decls.
+	D = ParseExternalDeclaration();
+      }
+
+      SourceLocation RBrace = MatchRHSPunctuation(tok::r_brace,  
LBrace);
+    }

Also, RBrace is dead, so you can just remove its declaration.  Better  
yet, you could pass *it* into ActOnLinkageSpec instead of  
'inside_brace'.  We typically prefer to pass all location info into  
the actions module, and have Sema drop it, rather than having the  
parser drop it.  Passing in the location of the {}'s would allow you  
to drop the inside_brace argument from the Action's module because the  
impl could just check to see if they are null or not.

+
+  return Actions.ActOnLinkageSpec(Loc, lang.c_str(), saw_braces, D);

Can D be null here?  If so, you should check for it and return an  
error instead of building an invalid AST node.  One of the invariants  
of the AST is that we don't create bogus ASTs if there is a parse error.

+Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
+				     const char *Lang,
+				     bool inside_brace,
+				     DeclTy *D) {

This looks good, but please clean up the formatting above and add a  
fixme if there are semantic checks missing :)

Thanks Mike,

-Chris




More information about the cfe-dev mailing list