[cfe-dev] C++ default arguments patch, rework Parse-Sema interaction for function parameters

Chris Lattner clattner at apple.com
Mon Apr 7 22:32:54 PDT 2008


On Apr 7, 2008, at 6:11 AM, Doug Gregor wrote:
>> Of the two, I like #1 the best: it is very explicit, and it doesn't  
>> require
>> cloning around ASTs unnecessarily.  The down side about it is that  
>> clients
>> have to handle one more node type, but I don't think this is a  
>> killer.
>> Being an explicit node like this makes it easier to handle, and we  
>> keep the
>> invariant that the number of arguments to a call is >= the of formal
>> arguments for the callee.  What do you think?
>
> That's reasonable. The updated patch contains a new node
> CXXDefaultArgExpr; most of the changes in this patch just deal with
> that new node in the statement printer, serialization, code generator,
> etc.

Looks great, I like this approach.  Instead of dwelling on details and  
forcing you to keep the patch up to date, I committed it here (after  
resolving a few conflicts that came up in the short time since you  
sent it out!):
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080407/005079.html

Some more nit-picky stuff is included below.

> The code generator bits were no fun at all, because I ended up having
> to duplicate the handling of CXXDefautArgExpr for CGExprComplex,
> CGExprConstant, CGExprScalar, and CGExprAgg. I hope there's a better
> way to do this, but I didn't dig deep into the CodeGen module before I
> convinced myself there wasn't. CXXDefaulArgExpr is a *great* candidate
> for a pre-CodeGen lowering phase :)

:) The funny thing about it is that the code being lowered isn't  
*exactly* the same (the methods return different things).  In llvm- 
gcc, we actually only have two code paths (one for scalars and one for  
aggregates) instead of three... and they are actually interlaced  
together and active in a context sensitive way.  Not having a codepath  
specialized for _Complex means that we get substantially inferior code  
in some cases, and having them interlaced together makes the code much  
more complex and difficult to understand.

I won't claim with certainty that the approach taken by the clang  
codegen is the one will end up with, but it's a step forward.  A  
bigger issue is that a lot of it has to be duplicated (same in llvm- 
gcc) for constant initializers, which is annoying, but doesn't affect  
default arguments at least. :)

> There's a new test for the CodeGen bits, but I don't know how to
> automate the checks we need. If you build it with
> -DCLANG_GENERATE_KNOWN_GOOD, it writes out the default arguments
> explicitly at the call site; otherwise, it lets the front end fill in
> the default arguments. Ideally, we would build this test both ways and
> "diff" the byte-code produced from each (that's what I did to verify
> the correctness of the CodeGen additions), but I didn't implement that
> in the test harness.

The codegen tests currently mostly test that the compiler doesn't  
crash on some language feature.  I think this is fairly reasonable  
(and follows what we do with llvm-gcc).  More comprehensive tests will  
include the whole programs from the llvm-test suite, and ultimately  
building llvm itself with clang ;-).

> [Aside: StmtPrinter did me a great favor by failing to link when I
> forgot to add in the CXXDefaultArgExpr case. The serialization code
> and CodeGen modules, and perhaps even the isConstantExpr/isLvalue/etc
> routines, could all benefit from this kind of built-in safety.]

This is a great idea.  Some of the code could probably be auto- 
generated as well.  I'll add it to my (huge) todo list.

>> @@ -768,7 +817,39 @@ Sema::ActOnDeclarator(Scope *S, Declarat
>>
>> +    // When we have a prototype for a named function, create the
>> +    // parameter declarations for NewFD. C++ needs these early, to
>> +    // deal with merging default arguments properly.
>>
>> I'm not sure what this is doing.  It seems that it could be  
>> factored better
>> somehow, but I'm not exactly sure what is going on.  Can you give  
>> an example
>> of when this is needed?
>>
> I've updated the comment to this:
>
>    // Copy the parameter declarations from the declarator D to
>    // the function declaration NewFD, if they are available.
>
> Basically, the parameter declarations have all been created for the
> function declarator (D)... the parser has called ActOnParamDeclarator
> for each parameter, and put them into the function declarator. This
> code is copying those parameter declarations into the FunctionDecl for
> the function. We used to create the parameter declarations at the same
> time we filled in the FunctionDecl (at definition time), but now we do
> it as early as possible.

Ok!

>> Ok, I guess my basic objection is the duplication of this logic:
>>
>> +      if (FTI.NumArgs == 1 && !FTI.isVariadic &&  
>> FTI.ArgInfo[0].Ident == 0
>> &&
>> +          FTI.ArgInfo[0].Param &&
>> +
>> !((ParmVarDecl*)FTI.ArgInfo[0].Param)->getType().getCVRQualifiers()  
>> &&
>> +          ((ParmVarDecl*)FTI.ArgInfo[0].Param)->getType()- 
>> >isVoidType()) {
>>
>> Can this be factored out into a predicate method and shared with  
>> the other
>> code that does it?
>
> At the moment, this is the only place that does this check.

Ah ok!  Please disregard my objection, the code looked familiar so I  
was sure it was copied from somewhere.  I didn't realize you also  
*deleted* the original copy.

> It used to
> be in ActOnStartOfFunctionDef, but I've moved it here. Granted, I
> don't like that we do this test in Sema... I would much rather handle
> (void) parameter lists in the parser, and just put a flag on the decl
> to say whether "void" was given to denote an empty parameter list (or,
> to be very picky, we could have an EmptyParamListType node that has
> the actual "type" the user wrote, if any). Handling it in the parser
> makes especially good sense for C++ (and perhaps C89? I don't have the
> C standards handy) where the parameter list has to be spelled
> (void)... it can't be (T) where T is a typedef of void as I think it
> can in C99.


I'm not sure about that.  clang tracks typedefs (and lack thereof)  
well enough that Sema can decide whether it was spelled "(void)" or  
"(T)".  Also, we do have to handle the "(T)" case for C99.  It seems  
to me that handling this in the parser would require handling it in  
both the parser and Sema.



Here are some of the increasingly minor nit picky details from your  
patch:

+++ include/clang/AST/ExprCXX.h	(working copy)

I changed the comment above the CXXDefaultArgExpr to use ///'s so that  
doxygen picks it up.


+    // Retrieve the parameter that the argument was created from
+    const ParmVarDecl *getParam() const { return Param; }

Please terminate sentence comments with a '.'.  Yes it's pedantic, but  
it makes the comments more consistent and read more nicely :).  This  
occurs in many places in your patch.

+    virtual SourceRange getSourceRange() const {
+      return Param->getDefaultArg()->getSourceRange();
+    }

I don't think that this is right.  The source range of the  
CXXDefaultArgExpr should be empty, not the source range of the input.   
The default argument doesn't occur in the source, so it shouldn't have  
a location.

+++ lib/AST/ExprCXX.cpp	(working copy)

+// CXXDefaultArgExpr
+Stmt::child_iterator CXXDefaultArgExpr::child_begin() {
+  return reinterpret_cast<Stmt**>(Param->getDefaultArg());
+}

This is subtly wrong.  The child iteration stuff should walk the  
expression tree in the current function, it shouldn't "jump the gap  
here".  I suspect that -ast-dump will have problems with  
CXXDefaultArgExpr's due to this.  I'd just have CXXDefaultArgExpr just  
return an empty child range.


  +void
+Sema::ActOnParamDefaultArgument(DeclTy *param, SourceLocation EqualLoc,
+                                ExprTy *defarg) {
...
+    // Some default arguments were missing. Clear out all of the
+    // default arguments up to (and including) the last missing
+    // default argument, so that we leave the function parameters
+    // in a semantically valid state.
+    for (p = 0; p <= LastMissingDefaultArg; ++p) {
+      ParmVarDecl *Param = FD->getParamDecl(p);
+      if (Param->getDefaultArg()) {
+        delete Param->getDefaultArg();
+        Param->setDefaultArg(0);
+      }
+    }

Should these ParmVarDecl's be marked as erroneous?  This might be a  
good idea if there are secondary errors that can occur because they  
are missing default initializers.  I can't think of any specific  
examples off the top of my head though, your call.

+++ lib/Sema/SemaExpr.cpp	(working copy)

    // Make the call expr early, before semantic checks.  This  
guarantees cleanup
    // of arguments and function on error.
-  llvm::OwningPtr<CallExpr> TheCall(new CallExpr(Fn, Args, NumArgs,
+  if (getLangOptions().CPlusPlus && FDecl && NumArgs < FDecl- 
 >getNumParams())
+    NumArgsPassed = FDecl->getNumParams();
+  llvm::OwningPtr<CallExpr> TheCall(new CallExpr(Fn, Args,  
NumArgsPassed,
                                                   Context.BoolTy,  
RParenLoc));

I don't think this is safe.  The CallExpr ctor reads the range  
[Args,Args+NumArgsPassed), so changing NumArgsPassed in this way makes  
it potentially read beyond the end of the args array.


+    // If too few arguments are available (and we don't have default
+    // arguments for the remaining parameters), don't make the call.
+    if (NumArgs < NumArgsInProto) {
+      if (getLangOptions().CPlusPlus &&
+          FDecl &&
+          FDecl->getParamDecl(NumArgs)->getDefaultArg()) {
+        // Use default arguments for missing arguments
+        NumArgsToCheck = NumArgsInProto;
+      } else
+        return Diag(RParenLoc, diag::err_typecheck_call_too_few_args,
+                    Fn->getSourceRange());
+    }

This logic is reasonable, but what do you think about adding a new  
method to FunctionDecl, something like "unsigned  
getMinRequiredArguments() const" or something like that, which would  
scan the argument list for default arguments.  I think this is a  
generally useful method, and could simplify this down to:

if (NumArgs < NumArgsInProto) {
   if (FDecl && NumArgs >= FDecl->getMinRequiredArguments())
      ...

which would be simpler :).  It could also be used to simplify the  
logic above.


+++ lib/Sema/SemaType.cpp	(working copy)
@@ -394,7 +394,7 @@ QualType Sema::GetTypeForDeclarator(Decl
          llvm::SmallVector<QualType, 16> ArgTys;

          for (unsigned i = 0, e = FTI.NumArgs; i != e; ++i) {
-          QualType ArgTy =  
QualType::getFromOpaquePtr(FTI.ArgInfo[i].TypeInfo);
+          QualType ArgTy = ((ParmVarDecl *)FTI.ArgInfo[i].Param)- 
 >getType();


Please do the cast into a temporary 'ParmVarDecl*' at the top of the  
loop instead of doing the cast three times in the loop.

+++ lib/Sema/SemaDeclObjC.cpp	(working copy)

  +  // Introduce all of the othe parameters into this scope
    for (unsigned i = 0, e = MDecl->getNumParams(); i != e; ++i) {

typo "othe", should end with a period.

+++ lib/AST/Expr.cpp	(working copy)

@@ -1009,6 +1021,9 @@ bool Expr::isNullPointerConstant(ASTCont
      // Accept ((void*)0) as a null pointer constant, as many other
      // implementations do.
      return PE->getSubExpr()->isNullPointerConstant(Ctx);
+  } else if (const CXXDefaultArgExpr *DefaultArg =  
dyn_cast<CXXDefaultArgExpr>(this)) {
+    // See through default argument expressions
+    return DefaultArg->getExpr()->isNullPointerConstant(Ctx);
    }

80 columns.

+++ lib/Parse/ParseDecl.cpp	(working copy)
@@ -1232,8 +1232,10 @@ void Parser::ParseParenDeclarator(Declar

  ///         declaration-specifiers declarator
+/// [C++]   declaration-specifiers declarator '=' assignment-expression
  /// [GNU]   declaration-specifiers declarator attributes
  ///         declaration-specifiers abstract-declarator[opt]
+/// [C++]   declaration-specifiers abstract-declarator[opt] '='  
assignment-expression
  /// [GNU]   declaration-specifiers abstract-declarator[opt] attributes

80 columns.

Thanks again Doug, this is great work!

-Chris




More information about the cfe-dev mailing list