[cfe-commits] [PATCH] C++0x range-based for loop

Wed Apr 13 16:04:23 PDT 2011

Hi,

Attached is an updated for-range patch. The main differences from the
previous patch are noted below.

On Sun, March 27, 2011 11:57, Douglas Gregor wrote:
> On Mar 21, 2011, at 12:56 AM, Richard Smith wrote:
>> The attached patch adds support for C++0x's range-based for loops to
>> clang, according to the current draft (N3242). Specifically:
>>
>> A new Stmt subclass, CXXForRangeStmt, has been added. This class
>> provides two views of the for-statement: a partially-desugared
>> representation (just enough desugaring is done to be able to perform the
>> semantic checks needed), and an implicit source-as-written
>> representation (extracted from the partially-desugared representation).
>>
>> In order to avoid tentative parsing, the parsing of the
>> for-range-declaration is performed by ParseSimpleDeclaration. The
>> for-range-initializer is parsed there too, in order to ensure that the
>> for-range-declaration is not in scope in the initializer: "int a =
>> {1,2,3}; for (auto a : a)" appears to be legal.
>>
>> Most of the semantic analysis for a for-range statement is performed
>> before the body is parsed, so that the type of the loop variable can be
>> deduced (in case it contains 'auto'). Attaching the body is performed
>> as a separate finalization step.
>>
>> For a dependently-typed range expression, much of the semantic analysis
>> is deferred until template instantiation time. A new flag has been added
>> to VarDecl to indicate whether it's a for-range-declarator. This flag is
>> used to skip ActOnUninitializedDecl (specifically, while instantiating a
>>  template containing a dependent for-range statement); such declarators
>> are never actually uninitialized, but appear that way in the AST if the
>> required semantic analysis could not be performed.
>>
>> As a small extension, this implementation allows the deduced 'begin'
>> and 'end' iterators to be of differing types; this allows for-range to
>> be used for iterators where it's easy to determine whether an iterator
>> is at the end, but difficult or expensive to construct an explicit
>> iterator object representing the 'end', or to compare general iterators.

This extension has been removed: it didn't operate correctly during
template instantiation.

>> The wording for for-range statements seems likely to change at the
>> upcoming WG21 meeting in Madrid (per N3257). Resolution 2 for this
>> change can be implemented by changing the NeedsADL flag from true to
>> false in BuildForRangeBeginEndCall in SemaStmt.cpp.

This patch implements the finalized C++11 wording from N3271.

> @@ -1005,6 +1006,328 @@
> ForLoc, RParenLoc));
> }
>
>
> +namespace {
> +
> +/// Build a call to 'begin' or 'end' for a C++0x for-range statement.
> +/// FIXME: This is likely to be changed as the result of paper N3257.
>
>
> We now have final wording for this, which just adds a new step before the
> ADL-only lookup of begin/end. So I'll just comment on the parts that are
> the same.
>
> +static ExprResult BuildForRangeBeginEndCall(Sema &SemaRef, Scope *S,
> +                                            SourceLocation Loc,
> +                                            const char *Name, Expr *Arg)
> {
> +  // Find the appropriate begin or end function.
> +  IdentifierInfo *II = &SemaRef.PP.getIdentifierTable().get(Name);
> +  LookupResult R(SemaRef, II, Loc, Sema::LookupOrdinaryName);
> +
> +  if (SemaRef.StdNamespace)
> +    SemaRef.LookupQualifiedName(R, SemaRef.getStdNamespace());
>
>
> Qualified name lookup into namespace "std" is not the same as adding
> "std" as an associated namespace for ADL (there are differences w.r.t.
> using directives, for example). I suggest modeling the handling of "std"
> exactly as the working paper specifies.

This has been fixed. A new flag has been added to UnresolvedLookupExpr to
add namespace std to the usual set of associated namespaces for the call.

> +  if (R.isAmbiguous())
> +    return ExprError();
> +
> +  CXXScopeSpec SS;
> +  // In order to implement N3257 resolution 2, change this 'true' to
> 'false'.
> +  ExprResult Func = SemaRef.BuildDeclarationNameExpr(SS, R,
> /*NeedsADL=*/true);
> +  if (Func.isInvalid())
> +    return ExprError();
>
>
> Since we won't actually be performing any non-ADL name lookup for 'begin'
> or 'end', most of this can go away. It's probably best just to build the
> UnresolvedLookupExpr directly.

Done.

> Index: include/clang/AST/ASTContext.h
> ===================================================================
> --- include/clang/AST/ASTContext.h	(revision 127980)
> +++ include/clang/AST/ASTContext.h	(working copy)
> @@ -422,6 +422,10 @@
> CanQualType DependentTy;
> CanQualType ObjCBuiltinIdTy, ObjCBuiltinClassTy, ObjCBuiltinSelTy;
>
>
> +  // Types for deductions in C++0x [stmt.ranged]'s desugaring.
> +  QualType AutoDeductTy;     // Deduction against 'auto'.
> +  QualType AutoRRefDeductTy; // Deduction against 'auto &&'.
>
>
> It looks like we're missing serialization/deserialization code for these
> type nodes. Also, can we build them lazily via some accessors, rather
> than forcing initialization early in the bring-up of ASTContext? Most
> programs won't ever need them.

Done.

> +  // Deduce the type for the implicit range variable.
> +  QualType RangeType;
> +  if (Range->getType()->isVoidType() ||
> +      !DeduceAutoType(Context.AutoRRefDeductTy, Range, RangeType)) {
> +    // Deduction can fail, eg if the range is an overloaded function.
> +    Diag(Range->getLocStart(), diag::err_for_range_deduction_failure)
> +      << Range->getType() << Range->getSourceRange();
> +    // Pretend range type is dependent in order to allow us to carry on.
> +    RangeType = Context.AutoRRefDeductTy;
> +  }
>
> Pretending that the range type is dependent works fine when we're in a
> template, but doing so in non-template code can be problematic: there
> should be no dependent types within a non-template function, even when
> that function is ill-formed. I suggest bailing out earlier, or marking
> failure in some other way.

Done, we just bail out now.

> +    if (BeginExpr.get()->getType()->isVoidType() ||
> +        !DeduceAutoType(Context.AutoDeductTy, BeginExpr.get(),
> BeginType)) {
> +      Diag(RangeLoc, diag::err_for_range_iter_deduction_failure)
> +        << BeginExpr.get()->getType();
> +      NoteForRangeBeginEndFunction(*this, BeginExpr.get(), BEF_begin);
> +    }
>
> Do we really need the check for 'void' here? Can it be folded into
> DeduceAutoType, since essentially every user of DeduceAutoType will have
> to do a 'void' check?

We don't have to check for 'void' here; we do so in order to provide a
better diagnostic: "'void' type cannot be used as a range" rather than a
mysterious error about forming a reference to void.

> +    // C++0x [decl.spec.auto]p6: BeginType and EndType must be the same.
> We
> +    // allow them to differ as an extension, for structures where
> iterators +    // can tell whether they have reached the end, and where an
> explicit 'end' +    // iterator would be expensive to build.
> +    if (Context.getCanonicalType(BeginType) !=
> +        Context.getCanonicalType(EndType)) {
> +      Diag(RangeLoc, diag::ext_for_range_begin_end_types_differ);
> +      NoteForRangeBeginEndFunction(*this, BeginExpr.get(), BEF_begin);
> +      NoteForRangeBeginEndFunction(*this, EndExpr.get(), BEF_end);
> +    }
>
>
> It's a little cleaner to use "if(!Context.hasSameType(BeginType,
> EndType))".

Done.

> As a general note, most of the various expressions and initializers are
> full expressions, but you don't seem to be calling ActOnFinishFullExpr at
> all. That will likely manifest as a bug where temporaries created by
> those expressions won't be destroyed. Which ties in with...
>
> @@ -1175,15 +1201,19 @@
> if (Body.isInvalid()) return StmtError();
>
> -  if (!ForEach)
> -    return Actions.ActOnForStmt(ForLoc, LParenLoc, FirstPart.take(),
> SecondPart,
> -                                SecondVar, ThirdPart, RParenLoc,
> Body.take());
> +  if (ForEach)
> +    // FIXME: It isn't clear how to communicate the late destruction of
> +    // C++ temporaries used to create the collection.
>
>
> Although this FIXME is specifically written for the Objective-C for
> collection statement, it seems that the same issue affects the C++0x
> range-for.  How about gathering the set of temporaries still active after
> processing the range expression and placing them into the
> CXXForRangeStmt? CodeGen can then create a special cleanup scope for
> these temporaries.

There are quite a few different places where for-range creates
temporaries, and they have various different lifetime requirements. Most
of the temporaries were already being cleaned up by AddInitializerToDecl
(however, some code reordering was required to get the cleanup to pick the
right temporaries). The two remaining full-expressions, __begin != __end
and ++__begin, were both missing ActOnFinishFullExpr calls, which I've now
added.

I've also added a test to make sure we create and destroy the temporaries
at the same points as in the corresponding desugaring.

> It would be nice to have a test for the AST read/write of
> CXXForRangeStmts.

I've added such a test.

Please let me know what more needs doing!

Thanks,
Richard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-for-range-2.diff
Type: text/x-patch
Size: 103984 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20110414/802a1d55/attachment.bin>