[cfe-dev] PATCH: C++ Function Overloading (v2)

Tue Oct 21 08:22:38 PDT 2008

Hi Argiris,

On Tue, Oct 21, 2008 at 6:35 AM, Argiris Kirtzidis <akyrtzi at gmail.com> wrote:
> Doug Gregor wrote:
>> Templates will need to store sets of overloaded functions.
>>
>
> Even now, with no trace of a template implementation in clang, the use of
> OverloadedFunctionDecl is questionable:
>
> -First, consider this example:
>
> void f(char); // #1
>
> template<class T> void g(T t)
> {
>  f(t); // #2 //dependent
> }
>
> void f(int); // #3
>
> void h()
> {
>  g(2); // #4 //should cause a call of f(int)
> }
>
>
> At #2 'f' does not have any overloads and no OverloadedFunctionDecl exists,
> how should this be represented ?

It's just a FunctionDecl, because #1 is the only f. That's what we're doing now.

> If #2 is regarded as a call to #1, then we miss #3 at instantiation time
> (#4) (as specified in C++ 14.6p9).
> In order to pick up #3, should we turn 'f' into an OverloadedFunctionDecl,
> even if 'f' never gets an overload, just because it gets called in a
> templated expression ?

We aren't supposed to pick up #3. GCC (and most other compilers) gets
this wrong; EDG gets it right. You can try it by tweaking your example
code a bit:

char& f(char); // #1

template<class T> void g(T t)
{
  int& x = f(t); // #2 //dependent
}

int& f(int); // #3

void h()
{
  g(2); // #4 //should cause a call of f(char), so it's ill-formed
}

The relevant paragraph is C++ 14.6.4p1, which specifies which names
are considered when resolving dependent names. The set of names
contains the names we found via name lookup at template definition
time (#1) plus any names we find via argument-dependent lookup at
template definition time or at template instantiation time. This
latter rule means that if you tweak the example to use a user-defined
type rather than 'int' for #3 and for the call to g(), we will find
#3.
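
A minimal sketch of that last case, assuming a hypothetical user-defined
type X (not part of the original example). X converts to char so that #1
stays viable, and the functions return int only so the chosen overload can
be observed:

```cpp
#include <cassert>

// Hypothetical user-defined type; the conversion keeps #1 viable for X.
struct X {
  operator char() const { return 'x'; }
};

int f(char) { return 1; }  // #1: found by ordinary lookup at definition time

template <class T>
int g(T t) {
  return f(t);  // #2: dependent call
}

int f(X) { return 3; }  // #3: declared after the template

int call_g_with_x() {
  return g(X{});  // #4: argument-dependent lookup on X also finds #3
}

void demo_adl() {
  // #3 is an exact match; #1 would need a user-defined conversion.
  assert(call_g_with_x() == 3);
}
```

With 'int' in place of X there are no associated namespaces to search, so
only #1 is considered, as in the example above.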

> -Second, I see no reason why IdentifierResolver can't be made to eventually
> work with templates and be used to get the overload set for 'f' at #4 (if it
> turns out that there are overloads)
>
>> I can think of two kinds of clients that might want or need to deal
>> with OverloadedFunctionDecls.
>>
>> First of all, any clients that are focused on parsing and don't need
>> to do much semantic analysis could certainly skip overload resolution,
>> and would therefore leave the OverloadedFunctionDecls in place since
>> they don't need to be resolved.
>>
>> Second, clients that want to do some kind of speculative overload
>> resolution would need to query the overloads that show up in the
>> overload set. For example, an IDE that tries to help with overload
>> resolution by showing options while you type. Type "f(17, " and it
>> shows you all of the f's that could still be called. This client needs
>> to be able to ask the AST or Sema (it's not clear which) which "f"'s
>> are available, and then ask Sema which ones are still callable.
>>
>> None of these require the exact OverloadedFunctionDecl formulation in
>> my patch, but they are AST clients to consider.
>>
>
> There was a discussion here:
> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2008-September/002841.html
> about how clients should reason about symbols/identifiers. The consensus was
> that Sema builds up a lot of useful semantic information that it later
> discards. It would be useful to somehow expose semantic information from
> Sema and avoid bloating the AST with semantic information (as would be the
> case with OverloadedFunctionDecl).
> For example, I think that eventually IdentifierResolver will either be
> exposed by Sema to clients or a similar construct will be used. By your
> example, an IDE usually wants to inquire "I'm in this scope, what can this
> identifier mean in this context ? Give me a set of decls." which is exactly
> the functionality and purpose of IdentifierResolver.

Sure, although it's a slightly different form than what you state. We
need: "give me the set of decls that I would have seen at this
particular point in the translation."

>> The disadvantage of having the 'void*' opaque value is that it makes
>> ownership harder: how does this void* get deallocated, serialized, or
>> de-serialized?
>>
>
> I agree, the opaque value definitely complicates things. How about an
> OverloadExpr that contains a DeclContext* and an IdentifierInfo*

I like DeclContext* + IdentifierInfo*; it's small and efficient. I'm
not thrilled about OverloadExpr, because I think it says the wrong
thing: we really are talking about a set of overloaded function
declarations, and that should be described by a Decl node.

Now, to deal with the template case you provided above, we will have
to extend this DeclContext* + IdentifierInfo* scheme to restrict name
lookup to only those declarations we would have seen at the point in
the translation where the OverloadedFunctionDecl was created. That's
not terribly hard---we could just use some kind of token or sequence
number and filter based on that---but it's the kind of thing that
would be hard to do across translation units, because the
token/sequence number in one translation unit cannot be mapped to a
token/sequence number in another translation unit. So, while
DeclContext* + IdentifierInfo* is serializable, DeclContext* +
IdentifierInfo* + sequence number isn't serializable.
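
As a rough sketch of the sequence-number filter, assuming toy stand-ins
(ToyDecl, ToyContext, and ToyOverloadRef are hypothetical and are not
clang's DeclContext or IdentifierInfo):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration, tagged with its position in
// the translation.
struct ToyDecl {
  std::string name;
  std::string signature;
  unsigned seq;
};

// Hypothetical stand-in for a declaration context.
struct ToyContext {
  std::vector<ToyDecl> decls;
  unsigned next_seq = 0;

  // Record a declaration and return the sequence number it received.
  unsigned declare(const std::string& name, const std::string& signature) {
    decls.push_back(ToyDecl{name, signature, next_seq});
    return next_seq++;
  }
};

// The "context + identifier + sequence number" triple.
struct ToyOverloadRef {
  const ToyContext* context;
  std::string name;
  unsigned seq_limit;  // only declarations with seq < seq_limit are visible
};

// Return the signatures that were visible when the reference was created.
std::vector<std::string> visible_overloads(const ToyOverloadRef& ref) {
  std::vector<std::string> out;
  for (const ToyDecl& d : ref.context->decls)
    if (d.name == ref.name && d.seq < ref.seq_limit)
      out.push_back(d.signature);
  return out;
}

std::vector<std::string> demo_sequence_filter() {
  ToyContext tu;
  tu.declare("f", "void f(char)");                // #1
  ToyOverloadRef at_call{&tu, "f", tu.next_seq};  // #2: snapshot taken here
  tu.declare("f", "void f(int)");                 // #3: declared too late
  return visible_overloads(at_call);              // contains only #1
}
```

The seq_limit value only has meaning relative to the ToyContext that
issued it, which is the serialization problem in miniature.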

> (for the
> moment, not sure what exact structure would be needed for templates).

There will be (optional) template arguments (e.g., in a call like
"f<int, float>(blah)"), but in this case I think we can ignore those
in our initial design.

> To summarize, my point is that currently OverloadedFunctionDecl is an
> unnecessary overhead since IdentifierResolver already can be used to get the
> set of overloads. If later on (e.g. for templates), we find out that
> IdentifierResolver is not sufficient, we can always add
> OverloadedFunctionDecl too. At the moment it makes little sense to have them
> since the only expressions that are supposed to reference them (and not get
> immediately resolved) are expressions in templates, and there's a long way
> ahead before reaching the "template hill" :-)

As you know, a lot of my design principles involve making sure that
the thing we design now is still the right abstraction for the time
when we implement templates. I certainly don't mind doing refactoring
now or down the road, as having a clean, well-documented AST makes
that easy, but I also don't want to make a trade-off now that we know
will do the wrong thing when templates come along.

> A bit off-topic:
> Currently the iterator abstraction of IdentifierResolver is stretched thin
> and will probably snap when more name lookup rules are added. I lean towards
> removing the iterator and using a single function that returns the set of
> decls based on an identifier.

Iterators can be arbitrarily complex, so it's not that the iterator
abstraction *couldn't* handle this. The thing I like best about the
iterator abstraction is that it doesn't require us to allocate any
memory like returning a set of decls does, and doesn't expose any one
particular data structure. That said, the iterator itself will need to
get some more configurability (whether or not to follow using
directives, to name one option), and that might cost us too much in
iteration overhead.
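
To illustrate the no-allocation point, here is a small sketch of a
filtering iterator over a chain of declarations. ChainNode and
NameIterator are hypothetical names and do not reflect how
IdentifierResolver is actually implemented:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical declaration chain: each node links to the next one.
struct ChainNode {
  const char* name;
  const ChainNode* next;
};

// Walks the chain and stops only on nodes with the requested name.
// It holds two pointers and never allocates.
class NameIterator {
  const ChainNode* cur_;
  const char* name_;

  // Advance past nodes whose name does not match.
  void skip() {
    while (cur_ && std::strcmp(cur_->name, name_) != 0) cur_ = cur_->next;
  }

 public:
  NameIterator(const ChainNode* start, const char* name)
      : cur_(start), name_(name) {
    skip();
  }
  bool done() const { return cur_ == nullptr; }
  const ChainNode& operator*() const { return *cur_; }
  NameIterator& operator++() {
    cur_ = cur_->next;
    skip();
    return *this;
  }
};

// Count the declarations with the given name without building a set.
int count_named(const ChainNode* head, const char* name) {
  int n = 0;
  for (NameIterator it(head, name); !it.done(); ++it) ++n;
  return n;
}

int demo_iterator() {
  const ChainNode f1{"f", nullptr};
  const ChainNode g1{"g", &f1};
  const ChainNode f2{"f", &g1};
  return count_named(&f2, "f");  // two nodes named "f" in the chain
}
```

Any extra option, such as whether to follow using directives, would
presumably add a check inside skip(), which is where the per-step
overhead would show up.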

  - Doug