r178105 - [modules] Re-enable the "ambiguous expansion of macro" warning.

Wed Oct 30 18:41:19 PDT 2013

On Wed, Oct 30, 2013 at 5:32 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com>wrote:

>
> On Oct 28, 2013, at 1:13 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>
> On Mon, Oct 28, 2013 at 12:15 PM, Richard Smith <richard at metafoo.co.uk>wrote:
>
>> On Sat, Oct 26, 2013 at 10:14 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com>wrote:
>>
>>> As I said above, I think there's a misunderstanding here; to reiterate:
>>> when you say "I want to use the interface provided by M1. If I've not
>>> imported M2" this is case 1) I mentioned and I agree with what you said
>>> should happen. If this does not happen currently then it should be fixed.
>>>
>>> Should we focus on what should happen when importing *both* M1 and M2 ?
>>>
>>
>> Yes, I think refocusing will help. In particular, I think this is the key
>> question on which we have different views:
>>
>>
>>> How do I know that two modules are *related* ? One module may have
>>> imported the other but if it doesn't re-export it it might as well be an
>>> *unrelated* one from my perspective; why do I need to care what modules it
>>> imported internally ?
>>>
>>
>> There are (I think) two different ways that two modules can have a
>> conflict on what a macro name means:
>>
>> 1) Both modules want the name to mean something (which be mean "it's a
>> macro expanding to X" or might be "it's not a macro").
>> 2) One module wants to *override* what another module meant the name to
>> mean (M2 says "it expands to FOO BAR", and M1 says "actually, I override M2
>> and say it expands to BAZ")
>>
>> There are also two different ways that a module can express its intent
>> for a macro name to mean something:
>>
>> A) It can #define the name to something
>> B) It can #undef the name
>>
>> I think we agree that:
>>
>> 1A: If two modules want the same macro name to be defined to different
>> things, and that macro name is used, that is clearly ambiguous and should
>> be diagnosed.
>>
>> I'm on the fence about this one:
>>
>> 1B: If one module wants the macro name to be #defined to something, and
>> another module wants it to be undefined, currently Clang silently picks the
>> definition. I am not entirely convinced that's appropriate -- if my module
>> explicitly #undef's a macro, that's a statement of intent: in my interface,
>> I'm using that name to mean something that's not a macro. I don't think
>> it's obvious that:
>>
>>   #undef X
>>   #define X X
>>
>> and
>>
>>   #undef X
>>
>> should mean fundamentally different things here. Either way, I'm saying
>> "I own the macro name X and it produces the token sequence X."
>>
>
> Hmm, I think it's actually rather important that we diagnose this: on the
> upgrade path from #includes to modules, we'll get a silent change of
> behavior without this.
>
> If we do diagnose this case (where two imported modules disagree on a
> macro: one says it should be defined and the other says it should not),
> then I care a lot less about case 2. (It's enough for me that we diagnose
> this; we don't need to preserve the existing behavior.)
>
> That said, I think we should take as a guiding rule that modules and
> submodules should behave the same, and we currently violate that in the
> case of macros.
>
>
>> Case 2 is where we really seem to disagree. It's also a weird case for
>> modules, since we rarely have cases in which the interface of one module
>> includes "modify the interface of another module". Which brings me to your
>> next point:
>>
>>
>>> I'm suggesting that If I explicitly import 2 modules, and the modules'
>>> interfaces disagree on what an identifier means, then there is ambiguity.
>>> One module may or may not have imported the other module internally, but
>>> that doesn't change the fact that the modules' interfaces disagree on the
>>> meaning of an identifier.
>>>
>>
>> The key thing for me is, if one module imports another module, undefines
>> some part of it, then re-exports it, it seems pretty clear that it intends
>> to *modify* the interface of that first module. This happens in other
>> guises too: extending overload sets, specializing templates, shadowing a
>> tag name with a non-tag name, and so on. In all other cases, importing both
>> modules gives you the modified interface *with no ambiguity*. The ambiguity
>> has already been resolved by the dominating module saying how to resolve
>> it. And if I import a module that says that it modifies another module's
>> interface, I expect to get that modified interface.
>>
>> So: 2A) if M2 defines our macro, and M1 imports M2 and undefines and
>> redefines that macro, and either I import M1 or I import *both* M1 and M2,
>> I've expressed my intent to use M1's resolution to the conflict, and I
>> think I should get M1's definition.
>> 2B) if M2 defines our macro, and M1 imports M2 and undefines the macro,
>> and either I import M1 or I import both M1 and M2, I think the macro should
>> not be defined.
>>
>> By exporting a macro undefinition that undefines M2's macro, M1 is saying
>> "importing me undefines M2's macro definition". There's no ambiguity in
>> that, and if I import both modules, that's what I expect to happen. This
>> would behave in (largely) the same way we behave across submodules in the
>> same module, which makes the intuitive model simpler. If M1 didn't want
>> this behavior, it should not export the macro.
>>
>>
>> Let me make my proposal a bit more concrete, since I think I've been
>> vague up until now.
>>
>> Instead of storing a single flat list of macro definitions and undefines,
>> we store a DAG. The predecessors of each directive are the set of
>> directives that were visible at the point of the directive. (For a #define,
>> these must all be either #undefs or #defines to the same value.) When we
>> see a macro name being used, we take the set of visible leaves of this
>> graph, check they are consistent[1], and use that consistent value as the
>> value of the macro. Otherwise, we diagnose.
>>
>> [1] I'm not certain whether consistency should allow a mixture of #undefs
>> and #defines, but I'm inclined to think it should not.
>>
>
>
> Richard, thanks for the detailed explanation, I'm convinced that if a
> module undefines a macro of another module and re-exports it, the macro
> should be considered undefined even if I import both modules.
> I have another question, what if the module doesn't re-export the imported
> module ? Specifically, M1 imports M2, undefines a macro from M2, but then
> it doesn't re-export module M2 ? That seems like a sufficiently different
> case, to be considered separately.
>

Yes, I agree that that case needs separate consideration.

I think it's clear that:
- If you import only M1, the macro should not be defined
- If you import only M2, the macro should be defined

More interesting questions:
 - If you import M1 and M3 (where M3 defines the same macro), either the
macro should be defined or any use of it should be ambiguous. Which?
 - If you import M1 and M2, then either (1) macro is not defined, (2) macro
is defined, or (3) uses of macro are ambiguous might be reasonable. Which?

The user-facing model I'm currently thinking about is:
 * Preprocessor directives in modules are entities in their own right
(every definition and undefinition of a macro is a distinct entity).
 * These entities are *visible* if they are not private and the
corresponding (sub)module has been imported.
 * A #define X or #undef X directive *overrides* all definitions of X that
are visible at the point of the directive.
 * A #define or #undef directive is *active* if it is visible and no
visible directive overrides it.
 * If a macro name is used, the set of active directives for that macro
name must be consistent (either all #undefs, or all #defines defining the
macro name to the same value).

That gives the following answers to the above:

If M2 contained a #__private_macro directive for the macro name:
 - If you import M1 and M3, where M3 defines the same macro that M1
undefines, the macro is defined to M3's value (that is the only visible
directive).
 - If you import M1 and M2, the macro is defined to M2's value (that is the
only visible directive).
 - (For completeness) if you import M1, M2, and M3, any use of the macro is
ambiguous.
(That is, a #__private_macro undefinition has no effect.)

If M2 did not contain a #__private_macro directive for the macro name:
 - If you import M1 and M3, where M3 defines the same macro that M1
undefines, then any use of the macro is ambiguous (because M1 said "it's
not a macro", and M3 said "it's this macro").
 - If you import M1 and M2, then the macro is not defined (M2's definition
is not active, because M1's undefinition is visible and overrides it).
 - If you import M1, M2, and M3, this is the same as the first case. Any
use of the macro is ambiguous: the active directives are M1's undefinition
and M3's definition, and those conflict.

I think I'm comfortable with those answers. For case of importing both M1
and M2, another option would be to treat it like importing M1 and M3.
However, that leads to a situation where (M1 re-exports M2 and I import M1)
behaves differently from (M1 does not re-export M2 and I import both M1 and
M2), which seems surprising.

What do you think? Are you comfortable with the above answers, or do you
think we need to go in a different direction for this case?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131030/7ec142ae/attachment.html>