r178105 - [modules] Re-enable the "ambiguous expansion of macro" warning.

Argyrios Kyrtzidis akyrtzi at gmail.com
Wed Nov 13 09:20:27 PST 2013


On Nov 12, 2013, at 6:03 PM, Richard Smith <richard at metafoo.co.uk> wrote:

> On Tue, Nov 12, 2013 at 12:29 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
> 
> On Oct 30, 2013, at 6:41 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> 
>> On Wed, Oct 30, 2013 at 5:32 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>> 
>> On Oct 28, 2013, at 1:13 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> 
>>> On Mon, Oct 28, 2013 at 12:15 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On Sat, Oct 26, 2013 at 10:14 AM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>> As I said above, I think there's a misunderstanding here; to reiterate: when you say "I want to use the interface provided by M1. If I've not imported M2" this is case 1) I mentioned and I agree with what you said should happen. If this does not happen currently then it should be fixed.
>>> 
>>> Should we focus on what should happen when importing both M1 and M2 ?
>>> 
>>> Yes, I think refocusing will help. In particular, I think this is the key question on which we have different views:
>>>  
>>> How do I know that two modules are *related* ? One module may have imported the other but if it doesn't re-export it it might as well be an *unrelated* one from my perspective; why do I need to care what modules it imported internally ?
>>> 
>>> There are (I think) two different ways that two modules can have a conflict on what a macro name means:
>>> 
>>> 1) Both modules want the name to mean something (which be mean "it's a macro expanding to X" or might be "it's not a macro").
>>> 2) One module wants to *override* what another module meant the name to mean (M2 says "it expands to FOO BAR", and M1 says "actually, I override M2 and say it expands to BAZ")
>>> 
>>> There are also two different ways that a module can express its intent for a macro name to mean something:
>>> 
>>> A) It can #define the name to something
>>> B) It can #undef the name
>>> 
>>> I think we agree that:
>>> 
>>> 1A: If two modules want the same macro name to be defined to different things, and that macro name is used, that is clearly ambiguous and should be diagnosed.
>>> 
>>> I'm on the fence about this one:
>>> 
>>> 1B: If one module wants the macro name to be #defined to something, and another module wants it to be undefined, currently Clang silently picks the definition. I am not entirely convinced that's appropriate -- if my module explicitly #undef's a macro, that's a statement of intent: in my interface, I'm using that name to mean something that's not a macro. I don't think it's obvious that:
>>> 
>>>   #undef X
>>>   #define X X
>>> 
>>> and 
>>> 
>>>   #undef X
>>> 
>>> should mean fundamentally different things here. Either way, I'm saying "I own the macro name X and it produces the token sequence X."
>>> 
>>> Hmm, I think it's actually rather important that we diagnose this: on the upgrade path from #includes to modules, we'll get a silent change of behavior without this.
>>> 
>>> If we do diagnose this case (where two imported modules disagree on a macro: one says it should be defined and the other says it should not), then I care a lot less about case 2. (It's enough for me that we diagnose this; we don't need to preserve the existing behavior.)
>>> 
>>> That said, I think we should take as a guiding rule that modules and submodules should behave the same, and we currently violate that in the case of macros.
>>>  
>>> Case 2 is where we really seem to disagree. It's also a weird case for modules, since we rarely have cases in which the interface of one module includes "modify the interface of another module". Which brings me to your next point:
>>>  
>>> I'm suggesting that If I explicitly import 2 modules, and the modules' interfaces disagree on what an identifier means, then there is ambiguity.
>>> One module may or may not have imported the other module internally, but that doesn't change the fact that the modules' interfaces disagree on the meaning of an identifier.
>>> 
>>> The key thing for me is, if one module imports another module, undefines some part of it, then re-exports it, it seems pretty clear that it intends to *modify* the interface of that first module. This happens in other guises too: extending overload sets, specializing templates, shadowing a tag name with a non-tag name, and so on. In all other cases, importing both modules gives you the modified interface *with no ambiguity*. The ambiguity has already been resolved by the dominating module saying how to resolve it. And if I import a module that says that it modifies another module's interface, I expect to get that modified interface.
>>> 
>>> So: 2A) if M2 defines our macro, and M1 imports M2 and undefines and redefines that macro, and either I import M1 or I import *both* M1 and M2, I've expressed my intent to use M1's resolution to the conflict, and I think I should get M1's definition.
>>> 2B) if M2 defines our macro, and M1 imports M2 and undefines the macro, and either I import M1 or I import both M1 and M2, I think the macro should not be defined.
>>> 
>>> By exporting a macro undefinition that undefines M2's macro, M1 is saying "importing me undefines M2's macro definition". There's no ambiguity in that, and if I import both modules, that's what I expect to happen. This would behave in (largely) the same way we behave across submodules in the same module, which makes the intuitive model simpler. If M1 didn't want this behavior, it should not export the macro.
>>> 
>>> 
>>> Let me make my proposal a bit more concrete, since I think I've been vague up until now.
>>> 
>>> Instead of storing a single flat list of macro definitions and undefines, we store a DAG. The predecessors of each directive are the set of directives that were visible at the point of the directive. (For a #define, these must all be either #undefs or #defines to the same value.) When we see a macro name being used, we take the set of visible leaves of this graph, check they are consistent[1], and use that consistent value as the value of the macro. Otherwise, we diagnose.
>>> 
>>> [1] I'm not certain whether consistency should allow a mixture of #undefs and #defines, but I'm inclined to think it should not.
>>> 
>> 
>> Richard, thanks for the detailed explanation, I'm convinced that if a module undefines a macro of another module and re-exports it, the macro should be considered undefined even if I import both modules.
>> I have another question, what if the module doesn't re-export the imported module ? Specifically, M1 imports M2, undefines a macro from M2, but then it doesn't re-export module M2 ? That seems like a sufficiently different case, to be considered separately.
>> 
>> Yes, I agree that that case needs separate consideration.
>> 
>> I think it's clear that:
>> - If you import only M1, the macro should not be defined
>> - If you import only M2, the macro should be defined
>> 
>> More interesting questions:
>>  - If you import M1 and M3 (where M3 defines the same macro), either the macro should be defined or any use of it should be ambiguous. Which?
>>  - If you import M1 and M2, then either (1) macro is not defined, (2) macro is defined, or (3) uses of macro are ambiguous might be reasonable. Which?
>> 
>> 
>> The user-facing model I'm currently thinking about is:
>>  * Preprocessor directives in modules are entities in their own right (every definition and undefinition of a macro is a distinct entity).
>>  * These entities are visible if they are not private and the corresponding (sub)module has been imported.
>>  * A #define X or #undef X directive overrides all definitions of X that are visible at the point of the directive.
>>  * A #define or #undef directive is active if it is visible and no visible directive overrides it.
>>  * If a macro name is used, the set of active directives for that macro name must be consistent (either all #undefs, or all #defines defining the macro name to the same value).
> 
> Hi Richard, apologies for taking so long to get back to you.
> 
> I'd like to expand a bit on the user-facing model, particularly regarding a module that imports another module but does not re-export it (M1 imports M2 but does not re-export M2).
> What are the semantics here, is it:
> 	- User imports M1
> 	- M2 is not visible at all (hidden due to non getting exported)
> 	- The fact that M1 imports M2 is strictly an implementation detail, M1 can remove the import of M2 or change to importing M3 and there will be no noticeable change in the user's code (regardless of what modules the user imported).
> ?
> 
> To start off with, we're only talking about the case where M1 undefines M2's macro, and *exports* that undefine. I separately want to move us to a module where preprocessor directives are not exported by default, so (slightly longer-term) I want M1 to need to take explicit action for this to happen. In this case, M1 has exported that (1) the identifier in question is not a macro, and (2) in a conflict between this intent and M2's intent, M1's intent should win.
> 
> So: if M1 re-exports its undefine of the macro, and user code imports both M1 and M2, it will not see the macro. If M1 stops importing M2 before undefining the macro, then it's no longer undefining M2's macro, and the user code will now see an ambiguity between M1's exported undef and M2's exported define.
> 
> If M1 didn't re-export its undefine of the macro, then it's not part of M1's interface, and the preprocessing of the user code is entirely unaffected by it.
> 
> But in either case, if M1 imports M2, and user code imports M1, then M2 is part of the user's program, and the user's program can break if M1 switches from using M2 to using M3. This problem is not specific to macros: imagine both the user program and M3 define a global function named 'foo'. If you transitively include any part of a module, that module is part of your program.
> 
> Also, if M1 importing M2 is an implementation detail that doesn't "leak out" to the user of M1, what about if M1 exports a declaration using a type from M2 ?
> 
> There are a number of issues here, and I'm not sure which one concerns you.

I wanted to see how the macro questions fits with the rest of the user-model.

> The way this works at the moment is that you can implicitly use the declarations of things from M2, but you can't name them or use their definitions. So:
> 
> // M2
> #define s_n S_FIELD_NAME
> struct S { int s_n; };
> 
> // M1
> import M2; // and don't re-export
> 
> S *make();
> void take(S *p);
> void take_val(S s);
> 
> // user code
> import M1;
> 
> int s_n, foo; // ok, not a (visible) macro
> #define s_n foo // ok, no conflict
> int bar = s_n; // ok, assigns from foo
> #undef s_n // ok, undefines our s_n, not M2's
> 
> int main() {
>   take(make()); // ok
>   auto *p = make(); // ok
>   S *p = make(); // error, 'S' is not visible
>   take_val(*make()); // error, 'S::S' is not visible
>   return make()->s_n; // error, definition of 'S' is not visible
> }
> 
> import M2;
> int k = make()->s_n; // ok, S's definition is visible and M2's s_n macro is active now
> 
> It doesn't matter that S's field is not actually called 's_n', because S's definition is visible only when S is visible. The existence of the type 'S' leaks out of M1, but its name *mostly* does not. This will fail, though:
> 
> import M1;
> struct S {}; // error, multiple definitions of S.

Thanks again for the detailed explanation. I think that your proposed model for macros is consistent with the general model, and I have no concerns, thanks!

>> That gives the following answers to the above:
>> 
>> If M2 contained a #__private_macro directive for the macro name:
>>  - If you import M1 and M3, where M3 defines the same macro that M1 undefines, the macro is defined to M3's value (that is the only visible directive).
>>  - If you import M1 and M2, the macro is defined to M2's value (that is the only visible directive).
>>  - (For completeness) if you import M1, M2, and M3, any use of the macro is ambiguous.
>> (That is, a #__private_macro undefinition has no effect.)
>> 
>> If M2 did not contain a #__private_macro directive for the macro name:
>>  - If you import M1 and M3, where M3 defines the same macro that M1 undefines, then any use of the macro is ambiguous (because M1 said "it's not a macro", and M3 said "it's this macro").
>>  - If you import M1 and M2, then the macro is not defined (M2's definition is not active, because M1's undefinition is visible and overrides it).
>>  - If you import M1, M2, and M3, this is the same as the first case. Any use of the macro is ambiguous: the active directives are M1's undefinition and M3's definition, and those conflict.
>> 
>> I think I'm comfortable with those answers. For case of importing both M1 and M2, another option would be to treat it like importing M1 and M3. However, that leads to a situation where (M1 re-exports M2 and I import M1) behaves differently from (M1 does not re-export M2 and I import both M1 and M2), which seems surprising.
>> 
>> 
>> What do you think? Are you comfortable with the above answers, or do you think we need to go in a different direction for this case?
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20131113/636ac973/attachment.html>


More information about the cfe-commits mailing list