[PATCH] Add stopgap option -fmodule-implementation-of

Mon Jul 28 23:56:40 PDT 2014

On Mon, Jul 28, 2014 at 8:01 PM, Richard Smith <richard at metafoo.co.uk>
wrote:

> On Mon, Jul 28, 2014 at 6:25 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>>
>> On Jul 28, 2014, at 5:09 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Mon, Jul 28, 2014 at 2:05 PM, Ben Langmuir <blangmuir at apple.com>
>> wrote:
>>
>>>
>>> On Jul 24, 2014, at 6:58 PM, Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>>
>>> On Thu, Jul 24, 2014 at 7:56 AM, Ben Langmuir <blangmuir at apple.com>
>>> wrote:
>>>
>>>>
>>>> On Jul 16, 2014, at 3:42 PM, Richard Smith <richard at metafoo.co.uk>
>>>> wrote:
>>>>
>>>> On Fri, Jul 11, 2014 at 8:42 AM, Ben Langmuir <blangmuir at apple.com>
>>>> wrote:
>>>>
>>>>> Hey Richard,
>>>>>
>>>>> Sorry to take so long to reply to this, but I am still interested in
>>>>> getting this stopgap into tree.
>>>>>
>>>>
>>>> Sorry about the delay getting back to you!
>>>>
>>>>
>>>>> Please do not add a stopgap workaround to our stable and
>>>>> backwards-compatible driver interface; just add it to -cc1 instead.
>>>>>
>>>>>
>>>>>  Sure.
>>>>>
>>>>> I don't see any relation between the flag's name and its
>>>>> functionality; there seems to be no reason for this to be linked to the
>>>>> translation unit being the implementation of any particular module (and if
>>>>> there were, that's what -fmodule-name is for). Instead, I think what you're
>>>>> trying to specify is that a particular module is included textually for
>>>>> this compilation. Please pick a name that suggests that functionality
>>>>> instead.
>>>>>
>>>>>
>>>>> In the abstract I agree with this, but the use case I have is only for
>>>>> TUs that are implementation files for a module and I know that is the only
>>>>> time that this flag will be used by our tools.  It is more useful for the
>>>>> diagnostic to say “don’t do this in the implementation of module Foo”,
>>>>> since that matches when the build system will be passing in this flag.
>>>>>  Given that this doesn’t go into the driver, is this still an issue? If
>>>>> not, I can update and commit this patch, or can post it again for review if
>>>>> you prefer :-)
>>>>>
>>>>
>>>> I'm fine with this as a short-term cc1-only flag. Longer-term I think
>>>> we need to evaluate whether we can make the import-of-same-module cases
>>>> "just work" (I think we can), and I hope this becomes unnecessary at that
>>>> point.
>>>>
>>>>
>>>> r213767
>>>>
>>>>
>>>>> >>> What’s unexpected to me is that changing a header whose contents
>>>>> are not usually visible may still require rebuilding all of my .cpp files.
>>>>> >>> module Foo { module One { header "One.h" } module Two { header
>>>>> "Two.h" } }
>>>>> >>>
>>>>> >>> // One.cpp - I don’t want to rebuild when Two.h changes
>>>>> >>> #import <Foo/One.h>
>>>>> >>>
>>>>> >>> Do we agree that this is unnecessary if submodules cannot
>>>>> accidentally be affected by changes in other submodules they don’t import
>>>>> (and we have some way to get the set of dependency files for just the
>>>>> submodule)?
>>>>> >>
>>>>> >>
>>>>> >> No, I don't agree with that. One.cpp might inline some function
>>>>> definitions from Two.h, for instance. Or it might fail to build because it
>>>>> declares something that conflicts with something in Two.h.
>>>>> >
>>>>> >
>>>>> > I feel like I’m missing something - how is that different from
>>>>> One.cpp having conflicts with some completely different header or module
>>>>> that is not imported into that particular TU?
>>>>>
>>>>> If you import any part of a module, you have the whole module as part
>>>>> of your translation unit, even though only some of it might be visible.
>>>>> Thus we will diagnose your declarations that conflict with unimported
>>>>> portions of an imported module.
>>>>>
>>>>> Maybe we need to have this discussion on cfe-dev at some point.  I
>>>>> think we need a driver flag to control whether clang reports headers from
>>>>> unimported submodules as dependencies, which will allow users/build systems
>>>>> to make the tradeoff.  As for the default, I strongly feel we shouldn't
>>>>> penalize build performance for correct code in order to guarantee that
>>>>> these particular ODR violations get diagnosed in incremental builds.  A
>>>>> full rebuild will still see any diagnostics and the subset of errors that
>>>>> this affects are not being diagnosed today with headers, so we’re still
>>>>> improving.
>>>>>
>>>>
>>>> Conversely, I think that we should provide a guarantee that incremental
>>>> and full builds produce bit-for-bit identical results. As you say, it's a
>>>> tradeoff, but note that this isn't just about ODR violation checking -- the
>>>> incremental approach you're suggesting can generate wrong code in some
>>>> cases (we can inline a function definition from the old version of Two.h)
>>>> -- so if we want to support this partial-rebuild mode, we'll need to be
>>>> /very/ careful that we don't pull in any information from an unimported
>>>> submodule in that mode.
>>>>
>>>>
>>>> Maybe you can help me understand how this would come about.  In our
>>>> documentation we say:
>>>>
>>>> Modules are modeled as if each submodule were a separate translation
>>>> unit, and a module import makes names from the other translation unit
>>>> visible
>>>>
>>>>
>>>> Here’s my understanding:
>>>> If I don’t import the submodule containing “Two.h”, then I shouldn’t
>>>> get its definitions in my TU.
>>>>
>>>
>>> You get its definitions in your *program*. If you import any part of a
>>> module, the entire module is part of your program. Example:
>>>
>>>
>>> Okay, but that’s just more consistency checking, isn’t it?  If I import
>>> Module1.B, but not Module1.A (or Module2.C) I don’t want to see “f” in my
>>> exported symbols.
>>>
>>
>> I think you're saying that it would in principle be possible for us to
>> accept the example I gave? It probably would, but the fact that we reject
>> it right now is a feature, not a bug.
>>
>>
>> Agreed, although I think we weigh its benefit vs incremental building
>> differently.
>>
>>> Module1.A:
>>> int f(int);
>>>
>>> Module1.B:
>>> extern int n;
>>>
>>> Module2.C:
>>> import Module1.B;
>>> void f(int); // error, conflicting return type
>>>
>>>> If I have an inline declaration for a function in Two, then I still need
>>>> to have a definition in my own TU because of inline.  If I have a
>>>> non-inline decl, then Two can’t have an inline decl and if it has a
>>>> definition for the function not marked inline then having that definition
>>>> show up in my TU would lead to multiple definitions if Two is imported
>>>> somewhere else.
>>>>
>>>
>>> You can get into this situation with C++ templates. You might only be
>>> able to see a declaration of a template, where another submodule provides a
>>> definition that is hidden but still available for inlining. This doesn't
>>> violate any language rule as long as there's an explicit instantiation of
>>> the template somewhere.
>>>
>>>
>>> If I don’t see a definition in my TU, how can I use the template in a
>>> way affected by inlining?
>>>
>>
>> You do "see" a definition in your TU, for some value of "see". That
>> definition *is* imported, and is known about by the compiler; we just give
>> you an error if you try to use it. CodeGen is still able to emit it. This
>> is necessary to support entities that are imported by a module but not
>> re-exported.
>>
>>
>> Consider this:
>>
>> Module X:
>>   inline int f() { return 0; }
>> Module Y:
>>   import X; // not re-exported
>>   inline int g() { return f(); }
>> Z.cc:
>>   import Y;
>>   int k = g();
>>
>> In Z.cc, we are *required* to emit the body of 'f', even though you
>> can't "see" it.
>>
>>
>> Okay, that makes sense.  This is certainly something we would need to
>> account for to do safe incremental rebuilding.  I think the right answer is
>> to make sure that the transitive imports get included in the reported
>> dependencies regardless of being re-exported.
>>
>> And entities in X are treated just like entities in an unimported
>> submodule of Y.
>>
>>
>> Ah.  This seems like an accident of the implementation rather than a
>> desirable property.  We have two distinct cases:
>>
>> 1) A imports B, and B is not re-exported.  B’s headers are still
>> dependencies for our TU even though they aren’t visible.
>> 2) A has submodules B and C.  Importing A.B does not create a dependency
>> on A.C or vice versa.
>>
>
> I think you mean "should not" rather than "does not" here: under the
> current implementation, it certainly does, in that the contents of A.C can
> affect whether a user of A.B builds today. Even then (as you note above) we
> have a trade-off here; there are benefits to having that dependency.
>
>>> I may not have an instantiation of a template, but I still need to see
>>> its definition.  If its definition changes, that would require rebuilding
>>> the other TU that has the instantiation.  I’m probably being thick, but I
>>> still don’t see the issue here.
>>>
>>>
>>> You can also get into this situation with the C99 inline rules, where
>>> you don't have to define an 'inline' function in every translation unit.
>>>
>>>
>>> Did this change in C11, or am I misreading this?
>>> 6.7.4.7: For a function with external linkage, the following
>>> restrictions apply: If a function is declared with an inline function
>>> specifier, then it shall also be defined in the same translation unit.
>>>
>>
>> That rule applies only if the function is declared with the 'inline'
>> specifier in that translation unit. Example:
>>
>> Module X.A:
>>   extern int f(void); // ok, no 'inline', no definition required in this
>> TU
>> Module X.B:
>>   inline int f() { return 0; } // ok, definition
>> main.cc:
>>   import X.A;
>>   int main() { return f(); }
>>
>> In this setup, f() might get inlined into main, even though the
>> definition is not visible. (FWIW, I expect we'll also generate wrong code
>> in this case, because we'll emit a strong definition of 'f' from every TU
>> that imports X; conversely, if X.A and X.B are split into separate
>> top-level modules, then a TU that imports both will not emit a strong
>> definition of 'f'.)
>>
>>
>> I don’t think this is a good idea at all.  I’m okay with saying that
>> you’re not allowed to have conflicting submodules, but having them create
>> implicit dependencies like this violates my mental model for semantic
>> import.  I would much prefer that X.A and X.B behave the same as top-level
>> modules (except that importing X might implicitly pull in A and/or B), and
>> I think that would be much less surprising.
>>
>
> I used to think the same thing, but I don't any more. I think there is
> value in being able to say that a collection of submodules together forms
> some coherent, logically-indivisible whole (call it a "library", maybe?),
> where the submodules just provide visibility control over the pieces of
> that library. Right now, we also couple that to two other things: the
> identity of the "library", and the .pcm file structure, are both determined
> by the top-level module name. I'm not convinced that's a good idea -- there
> are certainly cases where it makes sense to have more granularity than that.
>
> If we could decouple this "same library" / "same .pcm file" decision from
> the top-level module name, so that you could say "X.A and X.B are
> notionally separate (and live in distinct libraries / .pcm files)", would
> that address your concern?
>

I asked something more specific than what I really wanted to know here. In
Clang's current implementation, the top-level module that contains a given
module affects a lot of things. In your X.A / X.B example, which properties
do you want? Off the top of my head:

 1. X.A and X.B are placed into the same .pcm file
  1a. That .pcm file doesn't contain any other top-level module
 2. X.A and X.B are both part of any TU / program that uses either of them
 3. X.A and X.B have names starting with the same prefix
 4. X.A and X.B are notionally in the same "layer", so there's no need to
think about dependency cycles with other modules
 5. X.A and X.B are always built together

(FWIW, I don't think it makes sense for all these things to be tied to the
choice of top-level module name.)
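
To make property 2 concrete, here is a rough sketch of which headers a TU ends up depending on under the two models discussed in this thread. This is not Clang code: the `Submodule` struct, the `Policy` enum, the `dependencies` function and the extra module `W` (imported by X.A) are all made up for illustration. It assumes that imports are always followed transitively, re-exported or not, and that the only open question is whether sibling submodules under the same top-level module count as well.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a (sub)module; none of these names come from Clang.
struct Submodule {
    std::string top_level;             // owning top-level module, e.g. "X"
    std::vector<std::string> headers;  // headers covered by this submodule
    std::vector<std::string> imports;  // full names of the modules it imports
};

using ModuleGraph = std::map<std::string, Submodule>;

enum class Policy {
    WholeTopLevel,  // siblings under the same top-level module are part of the TU
    SubmoduleOnly   // siblings are independent unless explicitly imported
};

// Collect the headers a TU depends on once it imports `name`.
void collect(const ModuleGraph& g, const std::string& name, Policy p,
             std::set<std::string>& visited, std::set<std::string>& deps) {
    if (!visited.insert(name).second) return;  // already handled
    auto it = g.find(name);
    if (it == g.end()) return;                 // unknown module: nothing to add
    const Submodule& m = it->second;
    deps.insert(m.headers.begin(), m.headers.end());
    // Imports are followed whether or not they are re-exported.
    for (const auto& imp : m.imports) collect(g, imp, p, visited, deps);
    if (p == Policy::WholeTopLevel) {
        // Pull in every sibling that shares the top-level module.
        for (const auto& entry : g)
            if (entry.second.top_level == m.top_level)
                collect(g, entry.first, p, visited, deps);
    }
}

std::set<std::string> dependencies(const ModuleGraph& g,
                                   const std::string& name, Policy p) {
    std::set<std::string> visited, deps;
    collect(g, name, p, visited, deps);
    return deps;
}

// X.A imports W (an assumed extra module); X.B imports nothing.
ModuleGraph example_graph() {
    return {
        {"X.A", Submodule{"X", {"A.h"}, {"W"}}},
        {"X.B", Submodule{"X", {"B.h"}, {}}},
        {"W",   Submodule{"W", {"W.h"}, {}}},
    };
}
```

With `Policy::WholeTopLevel`, a TU that imports X.A depends on A.h, B.h and W.h; with `Policy::SubmoduleOnly` it depends on A.h and W.h only, so a change to B.h would not force a rebuild.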

> Another point that seems relevant is that implicit module builds are a bad
> idea in a lot of situations. They don't distribute well, they rely on
> side-channels for sharing module files, they break existing build system
> assumptions, they require multiple compile actions to block waiting for
> each other, and so on. A better approach, which we should be encouraging
> people to use, is to make the module build step explicit in the build
> system. Once we treat "building a module" as a build step with its own
> dependencies (which is in turn depended on by downstream .cpp and module
> builds), this incremental rebuild approach becomes rather problematic.
>
> Finally, a point I've raised before is that hermetic builds are important
> to a lot of people: for build reproducibility, cacheability, and so on,
> it's important that your build does *not* depend on the path of builds you
> did previously.
>
> Both these points would be addressed by splitting your X.A and X.B builds
> up so they built separate .pcm files.
>
>> What happens when I provide an incompatible external definition of “f()” in
>> another TU?  We can’t diagnose the conflict
>>
>
> There is no conflict; the C standard says that the implementation gets to
> pick whichever one it likes.
>
> Eventually, I'd like for us to include some IR (representing inline
> function definitions and so on) in the module file, to remove the cost of
> repeatedly generating IR for inline functions within modules. I don't think
> we want the complexity of segregating that IR on the basis of frontend name
> visibility rules.
>
>> and we will be calling the inline definition from a module we didn’t
>> import (from the user’s perspective).  Seems at least as bad as the other
>> conflicts we’ve talked about :-)
>>
>> If you actually want the inlining, just make the inline definition
>> visible, or turn on LTO.
>>
>
> Conversely, if you actually want separate entities from a dependency point
> of view, just make different module files for them.
>