[PATCH] Allow multiple modules with the same name to coexist in the module cache

Tue Apr 1 15:32:08 PDT 2014

On Tue, Apr 1, 2014 at 12:59 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:

>
> On Apr 1, 2014, at 12:03 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>
> On Mon, Mar 31, 2014 at 8:35 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:
>
>>
>> On Mar 31, 2014, at 6:00 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Mon, Mar 31, 2014 at 5:25 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:
>>
>>>
>>> On Mar 31, 2014, at 4:44 PM, Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>>
>>> On Mon, Mar 31, 2014 at 4:23 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com
>>> > wrote:
>>>
>>>> >
>>>> > What I'm suggesting is:
>>>> >
>>>> > 1) Drop the -I paths that are earlier than the module in the header
>>>> search path when building the module
>>>> > 2) Include the rest of the header search paths in the configuration
>>>> hash for the module
>>>>
>>>> This will prevent us from sharing system modules across projects, and,
>>>> as Ben already mentioned, will result in an explosion of module files, even
>>>> within the same project.
>>>>
>>>
>>> Why? System modules would only have system include paths, and these
>>> would usually be the same across projects, right?
>>>
>>>
>>> I see what you mean, but we also need to support the case where the
>>> system framework developer can point to the framework in his local
>>> directory to be used, instead of the system one, how do you suggest we
>>> allow this ?
>>>
>>
>> I'd like for a module to be able to specify its own include paths. If a
>> module does so, the include paths of the host build should not be included
>> in its hash.
>>
>>
>> That didn't quite address my question, let me be more specific. We
>> generally need to allow, as an example, system AppKit to import a
>> user-local Foundation; neither system AppKit specifying its own include
>> paths, nor looking up module dependencies of AppKit only in system include
>> paths, allow this.
>>
>
> That's a curious use case.
>
> It also seems to be compatible with the approach I suggested above, I
> think? You need to put your header search paths in the right order (so the
> lower layers of your system are below the higher layers), but that's
> generally desirable regardless. (We have a limitation that -isystem paths
> are considered after -I paths, which might get in the way here, but that's
> artificial and in conflict with what you're trying to do.)
>
>
> What is the right order ?
>
> If the order is:
>
> -I /local/path -isystem /System/Frameworks
>
> and we drop -I paths earlier than the module then module import from
> AppKit doesn't consider -I /local/path.
>
> If the order is:
>
> -isystem /System/Frameworks -I /local/path
>
> then module import from AppKit finds Foundation in /System/Frameworks and
> doesn't consider -I /local/path.
>
>
> Alternatively, this seems like a good use case motivating putting the
> include path into the build configuration. The approach in this patch is a
> bad solution for this use case, because the AppKit modules for different
> Foundations will collide in the cache; any of the other solutions we've
> discussed seem to handle this better.
>
>
> If we don't drop -I paths earlier than the module then putting the '-I'
> paths in the build configuration will cause module explosion.
> If we drop -I paths earlier than the module then it doesn't address the
> above case and the worse thing is that you don't anymore have one module
> hash cache for the whole translation unit anymore, which creates
> complications (e.g. Ben already mentioned the global module index).
>
> We could maybe do something in-between, for example introduce an option
> for search paths specific for module lookups. Normally you wouldn't use
> them but if you did use them then they would go in the build configuration
> and would be considered first for module lookups. Then we could always do
> the "drop -I paths earlier than the module".
>
>
>   Even if we include the search paths in the module hash we will still
>>>> need to re-lookup the module dependencies before loading the module,
>>>> because a new module.map may have showed up somewhere in the search paths
>>>> since the time we built the module; unless I'm missing something, I don't
>>>> see any benefit in including the header search paths in the hash.
>>>
>>>
>>> The hash should include everything that affects the way the module is
>>> built. If we're transferring include paths from the user of the module to
>>> the module build, the hash should include those paths, or two modules with
>>> different search paths could collide in the cache.
>>>
>>>
>>> The collision is avoided by using the path to the module.map as Ben
>>> proposes; if different search paths resolve to different module
>>> dependencies, this is a case where we need to rebuild, but the same is true
>>> if the same paths are used but a different module.map shows up as
>>> dependency.
>>>
>>
>> I don't see how that's sufficient -- the include paths can also affect
>> how non-modular #includes within the module behave.
>>
>> As a somewhat-contrived example, suppose I have a library
>>
>>   blah/
>>     module.map
>>     foo.h
>>     mode1/
>>       foo-impl.h
>>     mode2/
>>       foo-impl.h
>>
>> ... where foo-impl.h is textually-included into foo.h, and this is
>> supposed to be used in two modes: one with a -I path pointing to mode1, and
>> one with a -I path pointing to mode2. We should ensure those two modules
>> don't collide in the cache. (Maybe mode1 and mode2 provide inline assembly
>> for different CPU variants, or such.)
>>
>>
>> Non-modular includes are problematic beyond just how to lookup the
>> headers during module building. The major problem they introduce is that
>> those same non-modular headers can be textually #included in the
>> translation unit, before or after the module import that contains the same
>> header contents, causing multiple definitions or other semantic issues.
>> That's why we tend to prefer the clear semantic model that modules depend
>> only in the headers that the module is comprised of, or other modules, and
>> we want to encourage people to modularize their headers.
>>
>
> First, many non-modular includes are *intended* to be repeatedly textually
> included, and your argument does not apply to those cases (consider Clang's
> .def files, for instance). Some headers are implementation details of other
> headers and are only meant to be included once, into one place; those are
> similarly unaffected.
>
>
> We already have the syntax in the module map for these; (I think) the
> syntax is "exclude <header>". This is the "escape hatch" to be explicit
> about what non-modular, not-part-of-the-module, headers the module will
> include, and subsequently it is your responsibility to make sure that
> <header> is not textually included in the translation unit as well, or if
> it is that it will not cause issues.
>
>
> Second, the problem you describe is mostly an artifact of our broken
> implementation of macro and declaration visibility across submodules of the
> same top-level module. Names in earlier-built submodule are visible in
> *all* later-built submodules, even those that don't import the earlier ones
> (and @imports in earlier submodules naming later ones have no effect). Once
> we fix that, the requirement to modularize bottom-up (and to "fix" the
> guarded typedefs in your system headers) mostly goes away -- it's generally
> fine to have the same entity declared in two different places, and we'll
> merge together those declarations if they're both visible.
>
>
> Are you saying that if I have:
>
> struct Foo {
> <stuff>
> };
>
> @import Blah
>
> and module 'Blah' has a definition for 'struct Foo' we will merge them
> together ?
>

It depends on the language.

In C, the two definitions for 'Foo' are compatible types if they have
matching bodies, so they can be used interchangeably in that case and not
otherwise. We don't yet support this, because our compatible type code
doesn't know about modules, and this doesn't come up in non-modules code.

In C++, the two definitions for 'Foo' are required to be equivalent under
the ODR, and we will merge them (and to a limited degree, diagnose them if
they differ). This already basically works.

I think we need to fix this submodule visibility problem regardless (the
> current state of play is extremely fragile to changes in the order we
> happen to build the submodules), so I'm not convinced that this is a
> long-term problem. (Nonetheless, I agree that having modularized
> dependencies is generally a good thing.)
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140401/d3a1308b/attachment.html>