[cfe-dev] RFC: Supporting private module maps for non-framework headers

Ben Langmuir blangmuir at apple.com
Tue Dec 2 13:34:26 PST 2014


> On Nov 24, 2014, at 2:14 PM, Vassil Vassilev <vvasilev at cern.ch> wrote:
> 
> On 14/11/14 18:57, Ben Langmuir wrote:
>> 
>>> On Nov 13, 2014, at 12:36 PM, Vassil Vassilev <vvasilev at cern.ch> wrote:
>>> 
>>> On 12/11/14 23:37, Richard Smith wrote:
>>>> On Wed, Nov 12, 2014 at 12:48 PM, Vassil Vassilev <vvasilev at cern.ch> wrote:
>>>> On 11/11/14 04:48, Richard Smith wrote:
>>>>> On Mon, Nov 10, 2014 at 4:00 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
>>>>> Hi all,
>>>>> 
>>>>> For frameworks, Clang currently supports adding a separate module map file for the private headers of the framework. It looks specifically for the presence of ‘module.private.modulemap’ inside the .framework and parses both the public and the private module maps when it processes its module. We would like to extend support for private module maps to non-framework headers as well.
>>>>> 
>>>>> On the Darwin platform, the public SDK headers are located in '/usr/include', while the associated private SDK headers are located in '/usr/local/include'. '/usr/local/include' comes before '/usr/include' in the header search paths.
>>>>> 
>>>>> I worry that this will be fragile. If for any reason we look in /usr/include but not in /usr/local/include, we'll not load the private extension map and things will probably go quite badly from that point onwards. If the presence of the /usr/local/include headers is a fundamental part of a /usr/include module, then it seems better to me to specify that within the /usr/include module map.
>>>>> 
>>>>> So here's one possibility: allow 'extern module' declarations to be nested within other modules, then write your /usr/include module map as:
>>>>> 
>>>>> module MyModule {
>>>>>   <...>
>>>>>   extern module SomethingPrivate "/usr/local/include/module.private.map"
>>>>> }
>>>> Maybe off topic (sorry if I misunderstood): would that 'somehow' allow placing a modulemap outside the /usr folder? (For cases like gcc's libstdc++).
>>>> 
>>>> There are a few related problems with this. One is that we need to be able to map from a #included file's name to the module map file, if we're loading that module map lazily. Another is that files named in a module map file are found relative to that file.
>>>> 
>>>> We can solve the first problem with -fmodule-map-file=<libstdc++ module map>. For the second half, I've been discussing with a few people the idea of allowing a module map file to specify a "module root" directory relative to which its files are found, which need not be the directory in which the map is placed. (This also helps with another problem: diagnostics when building or using a module point to files relative to the module map file, which can result in some rather contorted and unnatural paths.)
>> 
> Sorry for the delay... :(

Heh, my turn.

>> I guess the minimum viable solution would be something like -fmodule-map-file-with-root=<module map file>,/path/to/module/root.  This has the advantage that your module map file doesn’t tie itself to a particular directory.
> Yes that would work for me. This would be better than my original proposal because the build system can expand the path to the module's root, giving some extra flexibility.
>> 
>>> Thanks for the pointers. This makes sense. Would I be able to specify, in a framework's directory, modulemaps for an external dependency? In my particular case, I'd like to be able to express that a given modulemap describes an external dependency.
>>> 
>>> I was wondering whether we could accommodate more than one modulemap per file. Say:
>>> cat module.modulemap:
>>> modulemap Map1 {
>>>   module M1{}
>>>   ...
>>> }
>>> modulemap Map2 {
>>>   modulemap_root /usr/include // Will use the virtual file system pretending the modulemap was found at the modulemap_root
>>>   module N1{}
>>>   ...
>>> }
>> 
>> In this scheme, I would make your hypothetical root declaration part of the module, not the module map file, i.e.
>> 
>> module M1 { }
>> module N1 {
>>   module_root "/usr/include"
>> }
>> 
>> That might result in some duplication if you wanted to describe a lot of modules in a single directory from an external location, but it seems cleaner to me than a modulemap { } syntax.
> Yes, this duplication is what I was trying to avoid. With the modulemap {} syntax I wanted to argue for a way of specifying more than one modulemap per file.
> The idea is to provide more than one 'view' of the headers describing the libraries and exposing their interfaces (see below).
>> 
>>> 
>>> IMO this would allow the 'external dependencies' to be organized in different configurations. For example, a module per header or a bunch of headers per module, whichever the framework decides fits best. For our use-cases that would be great. Maybe this could also simplify cross-referencing modules and visibility...
>> 
>> Not sure what you mean here.  Would you mind expanding on this a bit?
> For our case, where we interpret C++, we have a mapping from header files to .so files. Whenever the user dlopen-s a shared library (at runtime), we load the set of header files corresponding to this library, so that the user can, for example, call library functions. We want to replace the headers and our own concept of modulemaps with clang's. I.e., instead of header files we will use C++ modules, and the mapping will be expressed with modulemaps. I call these 'runtime' modules.
> 
> The library writer can decide to combine libA.so and libB.so into libC.so, providing a module for it. Even worse (but rare), he can decide to hide some stuff from libB.so (i.e., things that he considers private from libC's standpoint), and libC should not provide a module/header for them. Would that be possible with the current implementation of modulemaps?

The module for C can choose not to export everything it imports.  That happens at the module level though, not at the header level.

module A { … }
module B { module X { … } module Y { … } }
module C {
   …
   export A 
   export B.Y
}

In this example, C will export A and B.Y, but not B.X.
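
To make the visibility rule concrete, here is a toy Python model of re-export. It is only a sketch, not how Clang implements it: the MODULES table, the "imports"/"exports" keys and the visible_from helper are made up for illustration, and I am assuming C imports A, B.X and B.Y.

```python
# Toy model of module re-export. This is NOT Clang's implementation;
# every name here (MODULES, visible_from) is hypothetical.
MODULES = {
    "A":   {"imports": set(), "exports": set()},
    "B.X": {"imports": set(), "exports": set()},
    "B.Y": {"imports": set(), "exports": set()},
    # Assumption: C imports all three, but re-exports only A and B.Y.
    "C":   {"imports": {"A", "B.X", "B.Y"}, "exports": {"A", "B.Y"}},
}


def visible_from(name, modules=MODULES):
    """Return the set of modules visible to someone who imports `name`."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        entry = modules[current]
        # Only modules that are both imported and listed in an export
        # declaration are passed on to the importer.
        stack.extend(entry["exports"] & entry["imports"])
    return seen


print(sorted(visible_from("C")))  # ['A', 'B.Y', 'C']
```

B.X is still used to build C in this model; it just is not visible to C's importers.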


> In a way I thought it is similar to private module maps... I hope it is a bit clearer now. I do realize that the current implementation is a bit rough and will mature with time, and I thought this was a good time to throw our use-case in while we are talking about private module maps.
> 
> Vassil
>> 
>>>> 
>>>> Vassil
>>>>> 
>>>>> (in addition to the other changes you suggest here). Then only allow a module to be extended if the extension is listed via an 'extern module' in the definition of the module.
>>>>> 
>>>>> We propose to make the following changes to Clang’s module mechanism:
>>>>> 
>>>>> - When looking up a module through the search paths, in addition to ‘module.modulemap’ also look for a standalone ‘module.private.modulemap’ file. I will refer to this as the "private extension" module map.
>>>>> - When parsing a private extension map, allow extending a module that was not defined before, without providing the full definition. To clarify, I refer to a module definition as this:
>>>>> 
>>>>> module MyModule {
>>>>>  <…>
>>>>> }
>>>>> 
>>>>> while an extension is this:
>>>>> 
>>>>> module MyModule.SomethingPrivate {
>>>>>  <…>
>>>>> }
>>>>> 
>>>>> An extension is a nested module at any depth.
>>>>> We can reuse the “extern module” syntax to indicate that we are extending a module whose definition is in a different module map:
>>>>> 
>>>>> extern module MyModule
>>>>> module MyModule.SomethingPrivate {
>>>>>  <…>
>>>>> }
>>>>> 
>>>>> - After parsing the private extension map, we are still missing the module definition, so module lookup will continue looking in the following header search paths. If the module we are looking for is not found, then Clang will emit a “module not found” error.
>>>>> 
>>>>> - It may seem backwards that module search will find and parse the private extension ahead of the public one, but it is actually advantageous because this allows us to continue searching only until we find the module definition, at which point we will stop looking. If module search worked the other way then, after we had the module definition, we would need to always keep looking through the rest of the search paths in case there is a private extension map that we need to take into account, or treat certain paths specially and only look for private extensions in those.
>>>>> By finding the extension map early on, we keep the current semantics of doing the minimal search necessary to find and complete the module definition, without treating any particular search path specially.
>>>>> 
>>>>> - After Clang finds and parses the public module map for ‘MyModule’, the module definition will be complete. Clang will keep track that there is a private extension map associated with the module and it will pass the paths of both the public module map and the private extension one to the module building invocation. This will result in one module file containing both the public and private APIs, similar to what we do with frameworks.
>>>>> 
>>>>> - A module definition inside a private extension will be disallowed. The rationale is that otherwise it will be a very common mistake for users to write
>>>>> 
>>>>> module.modulemap:
>>>>> module Foo {
>>>>>   <public headers>
>>>>> }
>>>>> 
>>>>> module.private.modulemap:
>>>>> module Foo {
>>>>>   <private headers>
>>>>> }
>>>>> 
>>>>> and then be left scratching their heads wondering why things are broken (things missing, headers included textually, etc.). Being more strict in private extension maps will be beneficial.
>>>>> 
>>>>> 
>>>>> Let me know what you think!
>>>>> 
>>>>> 
>>>>> 
>>>>>  _______________________________________________
>>>>> cfe-dev mailing list
>>>>> cfe-dev at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 
> -- 
> --------------------------------------------
> Q: Why is this email five sentences or less?
> A: http://five.sentenc.es
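
For reference, the lookup order proposed in the original RFC quoted above (remember private extension maps while walking the search paths, stop at the first module definition, error out if none is found) could be sketched roughly like this. It is a toy Python model, not Clang code: find_module, ModuleNotFound and the `contents` table standing in for the file system are all hypothetical names.

```python
# Toy model of the proposed lookup order. NOT Clang code; all names
# (ModuleNotFound, find_module, contents) are made up for illustration.
class ModuleNotFound(Exception):
    pass


def find_module(name, search_paths, contents):
    """Walk search_paths in order and locate the definition of `name`.

    `contents` maps a directory to a dict of module map file names, each
    mapped to the set of top-level modules that file defines (for
    'module.modulemap') or extends (for 'module.private.modulemap').
    Returns (public_map_path, [private_extension_map_paths]).
    """
    private_maps = []
    for directory in search_paths:
        files = contents.get(directory, {})
        # Remember any private extension seen before the definition.
        if name in files.get("module.private.modulemap", set()):
            private_maps.append(directory + "/module.private.modulemap")
        # Stop at the first definition; later paths are never searched.
        if name in files.get("module.modulemap", set()):
            return directory + "/module.modulemap", private_maps
    raise ModuleNotFound(name)


# Usage with the Darwin layout from the RFC: private headers come first.
paths = ["/usr/local/include", "/usr/include"]
contents = {
    "/usr/local/include": {"module.private.modulemap": {"MyModule"}},
    "/usr/include": {"module.modulemap": {"MyModule"}},
}
print(find_module("MyModule", paths, contents))
```

Both returned paths would then be handed to the module build, which matches the "one module file containing both the public and private APIs" step.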
