[PATCH] Allow multiple modules with the same name to coexist in the module cache

Tue Apr 8 16:38:45 PDT 2014

Gentle ping. Richard, did you want to continue to review this patch in detail, or can I commit now that the high-level issues are resolved (at least for this patch)? I’m happy to evolve this patch post-commit or here in this review thread.

Ben

On Apr 4, 2014, at 4:13 PM, Ben Langmuir <blangmuir at apple.com> wrote:

> 
> On Apr 4, 2014, at 2:35 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> 
>> On Fri, Apr 4, 2014 at 10:44 AM, Ben Langmuir <blangmuir at apple.com> wrote:
>> 
>> On Apr 4, 2014, at 10:35 AM, Richard Smith <richard at metafoo.co.uk> wrote:
>> 
>>> 
>>> On 4 Apr 2014 09:09, "Ben Langmuir" <blangmuir at apple.com> wrote:
>>> >
>>> >
>>> > On Apr 3, 2014, at 7:49 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> >
>>> >> On Thu, Apr 3, 2014 at 3:40 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>>> >>>
>>> >>>
>>> >>> On Mar 28, 2014, at 2:45 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> >>>
>>> >>>> On Fri, Mar 28, 2014 at 1:12 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>>> >>>>>
>>> >>>>> This patch allows multiple modules that have the same name to coexist in the module cache.  To differentiate between two modules with the same name, we will consider the path the module map file that they are defined by* part of the ‘key’ for looking up the precompiled module (pcm file).  Specifically, this patch renames the precompiled module (pcm) files from
>>> >>>>>
>>> >>>>> cache-path/<module hash>/Foo.pcm
>>> >>>>>
>>> >>>>> to
>>> >>>>>
>>> >>>>> cache-path/<module hash>/Foo-<hash of module map path>.pcm
>>> >>>>
>>> >>>>
>>> >>>> From a high level, I don't really see why we need a second hash here. Shouldn't the -I options be included in the <module hash>? If I build the same module with different -I flags, that should resolve to different .pcm files, regardless of whether it makes the module name resolve to a different module.map file.
>>> >>>>
>>> >>>> Are you trying to cope with the case where the -I path finds multiple module.map files defining the same module (where it's basically chance which one will get built and used)? I don't really feel like this is the right solution to that problem either -- we should remove the 'luck' aspect and use some sane mechanism to determine which module.map files are loaded, and in what order.
>>> >>>>
>>> >>>> Is this addressing some other case?
>>> >>>>
>>> >>>>  
>>> >>>>>
>>> >>>>> In addition, I’ve taught the ASTReader to re-resolve the names of imported modules during module loading so that if the header search context changes between when a module was originally built and when it is loaded we can rebuild it if necessary.  For example, if module A imports module B
>>> >>>>>
>>> >>>>> first time:
>>> >>>>> clang -I /path/to/A -I /path/to/B …
>>> >>>>>
>>> >>>>> second time:
>>> >>>>> clang -I /path/to/A -I /different/path/to/B …
>>> >>>>>
>>> >>>>> will now rebuild A as expected.
>>> >>>>>
>>> >>>>>
>>> >>>>> * in the case of inferred modules, we use the module map file that *allowed* the inference, not the __inferred_module.map file, since the inferred file path is the same for every inferred module.
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> Review comments on the patch itself:
>>> >>>>
>>> >>>>  +  /// the inferrence (e.g. contained 'module *') rather than the virtual
>>> >>>>
>>> >>>> Typo 'inference', 'Module *'.
>>> >>>>
>>> >>>> +  /// For an explanaition of \p ModuleMap, see Module::ModuleMap.
>>> >>>>
>>> >>>> Typo 'explanation'.
>>> >>>>
>>> >>>> +  // safe becuase the FileManager is shared between the compiler instances.
>>> >>>>
>>> >>>> Typo 'because'
>>> >>>>
>>> >>>> +  // the inferred module. If this->ModuleMap is nullptr, then we are using
>>> >>>> +  // -emit-module directly, and we cannot have an inferred module.
>>> >>>>
>>> >>>> I don't understand what this comment is trying to say. If we're using -emit-module, then we were given a module map on the command-line; should that not be referred to by this->ModuleMap? (Also, why 'this->'?) How can a top-level module be inferred? Is that a framework-specific thing?
>>> >>>>
>>> >>>> +    StringRef ModuleMap = this->ModuleMap ? this->ModuleMap->getName() : InFile;
>>> >>>>
>>> >>>> Please pick a different variable name rather than shadowing a member of '*this' here.
>>> >>>>
>>> >>>> +    // Construct the name <ModuleName>-<hash of ModuleMapPath>.pcm which should
>>> >>>> +    // be globally unique to this particular module.
>>> >>>> +    llvm::APInt Code(64, llvm::hash_value(ModuleMapPath));
>>> >>>> +    SmallString<128> HashStr;
>>> >>>> +    Code.toStringUnsigned(HashStr);
>>> >>>>
>>> >>>> Use base 36, like the module hash.
>>> >>>
>>> >>>
>>> >>> I’ve attached an updated patch.  Changes since the previous one:
>>> >>> 1. Fixed the typos and other issues Richard pointed out
>>> >>> 2. I’ve added code to canonicalize the module map path (using realpath); I was getting spurious failures on case-intensitive filesystems.
>>> >>
>>> >>
>>> >> This part is probably not OK, because it'll do the wrong thing on some build farms (where the canonical path is not predictable, but the path that make_absolute returns is, by careful control of $PWD). I'll look into this more, but will be traveling for the next couple of days.
>>> >
>>> >
>>> > Okay, I have a better idea: I already store the actual module map path in MODULE_MAP_FILE, so when that is read, we can just get a FileEntry for it and compare to what header search finds for the current module (since FileEntries are uniqued by inode).  That also means I can give a better diagnostic with the module map paths rather than the pcm filenames.
>>> 
>>> Sounds great, thanks!
>>> 
>>> 
>> 
>> Nuts, it turns out I still need a canonical path - when we hash the path into the pcm name we don’t want to get a different value depending on the case-sensitivity of the the file system.
>> 
>> Shall I just make_absolute and then realpath()? That should incorporate $PWD correctly, I think.
>> 
>> That'll resolve symlinks in $PWD -- the problem is with build farms that don't (or can't) make the file system look the same on each target, but try to fake it up with symlinks; if you resolve the symlinks, you defeat this attempt.
> 
> I don’t know how that could break anything with the pcm filenames, but sure.
> 
>> How about folding the path to the module map to lowercase before hashing it, then when we come to load the module map, check that the FileEntry found by header search for the module matches the FileEntry for the actual module map path as you were planning? We'd get (and detect and recover from) cache slot collisions if two module map files differ only by case, but that sounds pretty unlikely to me.
> 
> I can live with that.  The attached patch uses StringRef::lower() and drops the modifications to FileManager.
> 
> <modulemappath-v4.patch>
> 
>> 
>> Ben
>> 
>>> >>
>>> >>> 3. I’ve moved the initialization of the MainFileID (in source manager) from Execute() to BeginSourceFile(), since we are now potentially creating file ids for module map files during pch loading and need to be able to find the main file reliably to construct a correct include stack.
>>> >
>>> >
>>> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140408/d15f93df/attachment.html>