[PATCH] Allow multiple modules with the same name to coexist in the module cache

Richard Smith richard at metafoo.co.uk
Sun Apr 13 17:10:05 PDT 2014


On Tue, Apr 8, 2014 at 4:38 PM, Ben Langmuir <blangmuir at apple.com> wrote:

> Gentle ping. Richard, did you want to continue to review this patch in
> detail, or can I commit now that the high-level issues are resolved (at
> least for this patch)? I'm happy to evolve this patch post-commit or here
> in this review thread.
>

Really sorry for the delay (EuroLLVM + catching up afterwards), LGTM.

I'm rather surprised we don't already store the MODULE_NAME information
into the .pcm file...

Ben
>
> On Apr 4, 2014, at 4:13 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>
> On Apr 4, 2014, at 2:35 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>
> On Fri, Apr 4, 2014 at 10:44 AM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>>
>> On Apr 4, 2014, at 10:35 AM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>>
>> On 4 Apr 2014 09:09, "Ben Langmuir" <blangmuir at apple.com> wrote:
>> >
>> >
>> > On Apr 3, 2014, at 7:49 PM, Richard Smith <richard at metafoo.co.uk>
>> wrote:
>> >
>> >> On Thu, Apr 3, 2014 at 3:40 PM, Ben Langmuir <blangmuir at apple.com>
>> wrote:
>> >>>
>> >>>
>> >>> On Mar 28, 2014, at 2:45 PM, Richard Smith <richard at metafoo.co.uk>
>> wrote:
>> >>>
>> >>>> On Fri, Mar 28, 2014 at 1:12 PM, Ben Langmuir <blangmuir at apple.com
>> > wrote:
>> >>>>>
>> >>>>> This patch allows multiple modules that have the same name to
>> coexist in the module cache.  To differentiate between two modules with the
>> same name, we will consider the path the module map file that they are
>> defined by* part of the 'key' for looking up the precompiled module (pcm
>> file).  Specifically, this patch renames the precompiled module (pcm) files
>> from
>> >>>>>
>> >>>>> cache-path/<module hash>/Foo.pcm
>> >>>>>
>> >>>>> to
>> >>>>>
>> >>>>> cache-path/<module hash>/Foo-<hash of module map path>.pcm
>> >>>>
>> >>>>
>> >>>> From a high level, I don't really see why we need a second hash
>> here. Shouldn't the -I options be included in the <module hash>? If I build
>> the same module with different -I flags, that should resolve to different
>> .pcm files, regardless of whether it makes the module name resolve to a
>> different module.map file.
>> >>>>
>> >>>> Are you trying to cope with the case where the -I path finds
>> multiple module.map files defining the same module (where it's basically
>> chance which one will get built and used)? I don't really feel like this is
>> the right solution to that problem either -- we should remove the 'luck'
>> aspect and use some sane mechanism to determine which module.map files are
>> loaded, and in what order.
>> >>>>
>> >>>> Is this addressing some other case?
>> >>>>
>> >>>>
>> >>>>>
>> >>>>> In addition, I've taught the ASTReader to re-resolve the names of
>> imported modules during module loading so that if the header search context
>> changes between when a module was originally built and when it is loaded we
>> can rebuild it if necessary.  For example, if module A imports module B
>> >>>>>
>> >>>>> first time:
>> >>>>> clang -I /path/to/A -I /path/to/B ...
>> >>>>>
>> >>>>> second time:
>> >>>>> clang -I /path/to/A -I /different/path/to/B ...
>> >>>>>
>> >>>>> will now rebuild A as expected.
>> >>>>>
>> >>>>>
>> >>>>> * in the case of inferred modules, we use the module map file that
>> *allowed* the inference, not the __inferred_module.map file, since the
>> inferred file path is the same for every inferred module.
>> >>>>
>> >>>>
>> >>>>
>> >>>> Review comments on the patch itself:
>> >>>>
>> >>>>  +  /// the inferrence (e.g. contained 'module *') rather than the
>> virtual
>> >>>>
>> >>>> Typo 'inference', 'Module *'.
>> >>>>
>> >>>> +  /// For an explanaition of \p ModuleMap, see Module::ModuleMap.
>> >>>>
>> >>>> Typo 'explanation'.
>> >>>>
>> >>>> +  // safe becuase the FileManager is shared between the compiler
>> instances.
>> >>>>
>> >>>> Typo 'because'
>> >>>>
>> >>>> +  // the inferred module. If this->ModuleMap is nullptr, then we
>> are using
>> >>>> +  // -emit-module directly, and we cannot have an inferred module.
>> >>>>
>> >>>> I don't understand what this comment is trying to say. If we're
>> using -emit-module, then we were given a module map on the command-line;
>> should that not be referred to by this->ModuleMap? (Also, why 'this->'?)
>> How can a top-level module be inferred? Is that a framework-specific thing?
>> >>>>
>> >>>> +    StringRef ModuleMap = this->ModuleMap ?
>> this->ModuleMap->getName() : InFile;
>> >>>>
>> >>>> Please pick a different variable name rather than shadowing a member
>> of '*this' here.
>> >>>>
>> >>>> +    // Construct the name <ModuleName>-<hash of ModuleMapPath>.pcm
>> which should
>> >>>> +    // be globally unique to this particular module.
>> >>>> +    llvm::APInt Code(64, llvm::hash_value(ModuleMapPath));
>> >>>> +    SmallString<128> HashStr;
>> >>>> +    Code.toStringUnsigned(HashStr);
>> >>>>
>> >>>> Use base 36, like the module hash.
>> >>>
>> >>>
>> >>> I've attached an updated patch.  Changes since the previous one:
>> >>> 1. Fixed the typos and other issues Richard pointed out
>> >>> 2. I've added code to canonicalize the module map path (using
>> realpath); I was getting spurious failures on case-intensitive filesystems.
>> >>
>> >>
>> >> This part is probably not OK, because it'll do the wrong thing on some
>> build farms (where the canonical path is not predictable, but the path that
>> make_absolute returns is, by careful control of $PWD). I'll look into this
>> more, but will be traveling for the next couple of days.
>> >
>> >
>> > Okay, I have a better idea: I already store the actual module map path
>> in MODULE_MAP_FILE, so when that is read, we can just get a FileEntry for
>> it and compare to what header search finds for the current module (since
>> FileEntries are uniqued by inode).  That also means I can give a better
>> diagnostic with the module map paths rather than the pcm filenames.
>>
>> Sounds great, thanks!
>>
>>
>> Nuts, it turns out I still need a canonical path - when we hash the path
>> into the pcm name we don't want to get a different value depending on the
>> case-sensitivity of the the file system.
>>
>> Shall I just make_absolute and then realpath()? That should incorporate
>> $PWD correctly, I think.
>>
>
> That'll resolve symlinks in $PWD -- the problem is with build farms that
> don't (or can't) make the file system look the same on each target, but try
> to fake it up with symlinks; if you resolve the symlinks, you defeat this
> attempt.
>
>
> I don't know how that could break anything with the pcm filenames, but
> sure.
>
> How about folding the path to the module map to lowercase before hashing
> it, then when we come to load the module map, check that the FileEntry
> found by header search for the module matches the FileEntry for the actual
> module map path as you were planning? We'd get (and detect and recover
> from) cache slot collisions if two module map files differ only by case,
> but that sounds pretty unlikely to me.
>
>
> I can live with that.  The attached patch uses StringRef::lower() and
> drops the modifications to FileManager.
>
> <modulemappath-v4.patch>
>
>
> Ben
>>
>> >>
>> >>> 3. I've moved the initialization of the MainFileID (in source
>> manager) from Execute() to BeginSourceFile(), since we are now potentially
>> creating file ids for module map files during pch loading and need to be
>> able to find the main file reliably to construct a correct include stack.
>> >
>> >
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140413/17528c90/attachment.html>


More information about the cfe-commits mailing list