[cfe-dev] Module maps, __FILE__ and lazy stat'ing of headers
Manuel Klimek
klimek at google.com
Thu Aug 7 06:46:29 PDT 2014
On Thu, Aug 7, 2014 at 1:11 PM, Manuel Klimek <klimek at google.com> wrote:
> Richard came up with the idea of updating the FileEntry's name on each
> getFile, so we'd get the last value instead of the first (which would solve
> the module map related problem).
> Unfortunately this breaks:
> http://llvm.org/bugs/show_bug.cgi?id=8974
>
I've got what I think is a "real fix" for the regression. With that fix,
just resetting .Name on each stat works. Will send out code review for the
regression fix in a moment.
>
>
>
>
> On Wed, Aug 6, 2014 at 6:10 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>>
>> On Aug 6, 2014, at 2:01 AM, Manuel Klimek <klimek at google.com> wrote:
>>
>> Ok, I might be dense here, but I'm still missing the picture :)
>>
>> So, I see that the idea is to have FileName that basically encapsulates
>> which path/name combination we found the file under, and a FileEntry, which
>> maps to the inode.
>>
>> Questions:
>> 1. do we want to replace all uses of FileEntry with FileName, or are
>> there cases where we're only interested in the inodes (for example in
>> SourceManager).
>>
>> 2. how would we detect whether we are on a case insensitive file system?
>>
>>
>> We can check sensitivity lazily on a per-file basis. If a lookup misses
>> the SeenFileEntries cache, but hits a file we already have a UniqueID for,
>> we can then check whether we already have a FileName for that file that is
>> the same as our lookup up to case-sensitivity. To do that, we can either
>> store a list of FileName pointers in the FileEntry, or we can add another
>> map to the FileManager to look it up. Either way, the common-case will be
>> one FIleName per FileEntry
>>
>> Ben
>>
>> Cheers,
>> /Manuel
>>
>>
>>
>> On Tue, Aug 5, 2014 at 9:58 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>>
>>>
>>> On Aug 5, 2014, at 12:28 PM, Richard Smith <richard at metafoo.co.uk>
>>> wrote:
>>>
>>> On Tue, Aug 5, 2014 at 10:57 AM, Ben Langmuir <blangmuir at apple.com>
>>> wrote:
>>>
>>>>
>>>> On Aug 5, 2014, at 7:51 AM, Manuel Klimek <klimek at google.com> wrote:
>>>>
>>>> As this is blocking us, I'm taking a stab at it…
>>>>
>>>>
>>>> One thing that stood out to me in Richard's patch:
>>>>
>>>> * Returning FileName pointers invites users to rely on the identity of
>>>> the FileName object, which won’t work on case-insensitive file systems and
>>>> is probably a bad idea in general. I suggest we wrap FileName * in a
>>>> “FileRef” object that prevents comparisons or possibly forwards them to
>>>> comparing the inodes.
>>>>
>>>
>>> I don't agree: there are some uses where we really want to compare the
>>> canonicalized FileName pointers. For instance, this is the right behavior
>>> for #pragma once, and for various things in HeaderSearch. (When looking up
>>> a header relative to a file, it should be the FileName that is the cache
>>> key, not the FileEntry.)
>>>
>>>
>>> Perhaps the best way to deal with case-insensitive file systems is to
>>> give paths that differ by case the same FileName object, since they are the
>>> same dentry (which is the notion we're trying to capture here).
>>>
>>>
>>> SGTM.
>>>
>>> Likewise, on Windows, the short and long forms of the same file name
>>> should have the same FileName object. We may be able to make that change
>>> without introducing a split between FileName and FileEntry; the downside
>>> would be that we may potentially load the same inode multiple times if it's
>>> found by different paths (relative paths being the most likely offender).
>>>
>>> Ben
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jul 30, 2014 at 12:04 AM, Richard Smith <richard at metafoo.co.uk>
>>>> wrote:
>>>>
>>>>> On Mon, Jul 28, 2014 at 11:40 PM, Richard Smith <richard at metafoo.co.uk
>>>>> > wrote:
>>>>>
>>>>>>
>>>>>> On 28 Jul 2014 17:57, "Ben Langmuir" <blangmuir at apple.com> wrote:
>>>>>> >
>>>>>> >
>>>>>> >> On Jul 28, 2014, at 4:37 PM, Richard Smith <richard at metafoo.co.uk>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> On Mon, Jul 28, 2014 at 3:10 PM, Ben Langmuir <blangmuir at apple.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> > On Jul 28, 2014, at 4:31 AM, Manuel Klimek <klimek at google.com>
>>>>>> wrote:
>>>>>> >>> >
>>>>>> >>> > Hi Richard,
>>>>>> >>> >
>>>>>> >>> > while working with module maps for layering checks a problem
>>>>>> with __FILE__ we noticed triggered a couple of questions.
>>>>>> >>> >
>>>>>> >>> > The problem is that __FILE__ uses the path of the first 'stat'
>>>>>> call (as that is how FileManager::getFile() works).
>>>>>> >>>
>>>>>> >>> I was thinking about this a while ago and independently of this
>>>>>> issue I would really like to change this behaviour at some point. The name
>>>>>> of a file is a property of how you look it up, not an intrinsic part of the
>>>>>> file itself.
>>>>>> >>
>>>>>> >>
>>>>>> >> I agree. We have incorrectly conflated the notion of the file
>>>>>> identity (inode) with the directory entriy, and we can't tell the two
>>>>>> apart. This leads to weird behavior in a number of places. For instance:
>>>>>> >>
>>>>>> >> a/
>>>>>> >> x.h: #include "y.h"
>>>>>> >> y.h: int a = 0;
>>>>>> >> b/
>>>>>> >> x.h: symlink to a/x.h
>>>>>> >> y.h: int b = 0;
>>>>>> >> main.c:
>>>>>> >> #include "a/x.h"
>>>>>> >> #include "b/x.h"
>>>>>> >> int main() { return a + b; }
>>>>>> >>
>>>>>> >> On a correct compiler, this would work. For clang, it fails,
>>>>>> because b/x.h's #include finds a/y.h, because we use the path by which we
>>>>>> first found x.h as the One True Path to x.h. (This also leads to wrong
>>>>>> __FILE__, etc.)
>>>>>> >>
>>>>>> >> I tried fixing this ~2 years ago by splitting FileEntry into
>>>>>> separate dentry and inode classes, but this rapidly snowballed and exposed
>>>>>> the same design error being made in various other components.
>>>>>> >
>>>>>> >
>>>>>> > Interesting. My motivation was keeping track of virtual and “real”
>>>>>> paths for the VFS, which is a special case of the above. Maybe we can sink
>>>>>> the dentry/inode down to the VFS layer eventually? Getting symlinks and
>>>>>> “..” entries to work in the VFS make a lot more sense when we have explicit
>>>>>> dentries rather than inferring from the file path.
>>>>>> >
>>>>>> > I’d be interested in what issues you ran into here if you remember.
>>>>>>
>>>>>> I'll see if I can dig out my patch tomorrow.
>>>>>>
>>>>> Attached is my WIP patch from (as it turns out) over 2 years ago. My
>>>>> (very) hazy memories were that we had quite a few different places where
>>>>> people were using FileEntry*s for things, and neither a dentry nor an inode
>>>>> seemed like the "right" thing. I don't remember any more details than that.
>>>>> There were also places where we would need to just make a decision, such
>>>>> as: what should #pragma once use as its key? (I think dentry is the right
>>>>> answer here, since the same file found in different directories might mean
>>>>> different things, but that answer may break some people who use #pragma
>>>>> once and don't also have include guards. Conversely, it fixes some builds
>>>>> on content-addressed file systems.)
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140807/2cf53784/attachment.html>
More information about the cfe-dev
mailing list