[cfe-dev] Module maps, __FILE__ and lazy stat'ing of headers

Ben Langmuir blangmuir at apple.com
Wed Aug 6 09:10:06 PDT 2014


> On Aug 6, 2014, at 2:01 AM, Manuel Klimek <klimek at google.com> wrote:
> 
> Ok, I might be dense here, but I'm still missing the picture :)
> 
> So, I see that the idea is to have FileName that basically encapsulates which path/name combination we found the file under, and a FileEntry, which maps to the inode.
> 
> Questions:
> 1. do we want to replace all uses of FileEntry with FileName, or are there cases where we're only interested in the inodes (for example in SourceManager)?
> 2. how would we detect whether we are on a case insensitive file system?

We can check sensitivity lazily on a per-file basis. If a lookup misses the SeenFileEntries cache, but hits a file we already have a UniqueID for, we can then check whether we already have a FileName for that file that is the same as our lookup up to case-sensitivity. To do that, we can either store a list of FileName pointers in the FileEntry, or we can add another map to the FileManager to look it up. Either way, the common case will be one FileName per FileEntry.
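
A rough sketch of that lazy check, using the first option (a list of FileName pointers stored in the FileEntry). Everything here is a hypothetical stand-in rather than the real FileManager interface: the struct layouts, the FileManagerSketch class, and the StatID parameter (which takes the place of the UniqueID a real stat() call would return) are assumptions made for illustration, and the comparison is ASCII-only.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the classes discussed in this thread; this is
// not Clang's actual FileManager API.
struct FileEntry;

struct FileName {
  std::string Path;  // spelling under which this name was first looked up
  FileEntry *Entry;  // inode-level object the name resolves to
};

struct FileEntry {
  std::uint64_t UniqueID;         // stands in for the (device, inode) pair
  std::vector<FileName *> Names;  // expected to hold one element in practice
};

// ASCII-only case-insensitive comparison, to keep the sketch short.
static bool equalsIgnoringCase(const std::string &A, const std::string &B) {
  return A.size() == B.size() &&
         std::equal(A.begin(), A.end(), B.begin(),
                    [](unsigned char X, unsigned char Y) {
                      return std::tolower(X) == std::tolower(Y);
                    });
}

class FileManagerSketch {
  std::map<std::string, FileName *> SeenFileEntries;  // exact path -> name
  std::map<std::uint64_t, std::unique_ptr<FileEntry>> EntriesByID;
  std::vector<std::unique_ptr<FileName>> OwnedNames;

public:
  // StatID is the UniqueID a real implementation would obtain from stat().
  FileName *getFile(const std::string &Path, std::uint64_t StatID) {
    auto Cached = SeenFileEntries.find(Path);
    if (Cached != SeenFileEntries.end())
      return Cached->second;  // fast path: this exact spelling was seen before

    std::unique_ptr<FileEntry> &Entry = EntriesByID[StatID];
    if (!Entry)
      Entry.reset(new FileEntry{StatID, {}});
    assert(Entry->UniqueID == StatID);

    // Cache miss on a known file: reuse a name that differs only by case.
    for (FileName *Existing : Entry->Names)
      if (equalsIgnoringCase(Existing->Path, Path))
        return SeenFileEntries[Path] = Existing;

    // Otherwise this is a genuinely new name for the file.
    OwnedNames.push_back(
        std::unique_ptr<FileName>(new FileName{Path, Entry.get()}));
    Entry->Names.push_back(OwnedNames.back().get());
    return SeenFileEntries[Path] = OwnedNames.back().get();
  }
};
```

With this model, looking up "include/Foo.h" and then "include/foo.h" with the same ID yields the same FileName pointer, while "other/foo.h" with that ID gets a distinct FileName sharing the same FileEntry. One caveat of the approach as sketched: on a case-sensitive file system, two hard links whose names differ only by case would also be merged.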

Ben

> Cheers,
> /Manuel
> 
> 
> 
> On Tue, Aug 5, 2014 at 9:58 PM, Ben Langmuir <blangmuir at apple.com> wrote:
> 
>> On Aug 5, 2014, at 12:28 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>> 
>> On Tue, Aug 5, 2014 at 10:57 AM, Ben Langmuir <blangmuir at apple.com> wrote:
>> 
>>> On Aug 5, 2014, at 7:51 AM, Manuel Klimek <klimek at google.com> wrote:
>>> 
>>> As this is blocking us, I'm taking a stab at it…
>> 
>> One thing that stood out to me in Richard's patch:
>> 
>> * Returning FileName pointers invites users to rely on the identity of the FileName object, which won’t work on case-insensitive file systems and is probably a bad idea in general.  I suggest we wrap FileName * in a “FileRef” object that prevents comparisons or possibly forwards them to comparing the inodes.
>> 
>> I don't agree: there are some uses where we really want to compare the canonicalized FileName pointers. For instance, this is the right behavior for #pragma once, and for various things in HeaderSearch. (When looking up a header relative to a file, it should be the FileName that is the cache key, not the FileEntry.)
>> 
>> Perhaps the best way to deal with case-insensitive file systems is to give paths that differ by case the same FileName object, since they are the same dentry (which is the notion we're trying to capture here).  
> 
> SGTM.
> 
>> Likewise, on Windows, the short and long forms of the same file name should have the same FileName object. We may be able to make that change without introducing a split between FileName and FileEntry; the downside would be that we may potentially load the same inode multiple times if it's found by different paths (relative paths being the most likely offender).
>> 
>> Ben
>> 
>> 
>>> 
>>> 
>>> On Wed, Jul 30, 2014 at 12:04 AM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On Mon, Jul 28, 2014 at 11:40 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> 
>>> On 28 Jul 2014 17:57, "Ben Langmuir" <blangmuir at apple.com> wrote:
>>> >
>>> >
>>> >> On Jul 28, 2014, at 4:37 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> >>
>>> >> On Mon, Jul 28, 2014 at 3:10 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>>> >>>
>>> >>>
>>> >>> > On Jul 28, 2014, at 4:31 AM, Manuel Klimek <klimek at google.com> wrote:
>>> >>> >
>>> >>> > Hi Richard,
>>> >>> >
>>> >>> > while working with module maps for layering checks, we noticed a problem with __FILE__ that triggered a couple of questions.
>>> >>> >
>>> >>> > The problem is that __FILE__ uses the path of the first 'stat' call (as that is how FileManager::getFile() works).
>>> >>>
>>> >>> I was thinking about this a while ago and independently of this issue I would really like to change this behaviour at some point.  The name of a file is a property of how you look it up, not an intrinsic part of the file itself.
>>> >>
>>> >>
>>> >> I agree. We have incorrectly conflated the notion of the file identity (inode) with the directory entry, and we can't tell the two apart. This leads to weird behavior in a number of places. For instance:
>>> >>
>>> >>  a/
>>> >>   x.h: #include "y.h"
>>> >>   y.h: int a = 0;
>>> >>  b/
>>> >>   x.h: symlink to a/x.h
>>> >>   y.h: int b = 0;
>>> >>  main.c:
>>> >>   #include "a/x.h"
>>> >>   #include "b/x.h"
>>> >>   int main() { return a + b; }
>>> >>
>>> >> On a correct compiler, this would work. For clang, it fails, because b/x.h's #include finds a/y.h, because we use the path by which we first found x.h as the One True Path to x.h. (This also leads to wrong __FILE__, etc.)
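
The failure described above can be reproduced with a toy model. This is only a sketch under stated assumptions: ToyFS, InodeKeyedCache, and the lookup helpers are invented for illustration and are not Clang code. The inode-keyed cache remembers the first path seen for each inode, which is the conflation being criticized, and the relative include is then resolved against that remembered path.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model; none of these names are Clang APIs.
// Every path maps to an inode number. b/x.h is a symlink to a/x.h, so the
// two paths share inode 1.
static const std::map<std::string, int> ToyFS = {
    {"a/x.h", 1}, {"a/y.h", 2}, {"b/x.h", 1}, {"b/y.h", 3}};

// Directory part of a path, including the trailing slash ("" if none).
static std::string dirOf(const std::string &Path) {
  return Path.substr(0, Path.rfind('/') + 1);
}

// Conflated model: one record per inode, holding the first path seen for it.
struct InodeKeyedCache {
  std::map<int, std::string> FirstPath;

  std::string nameFor(const std::string &Path) {
    assert(ToyFS.count(Path) && "unknown path in toy file system");
    int Inode = ToyFS.at(Path);
    // emplace leaves an existing record untouched, so the first path wins.
    return FirstPath.emplace(Inode, Path).first->second;
  }
};

// Resolve `#include "Header"` relative to the directory of the includer.
static std::string resolveRelative(const std::string &IncluderName,
                                   const std::string &Header) {
  return dirOf(IncluderName) + Header;
}

// Behavior described for clang: the includer's name comes from the inode.
std::string conflatedLookup(InodeKeyedCache &Cache,
                            const std::string &Includer) {
  return resolveRelative(Cache.nameFor(Includer), "y.h");
}

// Dentry-aware behavior: the includer's name is the path it was found under.
std::string dentryLookup(const std::string &Includer) {
  return resolveRelative(Includer, "y.h");
}
```

After a/x.h has been seen once, conflatedLookup for b/x.h resolves to a/y.h, whereas dentryLookup resolves to b/y.h, which is the result the example expects.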
>>> >>
>>> >> I tried fixing this ~2 years ago by splitting FileEntry into separate dentry and inode classes, but this rapidly snowballed and exposed the same design error being made in various other components.
>>> >
>>> >
>>> > Interesting.  My motivation was keeping track of virtual and “real” paths for the VFS, which is a special case of the above.  Maybe we can sink the dentry/inode down to the VFS layer eventually?  Getting symlinks and “..” entries to work in the VFS makes a lot more sense when we have explicit dentries rather than inferring from the file path.
>>> >
>>> > I’d be interested in what issues you ran into here if you remember.
>>> 
>>> I'll see if I can dig out my patch tomorrow.
>>> 
>>> Attached is my WIP patch from (as it turns out) over 2 years ago. My (very) hazy memories were that we had quite a few different places where people were using FileEntry*s for things, and neither a dentry nor an inode seemed like the "right" thing. I don't remember any more details than that. There were also places where we would need to just make a decision, such as: what should #pragma once use as its key? (I think dentry is the right answer here, since the same file found in different directories might mean different things, but that answer may break some people who use #pragma once and don't also have include guards. Conversely, it fixes some builds on content-addressed file systems.)
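
To make the #pragma once trade-off concrete, here is a minimal sketch of the two candidate keys. The Lookup struct, OnceKey enum, and PragmaOnceTracker class are hypothetical names introduced for illustration; they do not correspond to anything in the attached patch or in Clang.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of one header lookup: the path it was found under
// (dentry) and the file identity behind it (inode).
struct Lookup {
  std::string Dentry;
  int Inode;
};

enum class OnceKey { Dentry, Inode };

class PragmaOnceTracker {
  OnceKey Policy;
  std::set<std::string> SeenDentries;
  std::set<int> SeenInodes;

public:
  explicit PragmaOnceTracker(OnceKey P) : Policy(P) {}

  // Returns true if the header should be entered, or false if an earlier
  // inclusion with the same key means #pragma once suppresses it.
  bool shouldEnter(const Lookup &L) {
    assert(!L.Dentry.empty() && "lookup must carry a path");
    if (Policy == OnceKey::Dentry)
      return SeenDentries.insert(L.Dentry).second;
    return SeenInodes.insert(L.Inode).second;
  }
};
```

Take two lookups, libfoo/config.h and libbar/config.h, that share one inode. On a content-addressed file system these may be logically different headers with identical contents, and the inode policy wrongly suppresses the second. If they are instead one header reachable by two paths, the dentry policy enters it twice, which is the case that breaks code relying on #pragma once without include guards.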
>>> 
>> 
>> 
> 
> 
