[cfe-dev] Module maps, __FILE__ and lazy stat'ing of headers

Richard Smith richard at metafoo.co.uk
Mon Jul 28 23:40:06 PDT 2014


On 28 Jul 2014 17:57, "Ben Langmuir" <blangmuir at apple.com> wrote:
>
>
>> On Jul 28, 2014, at 4:37 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Mon, Jul 28, 2014 at 3:10 PM, Ben Langmuir <blangmuir at apple.com>
wrote:
>>>
>>>
>>> > On Jul 28, 2014, at 4:31 AM, Manuel Klimek <klimek at google.com> wrote:
>>> >
>>> > Hi Richard,
>>> >
>>> > while working with module maps for layering checks a problem with
__FILE__ we noticed triggered a couple of questions.
>>> >
>>> > The problem is that __FILE__ uses the path of the first 'stat' call
(as that is how FileManager::getFile() works).
>>>
>>> I was thinking about this a while ago and independently of this issue I
would really like to change this behaviour at some point.  The name of a
file is a property of how you look it up, not an intrinsic part of the file
itself.
>>
>>
>> I agree. We have incorrectly conflated the notion of the file identity
(inode) with the directory entriy, and we can't tell the two apart. This
leads to weird behavior in a number of places. For instance:
>>
>>  a/
>>   x.h: #include "y.h"
>>   y.h: int a = 0;
>>  b/
>>   x.h: symlink to a/x.h
>>   y.h: int b = 0;
>>  main.c:
>>   #include "a/x.h"
>>   #include "b/x.h"
>>   int main() { return a + b; }
>>
>> On a correct compiler, this would work. For clang, it fails, because
b/x.h's #include finds a/y.h, because we use the path by which we first
found x.h as the One True Path to x.h. (This also leads to wrong __FILE__,
etc.)
>>
>> I tried fixing this ~2 years ago by splitting FileEntry into separate
dentry and inode classes, but this rapidly snowballed and exposed the same
design error being made in various other components.
>
>
> Interesting.  My motivation was keeping track of virtual and “real” paths
for the VFS, which is a special case of the above.  Maybe we can sink the
dentry/inode down to the VFS layer eventually?  Getting symlinks and “..”
entries to work in the VFS make a lot more sense when we have explicit
dentries rather than inferring from the file path.
>
> I’d be interested in what issues you ran into here if you remember.

I'll see if I can dig out my patch tomorrow.

>> We should fix this, and it seems the right way to address the problem
here, but it's a big task.
>
>
> I’d be interested in taking a crack at this, but it’s probably not
something I could focus on for a while. If someone else is eager, I’d still
be happy with pitching in to review or doing VFS bits or whatever.
>
> Ben
>
>>
>>> > Currently we parse all module maps at the beginning of the TU, and
stat all headers while doing so; there is a FIXME in
ModuleMapParser::parseHeaderDecl that reminds us this might not be a good
idea, and that we'd want to somehow lazily stat the headers anyway.
>>> > Regardless, currently the first 'stat' to a header when using
strict-decluse is usually the way the header is referenced by its module's
module map; this is often very different from what the user expects, and
certainly different from how this used to work (using -I relative path via
which a header is usually found).
>>>
>>> Just to make sure I understand the issue: the file name is correct,
it’s just showing up relative to the module map file rather than relative
to an include path?
>>>
>>> >
>>> > There are multiple possible solutions, but the first one we thought
about was to fix the header stat'ing FIXME, which would naturally solve
this problem by pushing the first stat of the header after the stat from
the #include.
>>> > The problem with that approach is:
>>> > 1. We currently have some places that look up a module for a header;
we at least need to resolve the headers of all direct dependencies of the
module map; I'm unsure what the effect of only finding the headers of the
current module and its direct dependencies would be.
>>> > 2. We need to find a place in the code at which to resolve / stat all
headers in the current module map (and its direct dependencies); this needs
to be after preprocessing (thus we'd need to delay diags related to modules
while parsing #include directives). It seems to me like this would allow us
to also resolve headers only for the current module and its direct deps, at
least if my theory is correct that preprocessing would happen once per file
per module being built (even on recursive module build invocations)
>>> >
>>> > After looking into it, it seems like this is significant effort, so
we're somewhat hoping to find an intermediate solution that is less effort
and allows us to fix the above problems together with the work on full
modules.
>>> >
>>> > Thoughts?
>>> > /Manuel
>>> > _______________________________________________
>>> > cfe-dev mailing list
>>> > cfe-dev at cs.uiuc.edu
>>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140728/d4c9c446/attachment.html>


More information about the cfe-dev mailing list