[cfe-commits] r163588 - in /cfe/trunk: lib/Lex/PPDirectives.cpp test/Index/c-index-getCursor-pp.c

Richard Smith richard at metafoo.co.uk
Wed Sep 19 16:02:48 PDT 2012


FWIW, the original patch caused PR13880.

On Tue, Sep 18, 2012 at 1:50 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:

> On Sep 18, 2012, at 1:35 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>
> On Tue, Sep 18, 2012 at 10:23 AM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:
>
>> On Sep 17, 2012, at 4:50 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Mon, Sep 17, 2012 at 4:04 PM, James Dennett <jdennett at google.com>wrote:
>>
>>> On Mon, Sep 17, 2012 at 3:04 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com>
>>> wrote:
>>> >
>>> > On Sep 17, 2012, at 2:38 PM, James Dennett <jdennett at googlers.com>
>>> wrote:
>>> >
>>> >> On Mon, Sep 17, 2012 at 2:22 PM, Argyrios Kyrtzidis <
>>> kyrtzidis at apple.com> wrote:
>>> >>> On Sep 17, 2012, at 12:44 PM, James Dennett <jdennett at googlers.com>
>>> wrote:
>>> >>>
>>> >>>> On Mon, Sep 10, 2012 at 7:17 PM, Argyrios Kyrtzidis <
>>> akyrtzi at gmail.com> wrote:
>>> >>>>> Author: akirtzidis
>>> >>>>> Date: Mon Sep 10 21:17:21 2012
>>> >>>>> New Revision: 163588
>>> >>>>>
>>> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=163588&view=rev
>>> >>>>> Log:
>>> >>>>> [libclang] Fix getting a cursor inside an angled #include
>>> directive.
>>> >>>>>
>>> >>>>> Fixed by pointing the end location of the preprocessed entity for
>>> the #include
>>> >>>>> at the closing '>', instead of the start of '<'.
>>> >>>>>
>>> >>>>> rdar://11113134
>>> >>>>>
>>> >>>>> Modified:
>>> >>>>>   cfe/trunk/lib/Lex/PPDirectives.cpp
>>> >>>>>   cfe/trunk/test/Index/c-index-getCursor-pp.c
>>> >>>>>
>>> >>>>> Modified: cfe/trunk/lib/Lex/PPDirectives.cpp
>>> >>>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/PPDirectives.cpp?rev=163588&r1=163587&r2=163588&view=diff
>>> >>>>>
>>> ==============================================================================
>>> >>>>> --- cfe/trunk/lib/Lex/PPDirectives.cpp (original)
>>> >>>>> +++ cfe/trunk/lib/Lex/PPDirectives.cpp Mon Sep 10 21:17:21 2012
>>> >>>>> @@ -1296,6 +1296,9 @@
>>> >>>>>  case tok::string_literal:
>>> >>>>>    Filename = getSpelling(FilenameTok, FilenameBuffer);
>>> >>>>>    End = FilenameTok.getLocation();
>>> >>>>> +    // For an angled include, point the end location at the
>>> closing '>'.
>>> >>>>> +    if (FilenameTok.is <http://filenametok.is/>
>>> (tok::angle_string_literal))
>>> >>>>> +      End = End.getLocWithOffset(Filename.size()-1);
>>> >>>>>    CharEnd = End.getLocWithOffset(Filename.size());
>>> >>>>>    break;
>>> >>>>>
>>> >>>>>
>>> >>>>> Modified: cfe/trunk/test/Index/c-index-getCursor-pp.c
>>> >>>>> URL:
>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Index/c-index-getCursor-pp.c?rev=163588&r1=163587&r2=163588&view=diff
>>> >>>>>
>>> ==============================================================================
>>> >>>>> --- cfe/trunk/test/Index/c-index-getCursor-pp.c (original)
>>> >>>>> +++ cfe/trunk/test/Index/c-index-getCursor-pp.c Mon Sep 10
>>> 21:17:21 2012
>>> >>>>> @@ -15,6 +15,8 @@
>>> >>>>>
>>> >>>>> const char *fname = __FILE__;
>>> >>>>>
>>> >>>>> +#include <a.h>
>>> >>>>> +
>>> >>>>> // RUN: c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck
>>> -check-prefix=CHECK-1 %s
>>> >>>>> // CHECK-1: macro definition=OBSCURE
>>> >>>>> // RUN: c-index-test -cursor-at=%s:2:14 -I%S/Inputs %s | FileCheck
>>> -check-prefix=CHECK-2 %s
>>> >>>>> @@ -31,6 +33,8 @@
>>> >>>>> // CHECK-7: macro expansion=B:12:9
>>> >>>>> // RUN: c-index-test -cursor-at=%s:16:25 -I%S/Inputs %s |
>>> FileCheck -check-prefix=CHECK-8 %s
>>> >>>>> // CHECK-8: macro expansion=__FILE__
>>> >>>>> +// RUN: c-index-test -cursor-at=%s:18:12 -I%S/Inputs %s |
>>> FileCheck -check-prefix=CHECK-9 %s
>>> >>>>> +// CHECK-9: inclusion directive=a.h
>>> >>>>>
>>> >>>>> // Same tests, but with "editing" optimizations
>>> >>>>> // RUN: env CINDEXTEST_EDITING=1 c-index-test -cursor-at=%s:1:11
>>> -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s
>>> >>>>
>>> >>>> This change broke clients that implement
>>> >>>> PPCallbacks::InclusionDirective, which now receive the location of
>>> the
>>> >>>> closing '>' rather than the location of the token that they're
>>> >>>> expecting based on the spec:
>>> >>>> /// \param EndLoc The location of the last token within the
>>> >>>> inclusion
>>> >>>> /// directive.
>>> >>>> The last (preprocessing) token is the whole of <filename>, of type
>>> >>>> tok::angle_string_literal.
>>> >>>>
>>> >>>> Could we consider reverting this change and finding an alternative
>>> way
>>> >>>> to fix libclang without breaking/changing the lower-level interface?
>>> >>>
>>> >>> Hi James,
>>> >>>
>>> >>> This is not a high-level libclang issue.
>>> >>
>>> >> Apologies, I was going by the change description "[libclang] Fix
>>> >> getting a cursor inside an angled #include directive.", and that the
>>> >> only tests I saw included were for libclang.  (That does suggests that
>>> >> we're missing tests for this in the appropriate place.)
>>> >
>>> > You're right, a preprocessing record unit test is also needed.
>>> >
>>> >>
>>> >>> The preprocessing record needs to know the '>' location and depends
>>> on the PPCallbacks::InclusionDirective callback. Previously this callback
>>> did not provide a way to reliably get at that location.
>>> >>
>>> >> I'm not sure about that -- it provided a way to get there, but it was
>>> >> necessary to skip over the filename.
>>> >
>>> > If the include filename was formed via a macro expansion then you had
>>> an EndLoc pointing at '>'. So this was the situation:
>>> >
>>> > No macro expansions involved: EndLoc points at '<'
>>> > With macro expansion: EndLoc points at '>'.
>>> >
>>> > The receiver had no easy way to know which case it was. With the
>>> change it always pointed at '>'.
>>>
>>> Checking which character is at the EndLoc isn't so hard, but I take your
>>> point.
>>>
>>> >> With your change a similar trick
>>> >> is required for code wanting to get to the filename, but only for
>>> >> #include <filename>, not for #include "filename".  (Why are the two
>>> >> cases different?  Currently the callback returns the location of the
>>> >> opening quote for the #include "name" case, and the closing one for
>>> >> #include <name>.  The asymmetry seems strange.)
>>> >
>>> > There's an asymmetry due to how lexing would behave if you pointed at
>>> the beginning of the filename, for example:
>>> >
>>> > #include "filename"
>>> > If you point at the opening quote and start lexing you will get the
>>> tok::string_literal token, the same that the preprocessor saw. Then you can
>>> get at the character range of the whole filename input by checking the
>>> token size.
>>> >
>>> > #include <filename>
>>> > if you point at the opening '<' and start lexing you will get a
>>> tok::less token, you need to get into the special "parsing include
>>> directive" mode to receive a tok::angle_string_literal token.
>>> > So just pointing at '<' will not allow you to easily lex and get the
>>> character range of the filename input.
>>>
>>> I was assuming that you'd lex correctly, which means being in the
>>> right mode to lex an include directive in this case.  I don't think it
>>> matters that if you lex incorrectly that you'll get the wrong results.
>>>
>>> >>> I believe we still need to modify the bahavior of the
>>> InclusionDirective callback and focus the discussion there.
>>> >>>
>>> >>> How about reverting the EndLoc adjustment but have the
>>> InclusionDirective also pass the FilenameTok token so that receivers of the
>>> callback can have more information for the inclusion directive (then the
>>> preprocessing record can do the adjustment itself).
>>> >>
>>> >> That sounds good to me -- though I'm not sure if we need to pass
>>> >> EndLoc if we pass the FilenameTok (possibly for cases where the
>>> >> filename comes from macro expansion? I'm not all that familiar with
>>> >> this code).
>>> >
>>> > Yes, the EndLoc is necessary because if there are macro expansions
>>> involved then the FilenameTok token will be a tok::less for '<'.
>>>
>>> This is going to be "fun" to document thoroughly enough for clients to
>>> be written based on the spec alone -- but then the quirky lexing rules
>>> for this context are what they are, and Clang can't (and shouldn't)
>>> completely hide them.
>>
>>
>> In terms of getting something which is easy to explain, how about we pass
>> FilenameTok, as described above, and we also pass the location of the
>> terminating > or " *character* as EndLoc? Would that cover all the use
>> cases?
>>
>>
>> Not sure about pointing at the character, pointing at '>' is pointing at
>> something that can be lexed raw as a token, while pointing at the ending
>> quote is not.
>>
>
> You can't lex the ending character as a token in the angle_string_literal
> case either. For instance, we accept (with a warning):
>
> #include <iostream>>
>
> If you try to lex the '>' as a token, you'll get '>>'. But I don't see why
> it would be useful to lex from that position, so I don't think this matters.
>
>
> This is what will happen if you want to convert a token range for the
> filename or directive to a character range; Lexer::getLocForEndOfToken will
> do a raw lex, it won't lex in the context of an #include directive.
>
>
>
>> How about a minor change to your suggestion and pass the end location of
>> the directive character range (just after the > or "); this is already
>> calculated as the CharEnd variable.
>>
>
> Yes, I think that's an improvement. I had incorrectly recalled that a
> CharRange included both the start and end character. Since it actually
> excludes its end character (unlike a TokenRange!), this seems like the more
> useful option. On reflection, I think my preferred option would be to ditch
> FilenameTok and pass a CharSourceRange for the included entity to the
> InclusionCallback.
>
>
> Yes, passing a character range will help alleviate this ugliness. I'll
> modify it like this if there are no objections.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120919/f00e6f2a/attachment.html>


More information about the cfe-commits mailing list