[cfe-commits] r163588 - in /cfe/trunk: lib/Lex/PPDirectives.cpp test/Index/c-index-getCursor-pp.c

James Dennett jdennett at google.com
Mon Sep 17 16:04:16 PDT 2012


On Mon, Sep 17, 2012 at 3:04 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
>
> On Sep 17, 2012, at 2:38 PM, James Dennett <jdennett at googlers.com> wrote:
>
>> On Mon, Sep 17, 2012 at 2:22 PM, Argyrios Kyrtzidis <kyrtzidis at apple.com> wrote:
>>> On Sep 17, 2012, at 12:44 PM, James Dennett <jdennett at googlers.com> wrote:
>>>
>>>> On Mon, Sep 10, 2012 at 7:17 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>>> Author: akirtzidis
>>>>> Date: Mon Sep 10 21:17:21 2012
>>>>> New Revision: 163588
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=163588&view=rev
>>>>> Log:
>>>>> [libclang] Fix getting a cursor inside an angled #include directive.
>>>>>
>>>>> Fixed by pointing the end location of the preprocessed entity for the #include
>>>>> at the closing '>', instead of the start of '<'.
>>>>>
>>>>> rdar://11113134
>>>>>
>>>>> Modified:
>>>>>   cfe/trunk/lib/Lex/PPDirectives.cpp
>>>>>   cfe/trunk/test/Index/c-index-getCursor-pp.c
>>>>>
>>>>> Modified: cfe/trunk/lib/Lex/PPDirectives.cpp
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/PPDirectives.cpp?rev=163588&r1=163587&r2=163588&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/lib/Lex/PPDirectives.cpp (original)
>>>>> +++ cfe/trunk/lib/Lex/PPDirectives.cpp Mon Sep 10 21:17:21 2012
>>>>> @@ -1296,6 +1296,9 @@
>>>>>  case tok::string_literal:
>>>>>    Filename = getSpelling(FilenameTok, FilenameBuffer);
>>>>>    End = FilenameTok.getLocation();
>>>>> +    // For an angled include, point the end location at the closing '>'.
>>>>> +    if (FilenameTok.is(tok::angle_string_literal))
>>>>> +      End = End.getLocWithOffset(Filename.size()-1);
>>>>>    CharEnd = End.getLocWithOffset(Filename.size());
>>>>>    break;
>>>>>
>>>>>
>>>>> Modified: cfe/trunk/test/Index/c-index-getCursor-pp.c
>>>>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Index/c-index-getCursor-pp.c?rev=163588&r1=163587&r2=163588&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/Index/c-index-getCursor-pp.c (original)
>>>>> +++ cfe/trunk/test/Index/c-index-getCursor-pp.c Mon Sep 10 21:17:21 2012
>>>>> @@ -15,6 +15,8 @@
>>>>>
>>>>> const char *fname = __FILE__;
>>>>>
>>>>> +#include <a.h>
>>>>> +
>>>>> // RUN: c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s
>>>>> // CHECK-1: macro definition=OBSCURE
>>>>> // RUN: c-index-test -cursor-at=%s:2:14 -I%S/Inputs %s | FileCheck -check-prefix=CHECK-2 %s
>>>>> @@ -31,6 +33,8 @@
>>>>> // CHECK-7: macro expansion=B:12:9
>>>>> // RUN: c-index-test -cursor-at=%s:16:25 -I%S/Inputs %s | FileCheck -check-prefix=CHECK-8 %s
>>>>> // CHECK-8: macro expansion=__FILE__
>>>>> +// RUN: c-index-test -cursor-at=%s:18:12 -I%S/Inputs %s | FileCheck -check-prefix=CHECK-9 %s
>>>>> +// CHECK-9: inclusion directive=a.h
>>>>>
>>>>> // Same tests, but with "editing" optimizations
>>>>> // RUN: env CINDEXTEST_EDITING=1 c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s
>>>>
>>>> This change broke clients that implement
>>>> PPCallbacks::InclusionDirective, which now receive the location of the
>>>> closing '>' rather than the location of the token that they're
>>>> expecting based on the spec:
>>>> /// \param EndLoc The location of the last token within the
>>>> inclusion
>>>> /// directive.
>>>> The last (preprocessing) token is the whole of <filename>, of type
>>>> tok::angle_string_literal.
>>>>
>>>> Could we consider reverting this change and finding an alternative way
>>>> to fix libclang without breaking/changing the lower-level interface?
>>>
>>> Hi James,
>>>
>>> This is not a high-level libclang issue.
>>
>> Apologies, I was going by the change description "[libclang] Fix
>> getting a cursor inside an angled #include directive.", and that the
>> only tests I saw included were for libclang.  (That does suggests that
>> we're missing tests for this in the appropriate place.)
>
> You're right, a preprocessing record unit test is also needed.
>
>>
>>> The preprocessing record needs to know the '>' location and depends on the PPCallbacks::InclusionDirective callback. Previously this callback did not provide a way to reliably get at that location.
>>
>> I'm not sure about that -- it provided a way to get there, but it was
>> necessary to skip over the filename.
>
> If the include filename was formed via a macro expansion then you had an EndLoc pointing at '>'. So this was the situation:
>
> No macro expansions involved: EndLoc points at '<'
> With macro expansion: EndLoc points at '>'.
>
> The receiver had no easy way to know which case it was. With the change it always pointed at '>'.

Checking which character is at the EndLoc isn't so hard, but I take your point.

>> With your change a similar trick
>> is required for code wanting to get to the filename, but only for
>> #include <filename>, not for #include "filename".  (Why are the two
>> cases different?  Currently the callback returns the location of the
>> opening quote for the #include "name" case, and the closing one for
>> #include <name>.  The asymmetry seems strange.)
>
> There's an asymmetry due to how lexing would behave if you pointed at the beginning of the filename, for example:
>
> #include "filename"
> If you point at the opening quote and start lexing you will get the tok::string_literal token, the same that the preprocessor saw. Then you can get at the character range of the whole filename input by checking the token size.
>
> #include <filename>
> if you point at the opening '<' and start lexing you will get a tok::less token, you need to get into the special "parsing include directive" mode to receive a tok::angle_string_literal token.
> So just pointing at '<' will not allow you to easily lex and get the character range of the filename input.

I was assuming that you'd lex correctly, which means being in the
right mode to lex an include directive in this case.  I don't think it
matters that if you lex incorrectly that you'll get the wrong results.

>>> I believe we still need to modify the bahavior of the InclusionDirective callback and focus the discussion there.
>>>
>>> How about reverting the EndLoc adjustment but have the InclusionDirective also pass the FilenameTok token so that receivers of the callback can have more information for the inclusion directive (then the preprocessing record can do the adjustment itself).
>>
>> That sounds good to me -- though I'm not sure if we need to pass
>> EndLoc if we pass the FilenameTok (possibly for cases where the
>> filename comes from macro expansion? I'm not all that familiar with
>> this code).
>
> Yes, the EndLoc is necessary because if there are macro expansions involved then the FilenameTok token will be a tok::less for '<'.

This is going to be "fun" to document thoroughly enough for clients to
be written based on the spec alone -- but then the quirky lexing rules
for this context are what they are, and Clang can't (and shouldn't)
completely hide them.

-- James




More information about the cfe-commits mailing list