[cfe-commits] Patch for review: Parenthesis range in CXXConstructExprs

Richard Smith richard at metafoo.co.uk
Tue Jul 10 22:23:32 PDT 2012

On Tue, Jul 10, 2012 at 10:05 PM, Daniel Jasper <djasper at google.com> wrote:

> On Wed, Jul 11, 2012 at 12:33 AM, Richard Smith <richard at metafoo.co.uk>wrote:
>> On Tue, Jul 10, 2012 at 3:07 PM, Daniel Jasper <djasper at google.com>wrote:
>>> I don't think this is fully what I am trying to fix with this patch.
>>> This patch addresses the inconsistency that the CXXConstructExpr contains
>>> the correct ParenRange for Stmts like:
>>> X value(7);
>>> But an incorrect range if the constructor is called like a function,
>>> e.g.:
>>> X(7);
>> Right. My point is we don't have rules for what a correct range would be,
>> and we need such rules before we can say whether your patch is right. My
>> proposed rule means that the source range is wrong in both cases, and
>> shouldn't include the parens (and that we should drop the ParenRange from
>> CXXConstructExpr entirely). The fix for that is completely different from
>> what you're proposing :-)
>> As for printing, I don't know whether the SourceRange of the
>>> CXXConstructExpr should include the parentheses. However, for refactoring,
>>> it is definitely highly beneficial to know it (and in fact I think we'll
>>> have to go and add it to other nodes as well). So, as CXXConstructExpr
>>> already contains a member for that, we should definitely populate it as
>>> best we can.
>> That presumes that the parens are somehow logically part of the
>> CXXConstructExpr, which I think is really the relevant question here.
>> CXXConstructExpr is used in various cases where the parens are either part
>> of some other construct, or where there are no parens, so I don't think it
>> makes much sense to include the parens in the source range. I would imagine
>> that for refactoring, what's really desired is a consistent and rational
>> set of rules for what the source range of an expression means, which, as my
>> examples demonstrate, we're *really really far* from having.
>> Consider these two cases:
>> X value(7);
>> int value(7);
>> In the first case, the source range for the initializer covers 'value'
>> and the parentheses. For the second case, it covers only the '7'. The right
>> fix for the second case seems naturally to be that we should store the
>> source locations of the parentheses on the VarDecl. And that would remove
>> any need for storing them on the CXXConstructExpr in the first case.
>> Likewise in these expressions:
>> X(7)
>> int(7)
>> Here, the CXXFunctionalCastExpr already contains the source locations of
>> the parentheses. In the 'int' case, we don't store them anywhere else (and
>> nor do we need to). In the 'X' case, we include the 'X' and the '(' in the
>> source range of the CXXConstructExpr. That is inconsistent and unnecessary.
> Why do we not need the parenthesis for int(7)? Because there are no
> diagnostics that can trigger on a range including them?

Because they can be derived from the CXXFunctionalCastExpr (also, there's
no other AST node where they could reasonably go).

> I agree with everything you say, there should be a consistent way to
> determine source locations. However, we need pretty much all the source
> locations (this includes locations for each c/v qualifier, all parenthesis,
> possibly braces, ...) in order to provide proper refactoring tools. I
> gather that some of the inconsistencies have occurred because we are trying
> to save as much space as possible and thus only add locations that are
> really necessary for diagnostics. So, I see two challenges:
> 1) Make the current state more consistent. For this we might need to set
> up some basic rules and then try to fix what can be fixed.
> 2) Provide a way to access the currently unavailable source locations. For
> this, is see two approaches:
>   - Include way more source locations in the current AST, possibly guarded
> by a flag to preserve performance if not needed.
>   - Provide means to re-parse the required part of the source code.
> Thoughts?

I completely agree.

I think the biggest problem for (2) is that the AST assumes that the
location of the next token after a SourceLocation can be determined, and
currently there's no easy way to do that. I think that it's possible to
extract that information by poking through the SLocEntries in the
SourceManager, without storing any extra data and with only minimal
re-lexing (and no macro expansion), but if so, we should package that
functionality up and make it generally available.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120710/7e0c72ee/attachment.html>

More information about the cfe-commits mailing list