[cfe-dev] Code Ranges of Tokens & AST Elements

Wed Aug 25 05:24:12 PDT 2010

Hello Clang Developers,

While working on a project that relies on the source ranges printed by
clang's -ast-print-xml option, I stumbled over the following problem:

A statement

  x = y + SomeVariable;

yields

  x = y + SomeVariable;
  ~~~~~~~~~

as the range for the assignment: it ends at the first character of
SomeVariable instead of at its last.

The problem pertains to literals and VarDeclRefs -- probably all
objects that obtain their ranges directly from tokens.
Erroneous ranges then propagate all the way up to, say, the assignment
operator like the one mentioned above.
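
To make the effect concrete, here is a minimal, self-contained sketch. It
does not use Clang at all: Tok, Range, fromTokenStarts, fromTokenExtents and
underline are hypothetical names invented for this illustration. It only
assumes that a range built from token start offsets ends where its last
token begins, which is what the output above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for illustration; these are not Clang's classes.
struct Tok {
    std::size_t offset;  // offset of the token's first character
    std::size_t length;  // number of characters in the token
};

struct Range {
    std::size_t begin;  // offset of the first covered character
    std::size_t end;    // offset of the last covered character (inclusive)
};

// Both ends are token *start* offsets, as in the range shown above.
Range fromTokenStarts(const Tok& first, const Tok& last) {
    return Range{first.offset, last.offset};
}

// The end is moved to the last character of the last token.
Range fromTokenExtents(const Tok& first, const Tok& last) {
    assert(last.length > 0 && "a token covers at least one character");
    return Range{first.offset, last.offset + last.length - 1};
}

// Renders one '~' per covered character, indented to the range's begin.
std::string underline(const Range& r) {
    assert(r.begin <= r.end);
    return std::string(r.begin, ' ') + std::string(r.end - r.begin + 1, '~');
}
```

For "x = y + SomeVariable;" the token x sits at offset 0 (length 1) and
SomeVariable at offset 8 (length 12). fromTokenStarts then underlines only
"x = y + S", while fromTokenExtents underlines the whole expression.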

In general, the end locations of ranges seem to be a neglected child
in Clang (as they seem to be in most compiler frontends). Quite often,
a SourceRange gets converted to a SourceLocation and vice versa, losing
the end location or generating SourceRanges with no span,
respectively.

1.) In my opinion, the root of all range-evilness in Clang is the constructor
      SourceRange(SourceLocation loc) : B(loc), E(loc) {} (SourceLocation.h)
    which should be removed because locations and ranges are
    conceptually different things.

2.) What about giving Token a ::getRange() method instead of, or in
addition to, ::getLocation()?
While fixing a couple of the cases that matter most to me, a ::getRange()
has proven handy.

3.) Would the correct usage of ranges instead of locations enlarge the
memory footprint unacceptably?
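
As a rough sketch of what points 1 and 2 could look like together, here is a
toy model. Loc, Range, Token and spanning are hypothetical stand-ins, not
Clang's classes, and the half-open [begin, end) convention is my assumption
for this example. The range type has no single-location constructor, and the
token computes its own range from its location and length.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for illustration; these are not Clang's classes.
struct Loc {
    std::uint32_t offset;
};

class Range {
public:
    // Deliberately no Range(Loc) constructor: a range needs both ends.
    Range(Loc begin, Loc end) : B(begin), E(end) {
        assert(begin.offset <= end.offset);
    }
    Loc getBegin() const { return B; }
    Loc getEnd() const { return E; }
    std::uint32_t length() const { return E.offset - B.offset; }

private:
    Loc B;  // first covered character
    Loc E;  // one past the last covered character (assumed convention)
};

class Token {
public:
    Token(Loc loc, std::uint32_t length) : Location(loc), Length(length) {}
    Loc getLocation() const { return Location; }
    std::uint32_t getLength() const { return Length; }

    // The accessor proposed in point 2, derived from location and length.
    Range getRange() const {
        return Range(Location, Loc{Location.offset + Length});
    }

private:
    Loc Location;
    std::uint32_t Length;
};

// Range of a node spanning several tokens: the end comes from the last
// token's range, not from its start location.
Range spanning(const Token& first, const Token& last) {
    return Range(first.getRange().getBegin(), last.getRange().getEnd());
}

// A location can no longer silently turn into a zero-width range.
static_assert(!std::is_convertible<Loc, Range>::value,
              "Loc must not convert implicitly to Range");

// Relevant to point 3: in this toy model a range costs two locations
// (holds on common ABIs; not a measurement of Clang itself).
static_assert(sizeof(Range) == 2 * sizeof(Loc),
              "a range is exactly two locations here");
```

With x at offset 0 (length 1) and SomeVariable at offset 8 (length 12),
spanning(x, SomeVariable) has length 20 and covers the whole expression
"x = y + SomeVariable". The second static_assert only says that, in this
model, every location replaced by a range doubles in size. How often that
would happen in Clang's AST is exactly the open question.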

I think code ranges need a thorough overhaul, which involves touching
lots of source code. Are there any plans to get that fixed? What
priority does it have?

Greetings,
Phil