[cfe-dev] Adding more info to builtin source types.
Douglas Gregor
dgregor at apple.com
Mon Jan 11 16:10:42 PST 2010
Hello Enea,
On Jan 11, 2010, at 1:06 PM, Enea Zaffanella wrote:
> Currently, class BuiltinTypeLoc provides a single source location,
> accessible via member getNameLoc(), pointing to the name of the type
> specifier type (e.g., for 'unsigned int' it would return the
> location of 'int'). It is quite common for coders to declare builtin
> integral types using sign or width type specifiers only (e.g.,
> 'unsigned', 'long', 'unsigned short', etc.). In these cases the
> getNameLoc() method would return an invalid source location.
What specific problem do you need to address? Do you just want
getNameLoc() to have a valid location (which points to the sign or
width if there is no actual type named), or do you want fine-grained
information about where the width and sign keywords were actually
located? The former could be done with a simple change where we build
BuiltinTypeLocs (without increasing memory usage), while the latter
requires more work (as you've done in this patch).
> The attached patch augments BuiltinTypeLoc class by adding two
> source locations, one for the sign specifier and the other for the
> width specifier. We have added methods to get/set the three source
> locations.
> We also modified method getNameLoc() so as to return what is
> _likely_ to be the source location of the first type specifier token
> in the source code, assuming that the three specifiers are provided
> in the "canonical" order:
> <sign> <width> <type>
>
> Please let us know if the patch is OK.
I have a few comments on this patch.
My biggest concern is that we're increasing the amount of storage
needed for very common cases, and I'd like us to both try to minimize
the amount of storage needed and to measure the actual cost of this
change for some non-trivial header, to figure out how much extra
memory its using in practice.
To minimize the amount of storage needed, we don't always want
BuiltinTypeLoc to store 3 locations, because only the integral types
actually need to store sign and width information, and that's only
written in the source code occasionally. So, BuiltinTypeLoc should
have a variable-length encoding dependent on the actual builtin type:
types that always only have one source location (void, bool, float,
double, nullptr, id, Class, SEL) should only store that location. For
types that may have more than one location (e.g., the integral types),
the BuiltinTypeLoc could encode which specifiers exist (in some prefix
bytes) and then only have source locations for those keywords that
actually show up in the source.
+ SourceRange getSourceRange() const {
+ SourceLocation loc = getNameLoc();
+ return SourceRange(loc, loc);
+ }
If we're going to keep the source locations for the type/width/sign,
it'd be nice to the source range to cover the entire type. Storing the
specifiers in order (with some tag saying which source location refers
to which kind of keyword) would make this possible.
Of course, depending on how you answered the question at the top of my
e-mail, all of this might be moot: if you just want
BuiltinTypeLoc::getNameLoc() to point to some valid location for a
integral type described only by its width or sign, that's easy to fix
without any performance impact.
- Doug
More information about the cfe-dev
mailing list