[cfe-dev] Preprocessed loc/token retrieval dream (almost) come true
Abramo Bagnara
abramo.bagnara at gmail.com
Fri Sep 30 02:09:47 PDT 2011
Ping and direct questions below.
On 24/09/2011 17:15, Abramo Bagnara wrote:
>
> Clang has always lacked a way to reconstruct the preprocessed
> token stream from a given location (without redoing the full preprocessing).
>
> Thanks to recent changes from Chandler and Argyrios I'm now able to get
> the next parsed token location in a reliable way.
>
> I attach the code I currently use, both for review and to check whether
> there is interest in having these helpers in the clang library (IMHO this
> service is *very* useful and currently badly approximated in HTMLRewrite.cpp).
Is there interest in having in the clang library methods that, given a
starting location, return the locations of all the following tokens in
preprocessing order? This would make it possible to check whether *all* the
locations in a specific range satisfy a given property, to recover the
missing locations, to scan the exact preprocessed sequence of type/storage
specifiers, etc.
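As an illustration of the first use case, here is a minimal sketch of a range check built on top of a "next location" function. The helper name all_locs_satisfy and its parameters are hypothetical and not part of the attached code; `next` stands in for something like parser_loc_get_pp_next and is assumed to return an invalid sentinel when no further token exists. The location type is a template parameter so the sketch compiles without clang.

```cpp
#include <cassert>

// Hypothetical helper (not in the attached code): walks locations in
// preprocessing order from `first` through `last` (inclusive) and reports
// whether every visited location satisfies `pred`.
// `next` is assumed to behave like parser_loc_get_pp_next and to return
// `invalid` when there is no following token.
template <typename Loc, typename Next, typename Pred>
bool all_locs_satisfy(Loc first, Loc last, Loc invalid, Next next, Pred pred) {
  for (Loc cur = first; cur != invalid; cur = next(cur)) {
    if (!pred(cur))
      return false;  // found a location that violates the property
    if (cur == last)
      return true;   // reached the end of the range without a violation
  }
  return false;      // ran out of tokens before reaching `last`
}
```

With clang, Loc would presumably be clang::SourceLocation and `invalid` a default-constructed SourceLocation().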
> Using the code also reveals some likely bugs in how clang stores locations, namely:
>
> - the SLocEntry for a macro arg expansion has an extra token at the end, and
> this is not taken into consideration when computing isInFileID (a
> workaround for that is in the attached code)
Is this intended, or should it be considered a bug?
> - the immediate expansion range of stringified tokens encloses only '#' and
> not '# arg' (as a result, the helper gets confused there)
Is this intended, or should it be considered a bug?
> - the immediate expansion range of concatenated tokens encloses only '##' and
> not 'x ## y' (as a result, the helper gets confused there)
Is this intended, or should it be considered a bug?
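For concreteness, a minimal input that exercises the stringification and concatenation cases might look like the following. The macro and variable names are hypothetical, chosen only for illustration. The question is whether the immediate expansion range reported for the resulting tokens should cover the whole '# arg' and 'x ## y' sequences in the macro bodies, rather than just the operator.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical macros, not taken from the attached code.
#define STR(arg) #arg      // stringification: '#' applied to 'arg'
#define CAT(x, y) x ## y   // concatenation: '##' joining 'x' and 'y'

int CAT(foo, bar) = 42;              // preprocesses to: int foobar = 42;
const char* const text = STR(a + b); // preprocesses to: "a + b"
```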
>
> The code does not yet take into account file changes due to
> #include, but I think this is a minor point and probably fixable.
>
> To do its work, parser_loc_get_pp_next needs a reverse map to be loaded,
> so that it knows which tokens are expansion points (i.e. a SourceLocation for
> each macro SLocEntry).
>
> typedef llvm::DenseMap<unsigned, clang::SrcMgr::SLocEntry> Exp_Map;
>
> Exp_Map exp_map;
>
> void load_exp_map() {
>   using namespace clang;
>   SourceManager& sm = get_source_manager();
>   int i, last = sm.local_sloc_entry_size();
>   for (i = 0; i < last; ++i) {
>     FileID fid;
>     // This method is private.
>     // fid = FileID::get(i);
>     // Ugly dirty trick is needed
>     *reinterpret_cast<int*>(&fid) = i;
>     SrcMgr::SLocEntry entry = sm.getSLocEntry(fid);
>     if (!entry.isExpansion())
>       continue;
>     SourceLocation from = entry.getExpansion().getExpansionLocStart();
>     exp_map[from.getRawEncoding()] = entry;
>   }
> }
>
> clang::SourceLocation parser_loc_get_pp_next(clang::SourceLocation cur) {
>   using namespace clang;
>   SourceManager& sm = get_source_manager();
>   const clang::LangOptions& lo = get_lang_options();
>   assert(exp_map.find(cur.getRawEncoding()) == exp_map.end());
>   SourceLocation next;
>   while (1) {
>     std::pair<FileID, unsigned> cur_info = sm.getDecomposedLoc(cur);
>     SourceLocation scur = sm.getSpellingLoc(cur);
>     std::pair<FileID, unsigned> scur_info = sm.getDecomposedLoc(scur);
>     bool invalid = false;
>     StringRef buf = sm.getBufferData(scur_info.first, &invalid);
>     if (invalid)
>       return SourceLocation();
>     const char* point = buf.data() + scur_info.second;
>     Lexer lexer(sm.getLocForStartOfFile(scur_info.first), lo,
>                 buf.begin(), point, buf.end());
>     Token tok;
>     // The first call re-lexes the current token, the second yields the next.
>     lexer.LexFromRawLexer(tok);
>     lexer.LexFromRawLexer(tok);
>     if (tok.is(tok::eof)) {
>       if (!cur.isMacroID())
>         return SourceLocation();
>     }
>     else {
>       SourceLocation snext = tok.getLocation();
>       unsigned dist = sm.getFileOffset(snext) - scur_info.second;
>       // Dirty trick to apply offset to macro loc
>       next = SourceLocation::getFromRawEncoding(cur.getRawEncoding() + dist);
>       // The following conditional is needed only to workaround a
>       // likely bug in SourceManager::isInFileID when called with macro arg
>       // expansions.
>       if (sm.isMacroArgExpansion(cur)) {
>         // Dirty trick to apply offset to macro loc
>         if (sm.isInFileID(
>                 SourceLocation::getFromRawEncoding(cur.getRawEncoding() +
>                                                    dist + 1),
>                 cur_info.first))
>           break;
>       }
>       else {
>         if (sm.isInFileID(next, cur_info.first))
>           break;
>       }
>     }
>     cur = sm.getImmediateExpansionRange(cur).second;
>   }
>   while (1) {
>     Exp_Map::iterator i = exp_map.find(next.getRawEncoding());
>     if (i == exp_map.end())
>       break;
>     SrcMgr::SLocEntry entry = i->second;
>     // This method is private.
>     // next = SourceLocation::getMacroLoc(entry.getOffset());
>     // Ugly dirty trick is needed
>     next = SourceLocation::getFromRawEncoding(entry.getOffset() | (1U << 31));
>     assert(next.isMacroID());
>   }
>   return next;
> }
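The two "dirty tricks" above rely on the raw encoding of a location: the code assumes that the top bit marks a macro location and that adding a distance to the raw value moves within the same entry. A toy model of that assumption, independent of clang (ToyLoc and kMacroBit are hypothetical names, not clang's API), might look like this:

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; this is not clang's SourceLocation.
// Assumption mirrored from the code above: bit 31 flags a macro location.
const std::uint32_t kMacroBit = 1U << 31;

struct ToyLoc {
  std::uint32_t raw;

  // Build a location directly from its raw encoding.
  static ToyLoc fromRaw(std::uint32_t r) { return ToyLoc{r}; }
  // Build a macro location from an entry offset by setting the top bit.
  static ToyLoc macroLoc(std::uint32_t offset) {
    return ToyLoc{offset | kMacroBit};
  }

  bool isMacroID() const { return (raw & kMacroBit) != 0; }
  std::uint32_t offset() const { return raw & ~kMacroBit; }
  // Adding a small distance to the raw value keeps the macro bit intact.
  ToyLoc advanced(std::uint32_t dist) const { return ToyLoc{raw + dist}; }
};
```

Public, non-private ways to build such locations would make these tricks unnecessary.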