[cfe-dev] Patch to allow comment translators implementation

Abramo Bagnara abramobagnara at tin.it
Tue Dec 29 14:51:01 PST 2009


On 29/12/2009 21:08, Chris Lattner wrote:
> 
> On Dec 26, 2009, at 7:58 AM, Abramo Bagnara wrote:
> 
>>
>> This small patch changes the comment handler in a simple way so that
>> comment translators can be implemented quite easily.
>>
>> With this patch applied, a CommentHandler is allowed to build a first
>> token to be returned to the Lexer and to push a TokenStream for the others,
>> thereby allowing a generic comment -> tokens transformer.
>>
>> This can be useful for transforming comment-shaped program annotations
>> that should be translated to source code, as well as for other applications.
> 
> This is an interesting approach.  The only major concern I have is
> that this only allows you to translate comments into exactly one
> token.  In the case of openmp pragmas (for example) this doesn't seem
> rich enough.  A different approach would be to allow the handler to
> push an arbitrary number of tokens into the parser's lookahead
> buffer.  Would this work for what you're trying to do?

Yes, but perhaps this is not needed: as I wrote, the CommentHandler can
return a first token *and* produce the remaining tokens, pushing them
onto the lexer stack with EnterTokenStream.
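
To make the "first token plus pushed stream" idea concrete, here is a toy
model that does not use clang at all. ToyToken, ToyPreprocessor,
handleComment and lexComment are hypothetical names invented for this
sketch; only the shape of the contract (return the first token, queue the
rest) follows the description above.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for clang::Token: only the spelling is kept.
using ToyToken = std::string;

// Hypothetical stand-in for the preprocessor's lexer stack.
class ToyPreprocessor {
public:
  // Models EnterTokenStream: queued tokens are returned before any others.
  void enterTokenStream(const std::vector<ToyToken> &toks) {
    pending_.insert(pending_.begin(), toks.begin(), toks.end());
  }
  bool hasPending() const { return !pending_.empty(); }
  ToyToken nextPending() {
    assert(!pending_.empty());
    ToyToken t = pending_.front();
    pending_.pop_front();
    return t;
  }

private:
  std::deque<ToyToken> pending_;
};

// Models the handler contract: on success, `first` receives the first token
// of the comment and the remaining tokens are pushed as a stream.
bool handleComment(ToyPreprocessor &pp, ToyToken &first,
                   const std::vector<ToyToken> &commentTokens) {
  if (commentTokens.empty())
    return false; // nothing to translate, the comment is skipped as usual
  first = commentTokens.front();
  std::vector<ToyToken> rest(commentTokens.begin() + 1, commentTokens.end());
  if (!rest.empty())
    pp.enterTokenStream(rest);
  return true;
}

// Collects the tokens a parser would see for one translated comment.
std::vector<ToyToken> lexComment(const std::vector<ToyToken> &commentTokens) {
  ToyPreprocessor pp;
  std::vector<ToyToken> seen;
  ToyToken first;
  if (!handleComment(pp, first, commentTokens))
    return seen;
  seen.push_back(first);
  while (pp.hasPending())
    seen.push_back(pp.nextPending());
  return seen;
}
```

In this model the parser sees every token of the comment in order, so a
handler is not limited to producing exactly one token.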

I've already tried this successfully in a sample implementation that
simply lexes the comment content without modifying it:

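bool Comment_Converter::HandleComment(clang::Preprocessor &PP,
                                      clang::Token& token,
                                      clang::SourceRange Comment) {
  const clang::SourceManager &sm = PP.getSourceManager();
  const clang::LangOptions &lo = PP.getLangOptions();
  clang::SourceLocation begin = Comment.getBegin();
  clang::FileID fid = sm.getFileID(begin);
  const char* start = sm.getCharacterData(begin);
  const char* end = sm.getCharacterData(Comment.getEnd());
  // Skip the comment delimiters: "/*" and "*/", or just the leading "//".
  if (start[1] == '*')
    end -= 2;
  start += 2;
  // Temporarily NUL-terminate the comment body so the raw lexer stops there.
  char saved = *end;
  *const_cast<char*>(end) = 0;
  clang::Lexer lexer(sm.getLocForStartOfFile(fid), lo,
                     sm.getBufferData(fid).first,
                     start, end);
  clang::Token tok;
  // The first token is handed back to the Lexer through `token`.
  lexer.LexFromRawLexer(tok);
  if (tok.is(clang::tok::eof)) {
    // Empty comment: restore the overwritten character before bailing out.
    *const_cast<char*>(end) = saved;
    return false;
  }
  token = tok;
  // Static storage: EnterTokenStream is called below with OwnsTokens == false,
  // so the token array must outlive this call.
  static std::vector<clang::Token> tokens;
  tokens.clear();
  while (true) {
    lexer.LexFromRawLexer(tok);
    if (tok.is(clang::tok::eof))
      break;
    // The raw lexer leaves keywords as plain identifiers; look them up to
    // get the real token kind.
    if (tok.is(clang::tok::identifier))
      tok.setKind(PP.LookUpIdentifierInfo(tok)->getTokenID());
    tokens.push_back(tok);
  }
  // Restore the character that was overwritten with NUL.
  *const_cast<char*>(end) = saved;
  // Push the remaining tokens so they are lexed right after the first one.
  if (!tokens.empty())
    PP.EnterTokenStream(tokens.data(), tokens.size(), false, false);
  return true;
}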
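
The delimiter arithmetic at the top of the handler (advance past the
opening two characters, and drop the closing two for a block comment) can
be checked in isolation. The helper below is a hypothetical sketch, not
part of the patch: commentBody is an invented name, and it works on a
std::string copy instead of the source buffer, so nothing is overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between the comment delimiters.
// Expects the whole comment, starting with "//" or "/*".
std::string commentBody(const std::string &comment) {
  assert(comment.size() >= 2 && comment[0] == '/');
  const bool isBlock = comment[1] == '*';
  // Drop the two opening characters ("//" or "/*").
  std::size_t len = comment.size() - 2;
  // A block comment also ends with "*/", which is dropped as well.
  if (isBlock && len >= 2)
    len -= 2;
  return comment.substr(2, len);
}
```

For a line comment only the leading "//" is removed, which matches the
sample above, where `end` is moved back only when `start[1] == '*'`.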


