[cfe-dev] Reading contents of Comments or any other Token

Larry Olson loarabia at hotmail.com
Wed Dec 21 20:54:41 PST 2011


Thank you very much. 
I looked at the implementation of getDecomposedLoc to make sure I understood what I was getting and then made changes to my sample based on what I saw. It seems to work beautifully and simplifies the code.
(It's funny -- I think I passed through that very code earlier today while tracking getOffset but overlooked its impact.)
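
For anyone following the thread, here is a minimal sketch of the Buffer+offset idea. It is not the actual revised sample, and it does not link against clang: DecomposedLoc and textForRange are hypothetical stand-ins invented for illustration. In real code the pairs would presumably come from SourceManager::getDecomposedLoc() on rng.getBegin() and rng.getEnd(), and the buffer pointer from getBuffer(FID)->getBufferStart(). The sketch also assumes the range end sits one past the last character of the comment, as the size computation in the quoted prototype implies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the (FileID, offset) pair that
// SourceManager::getDecomposedLoc() returns; this is not a clang type.
using DecomposedLoc = std::pair<int, unsigned>;

// Copy the text between two decomposed locations out of a file's buffer.
// `end` is assumed to sit one past the last character of the range.
std::string textForRange(const char *bufferStart, std::size_t bufferSize,
                         DecomposedLoc begin, DecomposedLoc end) {
    // Offsets are only comparable when both ends are in the same file.
    assert(begin.first == end.first);
    // Reject ranges that are reversed or that run past the buffer.
    if (begin.second > end.second || end.second > bufferSize)
        return std::string();
    // The offsets index directly into the buffer; no raw encodings needed.
    return std::string(bufferStart + begin.second, bufferStart + end.second);
}
```

With offsets relative to the start of the file's buffer, the two pointer-adjustment loops and the calloc/free pair in the prototype are no longer needed.
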
Did anything else in my usage of clang seem off?
My thanks again,
Larry Olson

________________________________
> Subject: Re: [cfe-dev] Reading contents of Comments or any other Token 
> From: kyrtzidis at apple.com 
> Date: Wed, 21 Dec 2011 17:37:05 -0800 
> CC: cfe-dev at cs.uiuc.edu 
> To: loarabia at hotmail.com 
>  
> On Dec 21, 2011, at 3:57 PM, Larry Olson wrote: 
>  
> Hi all, 
>  
> I love that clang and llvm make it possible for folks like me to create  
> tools that have a deeper understanding of C, C++, and ObjC. For  
> instance, I used clang to write a tool that helped me normalize headers  
> when porting code across platforms. Clang and llvm were easy to get  
> into and well documented. Thanks for the work you all do. 
>  
> I have a question for you though. 
>  
> Brief Summary of the rest of the email: 
> 2 Questions 
> 1. Is there an intended way of getting the contents of a token from  
> just the SourceRange? 
> 2. Should I be asking the SourceManager for a Buffer for a given FileID  
> and then adjusting pointers into that buffer based on the Offset of that  
> FileID? 
>  
>  
> More details 
> I was considering what it would take to build a doxygen like tool using  
> Clang and found the CommentHandler object and its virtual method,  
> HandleComment(Preprocessor, SourceRange). However, I was a bit stuck  
> when I tried to go from the SourceRange to the actual contents of the  
> comment. 
>  
> I looked at what PPCallbacks and ASTConsumers offer but didn't see  
> anything that would lead me to believe I should've expected more data  
> from the CommentHandler. I looked at old code I had written that used  
> the Rewriter but that didn't feel right because I don't want to Rewrite  
> the comment, I want to parse its contents. So then I started looking  
> around frontend actions that spit out data or modify the guts of a  
> buffer. Eventually I stumbled across the HTMLPrintAction and its  
> corresponding HTMLPrinter. 
>  
> Inside of HTMLPrinter I noticed the AddLineNumbers method which was  
> performing manipulations based on a raw MemoryBuffer which looked about  
> right. It eventually led me to this prototype code: 
>  
> /// Write out the entire comment based on the source range. 
> bool IndentingCommentHandler::HandleComment(Preprocessor &pp, SourceRange rng) 
> { 
>      FileID FID = pp.getSourceManager().getMainFileID(); 
>      const llvm::MemoryBuffer *MB = pp.getSourceManager().getBuffer(FID); 
>  
>      int size = rng.getEnd().getRawEncoding() - rng.getBegin().getRawEncoding(); 
>      char *Buff = (char *)calloc(size+1, sizeof(char)); 
>  
>      const char *itBeg = MB->getBufferStart(); 
>      const char *itEnd = MB->getBufferStart(); 
>  
>      unsigned int offset = pp.getSourceManager().getLocForStartOfFile(FID).getRawEncoding(); 
>  
>      // Adjust pointers to account for the FileID's offset in the Source manager. 
>      itBeg -= offset; 
>      itEnd -= offset; 
>  
>      // Adjust pointers relative to where the comment actually begins and ends 
>      for( int i = 0; i < rng.getBegin().getRawEncoding(); i++) 
>      { 
>          ++itBeg; 
>          ++itEnd; 
>      } 
>  
>      for( int i = 0; i < size; ++i) 
>      { 
>          ++itEnd; 
>      } 
>  
>      std::copy(itBeg, itEnd, Buff); 
>      std::cout << "=============================" << std::endl; 
>      std::cout << Buff << std::endl; 
>      free(Buff); 
>      return false; 
> } 
>  
>  
> This seems to work for a few test cases I've tried it against but also  
> felt a bit verbose. I'm wondering, did I do something stupid? Did I  
> overlook a better or more proper way of doing this? 
>  
> You should generally avoid using SourceLocation::getRawEncoding(); it  
> is only useful as opaque data, so do not use it for offset info. 
> Check out SourceManager::getDecomposedLoc(SourceLocation); this returns  
> a FileID/offset pair, so you can arrive at a Buffer+offset for a  
> SourceLocation. 
>  
>  
> Many thanks for any help and guidance, 
> Larry Olson 
> (https://github.com/loarabia) 
>  
> _______________________________________________ 
> cfe-dev mailing list 
> cfe-dev at cs.uiuc.edu 
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev 
>  


