[cfe-dev] Incremental parsing/compilation question

Douglas Gregor dgregor at apple.com
Fri Dec 2 07:43:55 PST 2011


On Dec 2, 2011, at 7:02 AM, Vassil Vassilev wrote:

> Hi,
> 
> In cling we have line-by-line input that comes from a terminal-like prompt.
> We do incremental compilation of the input.  The input lines come as
> llvm::MemoryBuffers.  We compile each memory buffer by passing it to clang.
> However, when clang parses a buffer it sees EOF at the end and destroys its
> current lexer and whatnot.
> 
> For example cling can have:
> [cling$] extern "C" int printf(const char* fmt, ...);
> [cling$] int i = 12;
> [cling$] printf("%d\n", i);
> 
> Every line comes in a memory buffer ending in \0.  Clang considers that
> an EOF.  Ideally I want to tell the parser that the parsing of the
> translation unit is not yet done, but that it should be suspended until the
> user's next input.
> 
> Correct me if I am wrong, but the best way of doing that would be to
> implement a 'suspend' token.  When the lexer and parser see that 'suspend'
> token they would stop as if it were an EOF token, but without deleting or
> cleaning up anything, so that the parsing could be restarted later with the
> same state.

'suspend' should probably just be a special handling of the 'eof' token, so that the parser/lexer doesn't tear everything down (but otherwise acts exactly the same). We don't want the parser or preprocessor to have to check 'is this eof or suspend?' every time it currently checks for eof.
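
To illustrate the point above, here is a minimal toy sketch, not Clang code. It assumes a hypothetical `ToyParser` whose only notion of "parsing" is splitting on whitespace; the names `ToyParser`, `Incremental`, `parseBuffer`, and `handleEof` are invented for this example. The scanning loop has a single end-of-input check, and the suspend-versus-finish decision lives only in the eof handler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy parser; none of these names come from Clang.
struct ToyParser {
  std::vector<std::string> Words; // state that should survive across inputs
  bool TornDown = false;          // set once the "translation unit" is finished
  bool Incremental = false;       // assumption: eof suspends instead of finishing

  // Consume one NUL-terminated buffer and return how many words it added.
  std::size_t parseBuffer(const std::string &Buf) {
    assert(!TornDown && "parser already finished the translation unit");
    std::size_t Added = 0;
    const char *P = Buf.c_str(); // c_str() guarantees a trailing '\0'
    for (;;) {
      while (*P == ' ' || *P == '\n')
        ++P;
      if (*P == '\0') { // the only eof check in the scanning loop
        handleEof();
        return Added;
      }
      const char *Start = P;
      while (*P != '\0' && *P != ' ' && *P != '\n')
        ++P;
      Words.emplace_back(Start, P);
      ++Added;
    }
  }

private:
  // The suspend-vs-finish decision is made here and nowhere else.
  void handleEof() {
    if (Incremental)
      return;        // suspend: keep all state for the next buffer
    Words.clear();   // normal eof: tear everything down
    TornDown = true;
  }
};
```

With `Incremental` set, a second call to `parseBuffer` continues with the state left by the first; without it, the first eof clears the state and marks the parser as finished.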

> If this is the right approach, what would be the best way to represent the
> 'suspend' token?  I.e., which ASCII char would trigger the suspension?  (I'd
> really like to have "$", but it is already part of an extension.)


I suggest looking at how the code-completion token is created. It uses \0 + a file offset to distinguish between a \0 at the end of the buffer and an embedded \0 that is the code completion point.
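
The "\0 + file offset" idea can be sketched roughly as follows. This is a standalone toy, not Clang's actual lexer code: `NulKind`, `classifyNul`, and `SpecialOffset` are hypothetical names, and the sketch assumes the special position (a code-completion point, or a would-be suspend point) has been recorded as a byte offset into the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical classification of a '\0' met while scanning a buffer.
enum class NulKind { EndOfBuffer, SpecialPoint, StrayNul };

// Classify the '\0' found at offset Pos in Buf.
// SpecialOffset is the recorded offset of the special marker (an assumption
// of this sketch; a real lexer would keep it in its own bookkeeping).
inline NulKind classifyNul(const std::string &Buf, std::size_t Pos,
                           std::size_t SpecialOffset) {
  assert(Pos <= Buf.size() && Buf.c_str()[Pos] == '\0');
  if (Pos == Buf.size())
    return NulKind::EndOfBuffer;  // the terminator just past the contents
  if (Pos == SpecialOffset)
    return NulKind::SpecialPoint; // embedded '\0' at the recorded offset
  return NulKind::StrayNul;       // any other embedded '\0'
}
```

The character alone carries no meaning here; only its position does, so no otherwise valid source character has to be reserved as a marker.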

	- Doug


