[cfe-dev] RFC: Preprocessor option to assist with parsing a single file only

Argyrios Kyrtzidis via cfe-dev cfe-dev at lists.llvm.org
Wed Jun 14 18:25:49 PDT 2017

Hey all,

In r305044 I introduced a preprocessor option (bool SingleFileParseMode) and clang-c/Index.h enumerator (CXTranslationUnit_SingleFileParse) to assist with ‘parsing a single file only’. I’m going to provide some details and context on why such parsing is useful and why a new option is necessary.

Parsing a single file (essentially parse it normally but without including any other headers) is useful as a way to determine the global symbols that exist in the source files, in an inaccurate but ‘lightning-super-fast’ mode. For example, if the source is like this:

@implementation Foo
-(void)testSomething {}
-(NSString*)returnIt { return @"blah"; }
@end

The parser can determine that there is an ObjC @implementation named ‘Foo’ with 2 methods, -testSomething, and -returnIt. Even if no SDK header gets included and ‘NSString’ becomes unresolved, the parser can still provide the associated global symbols.

In general terms, think of this like approximating the inaccurate parsing that something like SublimeText is doing, where there’s no preprocessor or precise typechecking but it can still provide you with a list of symbols and some rudimentary jump-to-definition.

We’ve used this for a while now in Xcode to do something like ‘fast-scanning’ specifically for ObjC unit tests (*). This allows us to show the available unit tests almost immediately once you open a project, without waiting for the full-accurate indexing to complete.
If the ‘fast-scan’ is missing something, e.g. due to preprocessor directives or macros, it will still show up once the accurate indexing catches up.

To clarify, this has been working without any modifications to clang: we were just using libclang to parse the file containing the unit tests and did not pass any search paths, which had the practical effect of not including headers. So why add the option now?

This is due to the limitation of the ‘fast scan’ not seeing symbols inside preprocessor conditional blocks whose condition evaluates to false. For example, with code like this:

#if ENABLE_FOO_TESTS
@implementation Foo
-(void)testSomething {}
@end
#endif

‘ENABLE_FOO_TESTS’ is not defined so the preprocessor skips this block and we miss getting these tests via the ‘fast scan’. Here’s what I’d like to propose:

If ‘SingleFileParseMode’ is true, the preprocessor will treat undefined identifiers in preprocessor directives specially. If a directive makes use of an undefined identifier, the preprocessor will ignore the directive and parse all of its blocks (the #if block as well as the #else one).
If the directive is using literals like:

#if 0

#if 1

or making use of defined macros, then there’s no change in behavior.

With such a change, in this ‘fast-scan-inaccurate-mode’ we’ll be able to gather the symbols that exist in preprocessor directives like the ‘#if ENABLE_FOO_TESTS’ example.
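
To make the proposed rule concrete, here is a minimal sketch in Python of a toy conditional-block filter. It is not clang's implementation and not part of r305044; the function name kept_lines, its parameters, and the SOURCE sample are hypothetical and exist only for illustration. It handles only ‘#if <literal-or-identifier>’, ‘#else’ and ‘#endif’, and it assumes the directives are balanced.

```python
import re

# Sample input mirroring the '#if ENABLE_FOO_TESTS' example (illustrative only).
SOURCE = """\
#if ENABLE_FOO_TESTS
@implementation Foo
-(void)testSomething {}
@end
#endif
"""


def kept_lines(source, defined, single_file_parse):
    """Return the non-directive lines a toy preprocessor would keep.

    defined maps macro names to integer values (hypothetical stand-in for
    the preprocessor's macro table). Assumes balanced #if/#else/#endif.
    """
    # One entry per open #if:
    #   True  -> current branch is active
    #   False -> current branch is skipped
    #   None  -> condition used an undefined identifier in single-file
    #            mode, so every branch of this directive is kept
    stack = []
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        match = re.match(r"#\s*if\s+(\w+)$", stripped)
        if match:
            token = match.group(1)
            if token.isdigit():
                # Literal condition such as '#if 0' or '#if 1': unchanged.
                stack.append(int(token) != 0)
            elif token in defined:
                # Defined macro: unchanged, use its value.
                stack.append(defined[token] != 0)
            elif single_file_parse:
                # Undefined identifier in single-file mode: keep all branches.
                stack.append(None)
            else:
                # Normal preprocessing: an undefined identifier evaluates to 0.
                stack.append(False)
        elif re.match(r"#\s*else\b", stripped):
            if stack[-1] is not None:
                stack[-1] = not stack[-1]
        elif re.match(r"#\s*endif\b", stripped):
            stack.pop()
        elif stripped and all(state is not False for state in stack):
            out.append(stripped)
    return out


if __name__ == "__main__":
    print(kept_lines(SOURCE, {}, single_file_parse=False))
    print(kept_lines(SOURCE, {}, single_file_parse=True))
```

In this model, the first call drops the whole block because ENABLE_FOO_TESTS is undefined, while the second call keeps the @implementation and its test method. Passing a definition such as {"ENABLE_FOO_TESTS": 0} skips the block in both modes, matching the ‘no change of behavior for defined macros’ part of the proposal.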

Let me know what you think!

(*) Dealing only with detection of ObjC unit tests has a restricted scope, and unmodified clang was well equipped to help with it. If we want to extend ‘fast/inaccurate’ parsing and try to gather such symbol info from all files, clang would need to be enhanced to improve its error recovery and not drop valuable information from its AST when there are compiler errors. But this is a discussion for another thread at some later point.