PLEASE REVIEW: modularize: new preprocessor conditional directive checking feature

Thompson, John John_Thompson at playstation.sony.com
Mon Jun 17 13:15:54 PDT 2013


Sean,

Thanks again for your help.

Sorry, I overlooked revising the function that was doing some old C-style string handling.  I've addressed that now (in the new ModularizePPDirective::print function).

The only other item I didn't address was the macro substitutions function, which was intentional as I described.  I've looked into this further, but I'd still like to punt on it for now.  Because there isn't such a function already, it appears the only clean way to do it would be to instantiate a new Lexer and Preprocessor object, linking the new preprocessor to the old so it can find the macros (there seems to be a pointer for this).  It would also require adding at least a new Lexer constructor, so it can set up the buffer pointers and correct state.  The constructor for lexing pragmas is close, but it sets the lexer in raw mode, which I don't want.  There's also the problem that it will call the MacroExpanded handler again, and I also worry there might be other side-effects due to the preprocessor and lexer states being changed inadvertently.  Since this is an external tool, I'm kind of thinking that it would be simpler to just use the function I put in for now, and I will look into it some more when I try to figure out how to do the function-style macro expansion, possibly using the existing code in the Preprocessor class.
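For readers following along, here is a rough sketch of the kind of stand-in substitution I mean.  It is not the code from the patch: the name substituteMacros and the map layout are made up for illustration, and it assumes the map holds only object-like macro expansions recorded earlier.  It does a single pass with no rescanning, ignores function-style macros, and does not special-case the operand of "defined".

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: replaces every identifier in Condition that names a
// recorded object-like macro with its recorded expansion text.
std::string substituteMacros(
    const std::string &Condition,
    const std::map<std::string, std::string> &Expansions) {
  std::string Result;
  size_t I = 0;
  while (I < Condition.size()) {
    unsigned char C = static_cast<unsigned char>(Condition[I]);
    if (std::isalpha(C) || C == '_') {
      // Collect a whole identifier before looking it up.
      size_t Start = I;
      while (I < Condition.size() &&
             (std::isalnum(static_cast<unsigned char>(Condition[I])) ||
              Condition[I] == '_'))
        ++I;
      std::string Ident = Condition.substr(Start, I - Start);
      std::map<std::string, std::string>::const_iterator It =
          Expansions.find(Ident);
      if (It != Expansions.end())
        Result += It->second;
      else
        Result += Ident;
    } else if (std::isdigit(C)) {
      // Copy a number with its suffix intact, so the "UL" in "10UL" is
      // never mistaken for an identifier.
      while (I < Condition.size() &&
             (std::isalnum(static_cast<unsigned char>(Condition[I])) ||
              Condition[I] == '_' || Condition[I] == '.'))
        Result += Condition[I++];
    } else {
      // Operators, parentheses, and whitespace pass through unchanged.
      Result += Condition[I++];
    }
  }
  assert(I == Condition.size() && "scanner must consume the whole input");
  return Result;
}
```

With a map containing FOO -> 2, the text "FOO > 1 && BAR" becomes "2 > 1 && BAR"; BAR is left alone because no expansion was recorded for it.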

Regarding breaking the patch up, it might be easier to consider the changes in three parts:


1.       The new classes.

2.       The changes to Modularize.cpp.

3.       The changes to the tests.

I'll provide these in separate patch files, though the associated changes will need to go in at the same time.

The new classes (mod_2013_06_17_new_patch.txt).  Here is a description of the new classes and how they work together:


1.       ModularizePPCallbacks - Derives from the Clang PPCallbacks class to track preprocessor actions, such as changing files and handling preprocessor directives and macro expansions.  It has to figure out when a new header file is entered and left, as the provided handler is not particularly clear about it.  It also stores a map of macro expansions obtained from the MacroExpands callback, for use by a function that effectively preprocesses a conditional.  It handles the top-level aspects of collecting header file instance information, and tracking the preprocessor conditional directives.

2.       ModularizePPDirective - Stores information about one preprocessor directive instance, presently limited to #if, #elif, #ifdef, and #ifndef, since that is all modularize needs for now.  It stores the source file line number, a directive kind code, and both the unpreprocessed and preprocessed conditional source code snippet.

3.       ModularizeHeaderFile - Stores a header file name and a vector of ModularizePPDirective instances collected for that header file.

4.       ModularizeHeaderInstance - Stores a pointer to a ModularizeHeaderFile for a header, and a vector of header file names for the headers from the modularize header list that reference the particular header, either directly or indirectly via some nested include.  When modularize encounters another instance of the header while processing its header list, and the preprocessed directive conditionals stored in the ModularizePPDirective vector are the same as those of an existing ModularizeHeaderFile object, the top-level header name is added to the existing instance, effectively reusing the ModularizeHeaderFile object.  If a header is seen for the first time, or if the preprocessed conditionals for the stored directives don't match those of an instance of the header seen before, a new ModularizeHeaderInstance object is created and saved.

5.       ModularizeHeaderTracker - Tracks the instances of one particular header.  It stores the header name and a vector of ModularizeHeaderInstance objects.  If all instances of a header seen have the same conditionals after preprocessing, there will only be one ModularizeHeaderInstance.  If one or more conditionals were different, there will be two or more instance objects saved.

6.       ModularizeMasterHeaderTracker - Stores a map of all the ModularizeHeaderTracker objects, and provides an "addHeaderFile" function for handling a header file, and a "report" function for outputting the warnings about the preprocessor conditional directive mismatches.
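To make the relationships among these classes easier to picture, here is a simplified, standalone sketch of the data model.  It is not the code in the patch: the shortened type names, the use of std::shared_ptr, the instanceCount helper, and the warning text are all assumptions for illustration, and it has no dependency on Clang.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// One conditional directive: line, kind, and raw/expanded condition text.
struct Directive {
  int Line;
  std::string Kind; // "if", "elif", "ifdef", or "ifndef"
  std::string Unexpanded;
  std::string Expanded;
};

// A header name plus the directives collected during one pass over it.
struct HeaderFile {
  std::string Name;
  std::vector<Directive> Directives;
};

// One distinct preprocessed form of a header, and the top-level headers
// from the header list that produced it.
struct HeaderInstance {
  std::shared_ptr<HeaderFile> File;
  std::vector<std::string> TopLevelHeaders;
};

// All distinct instances seen for one header.
struct HeaderTracker {
  std::vector<HeaderInstance> Instances;
};

// True when two passes saw the same expanded conditionals in the same order.
bool sameConditionals(const HeaderFile &A, const HeaderFile &B) {
  if (A.Directives.size() != B.Directives.size())
    return false;
  for (size_t I = 0; I < A.Directives.size(); ++I)
    if (A.Directives[I].Expanded != B.Directives[I].Expanded)
      return false;
  return true;
}

class MasterHeaderTracker {
  std::map<std::string, HeaderTracker> Trackers;

public:
  // Reuses a matching instance if one exists, otherwise records a new one.
  void addHeaderFile(std::shared_ptr<HeaderFile> File,
                     const std::string &TopLevel) {
    assert(File && "header file must not be null");
    HeaderTracker &T = Trackers[File->Name];
    for (HeaderInstance &Inst : T.Instances) {
      if (sameConditionals(*Inst.File, *File)) {
        Inst.TopLevelHeaders.push_back(TopLevel);
        return;
      }
    }
    HeaderInstance NewInst;
    NewInst.File = File;
    NewInst.TopLevelHeaders.push_back(TopLevel);
    T.Instances.push_back(NewInst);
  }

  size_t instanceCount(const std::string &Name) const {
    auto It = Trackers.find(Name);
    return It == Trackers.end() ? 0 : It->second.Instances.size();
  }

  // Returns one warning line per header that has more than one instance.
  std::string report() const {
    std::ostringstream OS;
    for (const auto &Entry : Trackers) {
      if (Entry.second.Instances.size() < 2)
        continue;
      OS << "warning: " << Entry.first << " has "
         << Entry.second.Instances.size() << " differing instances\n";
    }
    return OS.str();
  }
};
```

The point of the sketch is the matching rule: two passes over the same header collapse into one instance only when every expanded conditional agrees, so a header with more than one instance is exactly the case worth warning about.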

The changes to Modularize.cpp (mod_2013_06_17_modularize_patch.txt):


1.       Add an option for disabling the preprocessing consistency checking.  This is a fallback, in case of problems with the mechanism, or to reduce the volume of warnings.

2.       Set up a ModularizePPCallbacks object for tracking the preprocessor.  This is done in the CollectEntitiesConsumer object.

3.       Set up a ModularizeMasterHeaderTracker for storing the header instance data.  This is done in the CollectEntitiesConsumer object.

4.       Call the ModularizeMasterHeaderTracker::report function to report any warnings about the preprocessor conditional directive mismatches.

5.       Fix some naming convention issues.
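As a rough picture of how the callback side feeds the tracker, here is a standalone sketch that stands in for the real Clang hooks.  The class name ConditionalCollector and its methods are hypothetical, not the PPCallbacks interface; the sketch only assumes that something tells it when a file is entered or left and when a conditional is seen, and that a flag can turn collection off.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the callbacks object.  It keeps a stack of open
// files so each conditional is attributed to the header it appears in.
class ConditionalCollector {
  bool Enabled; // false models the "disable checking" option
  std::vector<std::string> FileStack;
  std::map<std::string, std::vector<std::string> > Conditionals;

public:
  explicit ConditionalCollector(bool Enabled) : Enabled(Enabled) {}

  void enterFile(const std::string &Name) { FileStack.push_back(Name); }

  void exitFile() {
    assert(!FileStack.empty() && "exitFile without matching enterFile");
    FileStack.pop_back();
  }

  // Records the expanded condition text against the innermost open file.
  void conditional(const std::string &ExpandedText) {
    if (!Enabled || FileStack.empty())
      return;
    Conditionals[FileStack.back()].push_back(ExpandedText);
  }

  const std::vector<std::string> &
  conditionalsFor(const std::string &Name) const {
    static const std::vector<std::string> Empty;
    std::map<std::string, std::vector<std::string> >::const_iterator It =
        Conditionals.find(Name);
    return It == Conditionals.end() ? Empty : It->second;
  }
};
```

After a nested include is left, later conditionals go back to the including header, which is the bookkeeping the enter/leave tracking has to get right.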

The changes to the tests (mod_2013_06_17_test_patch.txt):


1.       Some new lines in a couple of files for the new feature.

I'm also including a zip with the changed files.

I'm hoping I can check this in soon, as it makes me nervous to sit on so much, and makes it harder to continue experimenting.  Since this is still an experimental tool, I'm hoping we can improve it in incremental steps.

You mentioned you have other suggestions too.  Please do feel free to send them.

One thing I'm aware of is that the collections are probably leaking the objects' memory.  I can fix that if necessary.
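One possible shape for that fix, assuming C++11 is available to the tool, is to let the collection own its elements.  The names TrackedHeader and HeaderCollection below are made up for the sketch, and the live-object counter exists only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical element type with a counter of live objects.
struct TrackedHeader {
  static int LiveCount;
  std::string Name;
  explicit TrackedHeader(const std::string &N) : Name(N) { ++LiveCount; }
  ~TrackedHeader() { --LiveCount; }
};
int TrackedHeader::LiveCount = 0;

// The collection owns its elements through unique_ptr, so they are deleted
// automatically when the collection is destroyed.
class HeaderCollection {
  std::vector<std::unique_ptr<TrackedHeader> > Headers;

public:
  // Returns a non-owning pointer; ownership stays with the collection.
  TrackedHeader *add(const std::string &Name) {
    assert(!Name.empty() && "header name required");
    Headers.push_back(std::unique_ptr<TrackedHeader>(new TrackedHeader(Name)));
    return Headers.back().get();
  }

  size_t size() const { return Headers.size(); }
};
```

Callers keep using raw pointers for lookups, but nothing outside the collection ever has to call delete.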

Thanks.

-John

From: Sean Silva [mailto:silvas at purdue.edu]
Sent: Wednesday, June 12, 2013 2:55 PM
To: Thompson, John
Cc: cfe-commits at cs.uiuc.edu; John.Thompson.JTSoftware at gmail.com
Subject: Re: PLEASE REVIEW: modularize: new preprocessor conditional directive checking feature


I recommend looking at the code and trying to break it down into a bunch of small, "obvious", incremental changes to the existing code. It's really hard to review such a huge patch.
On Wed, Jun 12, 2013 at 1:41 PM, Thompson, John <John_Thompson at playstation.sony.com> wrote:
Thanks a bunch, Sean.

I've cleaned up the sources a lot per your comments, but I still need to replace the macro substitution function, which will mean I can do away with the symbol storing too.  I couldn't find a function such as I need, i.e. a function in Preprocessor that will either take a string or source range as input and produce either a string or token vector, with macro substitutions done using the current preprocessor state, all without adversely affecting the preprocessor state.  But it's a huge class, so I might be missing it.  Do you or anyone know of such a function?  Otherwise, I'll work on a separate patch to add one to Preprocessor.
Not off the top of my head. You may want to ask on cfe-dev for the best way to approach this.
Also, in addition to the macro substitution, I think I can relatively easily add a similar message to the "header (file) has different contents depending on how it was included" error, relating the error to the differing preprocessor conditional.  Likewise, an option for showing the include hierarchy would help.

Anyway, may I check this in as a working intermediate step, in case folks find it useful?

Did you send the right patch? You don't seem to have addressed many of my review comments. Please revisit my last message and make sure that you have addressed those issues. There are numerous other changes I would like to suggest but the ones I pointed out earlier need to be fixed before the other changes can be meaningfully discussed.


-- Sean Silva
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mod_2013_06_17_modularize_patch.txt
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130617/f3f77f50/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mod_2013_06_17_new_patch.txt
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130617/f3f77f50/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mod_2013_06_17_test_patch.txt
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130617/f3f77f50/attachment-0002.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mod_2013_06_17.zip
Type: application/x-zip-compressed
Size: 27191 bytes
Desc: mod_2013_06_17.zip
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130617/f3f77f50/attachment.bin>