[cfe-dev] Patch for main file ID and stat cache lookup issues when loading AST files

Tom Honermann thonermann at coverity.com
Fri Jan 6 21:25:10 PST 2012

On 1/5/2012 4:43 PM, Argyrios Kyrtzidis wrote:
> On Dec 20, 2011, at 11:10 AM, Tom Honermann wrote:
>> Attached are two patches (svn diff format) that correct two issues
>> encountered when loading AST files generated by Clang with the
>> '-emit-ast' option.
>> Problem 1: Loading an AST file via ASTUnit::LoadFromASTFile() sets
>> ASTUnit::MainFileIsAST to true so that calls to
>> ASTUnit::isMainFileAST() return true.  However,
>> SourceManager::MainFileID does not get restored.  This is addressed
>> by clang-main-file-id.patch
> Addressed in r147612.

Thank you!

>> Problem 2: When the main source file is compiled using a relative
>> path (ie, 'clang -emit-ast file.c'), filesystem stat lookups use
>> the relative path exactly as passed on the command line.  This
>> results in the stat cache populated by ASTWriter via
>> MemorizeStatCalls using the relative path as the key to the stat
>> cache for the main file.  However, the main source file name is
>> also stored in the AST file using an absolute path as the "original
>> file".  When the AST file is later read by ASTReader, the original
>> (absolute) path is used for filesystem stat lookup resulting in a
>> cache miss for the stat cache because the absolute path doesn't
>> match (string comparison) the relative path used as a key in the
>> stat cache.  In this case, the real filesystem is queried which may
>> return stat data that does not reflect the stat values from the
>> time of the compilation if the filesystem stat values have changed.
>> This is addressed by clang-main-file-stat-cache.patch which simply
>> converts input file names to absolute paths before passing them on
>> to FrontendAction::BeginSourceFile().
> Ever since we made the AST reader to check all the files (as you
> noticed) for robustness, the stored stat cache that the ASTWriter
> populates became irrelevant, I'll probably remove it from the AST
> file. This is not blocking you for something, right ?

The change to check all the files is not always desirable.  It makes 
perfect sense when one wants to ensure a PCH file is up to date, but is 
not necessarily desirable for other uses of AST files.  With Clang 2.9, 
one can generate an AST file, modify/delete all the source files used to 
generate the AST file, and still use the AST file for code generation, 
analysis, etc... later on.  That no longer works with 3.0 without first 
disabling validation (sort of - deleted files don't fail the load, but 
modified files do).  For those like myself using 
ASTUnit::LoadFromASTFile(), there are currently no options to disable 
validation without changing the source code.

For my purposes, the stat cache is important so that I can retrieve the 
file size, modification time, etc... used to create the AST file when 
analyzing the AST file on a different machine or after the source files 
have been modified or removed.  In my case, missing or modified source 
files are more of a warning situation, not a hard error as I don't 
always need access to the original source.


More information about the cfe-dev mailing list