<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 26, 2015, at 11:58 PM, Manuel Klimek <<a href="mailto:klimek@google.com" class="">klimek@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">I'd be curious to learn more about why you need to have it in one file. Invalidation is something build systems usually take care of, and for those it seems mostly the same whether they invalidate one or two files (they already need to invalidate a multitude of files, like .o files, module files, linked libraries and binaries, and generated code).</div></div></blockquote><div><br class=""></div><div>With one exception that I know of, build systems don’t handle module files at all: Clang handles them all internally via file-level locking. Clang has logic to invalidate and rebuild module files when they go out of date, clean up stale module files that might be lying around in the module cache, etc. For an external tool to have to duplicate that work (managing another set of files that mirror the module files, with their own invalidation/cleanup rules) would be a buggy mess. 
Hence, putting that information directly into the module file lets Clang handle the invalidation logic, and the tool just gets a scratch pad in there to store its per-module results.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="">I could see an argument for deployment, but for deployment of libraries you'll need the module and its headers anyway.</div><div class="">Given that we already have a way to embed sources into the modules, I can see that you might be able to build full deployable bundles, where a module file is the only thing you need, and it includes the headers, potentially .o files to link in, and other information (like the one you cite).</div></div></div></blockquote><div><br class=""></div><div>I’m not motivated by the deployment aspects, although you’re right that one could probably use module file extensions to help with some of them.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="">So, all that said, I'm mainly curious whether that's what you're aiming for :)</div></div></div></blockquote><div><br class=""></div><div>My primary motivation is that I need another kind of index stored within the module file. It naturally fits <i class="">in</i> the module file, because it cross-references the specific declarations stored within that module file (e.g., by their local declaration ID number) for fast lookup from an external tool. 
It’s completely natural to rebuild the index along with building the module file, since it’s indexing that module specifically.</div><div><br class=""></div><span class="Apple-tab-span" style="white-space:pre">	</span>- Doug</div><div><br class=""><blockquote type="cite" class=""><div class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Oct 27, 2015 at 5:17 AM Douglas Gregor via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">Hi all,<div class=""><br class=""></div><div class="">Modules provide a useful place to cache the results of parsing a library’s headers for later, efficient consumption. Clang does this for all of the information it gathers during parsing, including the ASTs, preprocessor state, lookup tables, comments, and so on, with the intent to minimize the amount of deserialization that occurs when a particular module is being used.</div><div class=""><br class=""></div><div class="">I’d like to extend this capability so that tools built on top of Clang can store their own information in the module file on disk. This mechanism would be used when there is information that can be computed at module-build time that would be expensive to recompute in every user of the module. Examples include function summaries for the static analyzer and application-specific indexes that would otherwise require deserializing the entire module file to recompute. 
One could compute this information and put it into a separate file or database, but given that the information is naturally tied to modules—and generally needs to be invalidated at the same time the module file itself needs to be rebuilt—it makes more architectural sense to put that information directly in the module file when it is built, while the full AST is still efficiently accessible in memory.</div><div class=""><br class=""></div><div class="">A <i class="">module file extension</i> is a bit of custom logic that can piggy-back data into a module file. Each module file extension is described by a unique block name identifying the extension, as well as other metadata (major/minor version, user information string) about the extension itself. Each module file extension gets to write into its own separate extension block in the resulting module file, separate from the rest of the module file contents and from other extensions. To do that, it provides both a writer (that writes bitstream records into the output file) and a reader (that can read back those bitstream records).</div><div class=""><br class=""></div><div class="">I’ve attached an implementation of module file extensions. It sketches out the interface to a module file extension (see below, or check out ModuleFileExtension.h for the full interface) and implements a module file extension for testing purposes so we can illustrate the round-tripping of data through the module file format, matching of extension blocks written to extension blocks read, and so on. 
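<br class=""><br class="">As a rough illustration of the matching described above, here is a minimal, self-contained C++ sketch. It does not use Clang's API: ToyMetadata, ToyBlock, ToyExtension, NumbersExtension, writeModule, and readModule are hypothetical names invented for this example, and a plain vector of integers stands in for bitstream records. It only models the idea that each extension writes into its own block and that blocks are matched back to a reader by block name.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; none of these are Clang types.
struct ToyMetadata {
  std::string BlockName;
  unsigned MajorVersion;
  unsigned MinorVersion;
  std::string UserInfo;
};

// One extension block: a metadata header plus the records that follow it.
struct ToyBlock {
  ToyMetadata Metadata;
  std::vector<unsigned> Records;
};

// Hypothetical extension interface: metadata, a write side, a read side.
class ToyExtension {
public:
  virtual ~ToyExtension() = default;
  virtual ToyMetadata getMetadata() const = 0;
  virtual std::vector<unsigned> writeContents() const = 0;
  // Returns false to decline a block, e.g. on a major-version mismatch.
  virtual bool readContents(const ToyBlock &Block) = 0;
};

// Example extension that stores a list of numbers in its block.
class NumbersExtension : public ToyExtension {
  std::vector<unsigned> ToWrite;

public:
  std::vector<unsigned> Loaded; // filled in by readContents()

  explicit NumbersExtension(std::vector<unsigned> Values)
      : ToWrite(std::move(Values)) {}

  ToyMetadata getMetadata() const override {
    return {"toy.numbers", 1, 0, "demo"};
  }
  std::vector<unsigned> writeContents() const override { return ToWrite; }
  bool readContents(const ToyBlock &Block) override {
    if (Block.Metadata.MajorVersion != getMetadata().MajorVersion)
      return false;
    Loaded = Block.Records;
    return true;
  }
};

// "Build" a module file: every extension writes into its own block.
std::vector<ToyBlock> writeModule(const std::vector<ToyExtension *> &Exts) {
  std::vector<ToyBlock> Blocks;
  for (const ToyExtension *Ext : Exts)
    Blocks.push_back({Ext->getMetadata(), Ext->writeContents()});
  return Blocks;
}

// "Load" a module file: hand each block to the extension whose block name
// matches; blocks with no matching extension are skipped. Returns the
// number of blocks that some extension accepted.
unsigned readModule(const std::vector<ToyBlock> &Blocks,
                    const std::vector<ToyExtension *> &Exts) {
  unsigned Accepted = 0;
  for (const ToyBlock &Block : Blocks)
    for (ToyExtension *Ext : Exts)
      if (Ext->getMetadata().BlockName == Block.Metadata.BlockName) {
        if (Ext->readContents(Block))
          ++Accepted;
        break;
      }
  return Accepted;
}
```

In this model, a block written by one tool is simply skipped by a reader that has no extension registered under that block name, and an extension can decline a block whose major version it does not understand.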
</div><div class=""><br class=""></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px" class=""><div class=""><div class=""><font face="Courier New" class="">/// Metadata for a module file extension.</font></div><div class=""><font face="Courier New" class="">struct ModuleFileExtensionMetadata {</font></div><div class=""><font face="Courier New" class=""> /// The name used to identify this particular extension block within</font></div><div class=""><font face="Courier New" class=""> /// the resulting module file. It should be unique to the particular</font></div><div class=""><font face="Courier New" class=""> /// extension, because this name will be used to match the name of</font></div><div class=""><font face="Courier New" class=""> /// an extension block to the appropriate reader.</font></div><div class=""><font face="Courier New" class=""> std::string BlockName;</font></div><div class=""><font face="Courier New" class=""><br class=""></font></div><div class=""><font face="Courier New" class=""> /// The major version of the extension data.</font></div><div class=""><font face="Courier New" class=""> unsigned MajorVersion;</font></div><div class=""><font face="Courier New" class=""><br class=""></font></div><div class=""><font face="Courier New" class=""> /// The minor version of the extension data.</font></div><div class=""><font face="Courier New" class=""> unsigned MinorVersion;</font></div><div class=""><font face="Courier New" class=""><br class=""></font></div><div class=""><font face="Courier New" class=""> /// A string containing additional user information that will be</font></div><div class=""><font face="Courier New" class=""> /// stored with the metadata.</font></div><div class=""><font face="Courier New" class=""> std::string UserInfo;</font></div><div class=""><font face="Courier New" class="">};</font></div><div class=""><font face="Courier New" class=""><br class=""></font></div><div class=""><font face="Courier New" 
class="">/// An abstract superclass that describes a custom extension to the</font></div></div><div class=""><div class=""><font face="Courier New" class="">/// module/precompiled header file format.</font></div></div><div class=""><div class=""><font face="Courier New" class="">///</font></div></div><div class=""><div class=""><font face="Courier New" class="">/// A module file extension can introduce additional information into</font></div></div><div class=""><div class=""><font face="Courier New" class="">/// compiled module files (.pcm) and precompiled headers (.pch) via a</font></div></div><div class=""><div class=""><font face="Courier New" class="">/// custom writer that can then be accessed via a custom reader when</font></div></div><div class=""><div class=""><font face="Courier New" class="">/// the module file or precompiled header is loaded.</font></div></div><div class=""><div class=""><font face="Courier New" class="">class ModuleFileExtension : public llvm::RefCountedBase<ModuleFileExtension> {</font></div></div><div class=""><div class=""><font face="Courier New" class="">public:</font></div></div><div class=""><div class=""><font face="Courier New" class=""> virtual ~ModuleFileExtension();</font></div></div><div class=""><div class=""><font face="Courier New" class=""><br class=""></font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// Retrieves the metadata for this module file extension.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> virtual ModuleFileExtensionMetadata getExtensionMetadata() const = 0;</font></div></div><div class=""><div class=""><font face="Courier New" class=""><br class=""></font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// Hash information about the presence of this extension into the</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// module hash code.</font></div></div><div class=""><div 
class=""><font face="Courier New" class=""> ///</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// The module hash code is used to distinguish different variants</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// of a module that are incompatible. If the presence, absence, or</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// version of the module file extension should force the creation</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// of a separate set of module files, override this method to</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// combine that distinguishing information into the module hash</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// code.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> ///</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// The default implementation of this function simply returns the</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// hash code as given, so the presence/absence of this extension</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// does not distinguish module files.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> virtual llvm::hash_code hashExtension(llvm::hash_code Code) const;</font></div></div><div class=""><div class=""><font face="Courier New" class=""><br class=""></font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// Create a new module file extension writer, which will be</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// responsible for writing the extension contents into a particular</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// module 
file.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> virtual std::unique_ptr<ModuleFileExtensionWriter></font></div></div><div class=""><div class=""><font face="Courier New" class=""> createExtensionWriter() = 0;</font></div></div><div class=""><div class=""><font face="Courier New" class=""><br class=""></font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// Create a new module file extension reader, given the</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// metadata read from the block and the cursor into the extension</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// block.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> ///</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// May return null to indicate that an extension block with the</font></div></div><div class=""><div class=""><font face="Courier New" class=""> /// given metadata cannot be read.</font></div></div><div class=""><div class=""><font face="Courier New" class=""> virtual std::unique_ptr<ModuleFileExtensionReader></font></div></div><div class=""><div class=""><font face="Courier New" class=""> createExtensionReader(const ModuleFileExtensionMetadata &Metadata,</font></div></div><div class=""><div class=""><font face="Courier New" class=""> ASTReader &Reader, serialization::ModuleFile &Mod,</font></div></div><div class=""><div class=""><font face="Courier New" class=""> const llvm::BitstreamCursor &Stream) = 0;</font></div></div><div class=""><div class=""><font face="Courier New" class="">};</font></div></div><div class=""><font face="Courier New" class=""><br class=""></font></div><div class=""><font face="Courier New" class=""><div class="">/// Abstract base class that writes a module file extension block into</div><div class="">/// a module file.</div><div class="">class ModuleFileExtensionWriter 
{</div><div class=""> ModuleFileExtension *Extension;</div><div class=""><br class=""></div><div class="">protected:</div><div class=""> ModuleFileExtensionWriter(ModuleFileExtension *Extension)</div><div class=""> : Extension(Extension) { }</div><div class=""><br class=""></div><div class="">public:</div><div class=""> virtual ~ModuleFileExtensionWriter();</div><div class=""><br class=""></div><div class=""> /// Retrieve the module file extension with which this writer is</div><div class=""> /// associated.</div><div class=""> ModuleFileExtension *getExtension() const { return Extension; }</div><div class=""><br class=""></div><div class=""> /// Write the contents of the extension block into the given bitstream.</div><div class=""> ///</div><div class=""> /// Responsible for writing the contents of the extension into the</div><div class=""> /// given stream. All of the contents should be written into custom</div><div class=""> /// records with IDs >= FIRST_EXTENSION_RECORD_ID.</div><div class=""> virtual void writeExtensionContents(llvm::BitstreamWriter &Stream) = 0;</div><div class="">};</div><div class=""><br class=""></div><div class="">/// Abstract base class that reads a module file extension block from</div><div class="">/// a module file.</div><div class="">///</div><div class="">/// Subclasses </div><div class="">class ModuleFileExtensionReader {</div><div class=""> ModuleFileExtension *Extension;</div><div class=""><br class=""></div><div class="">protected:</div><div class=""> ModuleFileExtensionReader(ModuleFileExtension *Extension)</div><div class=""> : Extension(Extension) { }</div><div class=""><br class=""></div><div class="">public:</div><div class=""> /// Retrieve the module file extension with which this reader is</div><div class=""> /// associated.</div><div class=""> ModuleFileExtension *getExtension() const { return Extension; }</div><div class=""><br class=""></div><div class=""> virtual ~ModuleFileExtensionReader();</div><div 
class="">};</div></font></div></blockquote><div class=""><br class=""></div><div class="">I suspect that the Reader and Writer interfaces will grow somewhat as we get more clients, but this is a start.</div><div class=""><br class=""></div><div class="">Thoughts?</div><div class=""><br class=""></div><div class=""><span style="white-space:pre-wrap" class=""> </span>- Doug</div><div class=""><br class=""></div><div class=""></div></div><div style="word-wrap:break-word" class=""></div>_______________________________________________<br class="">
cfe-dev mailing list<br class="">
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a><br class="">
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br class="">
</blockquote></div>
</div></blockquote></div><br class=""></body></html>