<div dir="ltr">Hans, Richard, and I spent some more time discussing this today, and we came to the conclusion that this could absolutely be built partially with existing modules functionality. In this case, by "module" I'm not referring to a chunk of serialized AST; I'm just referring to the in-memory data structures that clang uses to control name lookup.<div><br></div><div>The idea is that each .cpp file can be its own module, and all headers would be part of a global module. Each .cpp file is only allowed to look up names in its own module and in the global module. My understanding is that this is where -fmodules-local-submodule-visibility comes into play, although I'm not clear on the details. This symbol hiding is the first part of what jumbo needs, and it's actually implemented similarly to the way it was done in the JumboSupport patch on GitHub. It's basically filtering out declarations that aren't supposed to be visible during name lookup.<div><br></div><div>The second part is avoiding name mangling collisions. It seemed pretty simple to us to extend both name manglers to include a unique module id in the names of all internal linkage symbols, so 'static int f() { return 42; }' becomes _ZL1fv.1 (with .1, .2, etc. appended per module). c++filt already knows how to demangle those, so that will just work. This wouldn't break any existing users because, after all, these symbols have internal linkage: their names shouldn't matter as long as they look nice in the debugger.</div><div><br></div><div>The last thing is to make it so that all included headers not listed in the jumbo file (or perhaps on the command line) are in one global module. We weren't able to find a way to express this today with module maps, but I don't think it would be too hard to do.</div></div><div><br></div><div>---</div><div><br></div><div>We also discussed how we could, in the long run, get the compile-time benefits of jumbo builds without the semantic changes. The basic idea is that every "modular header", i.e. 
a header that can successfully parse by itself with only command line macros defined, could be its own module. Again, we're not talking about AST serialization, just changing name lookup rules. It's just a module for name lookup purposes. In order for this to work, all code needs to follow very strict include-what-you-use rules: transitive includes wouldn't be visible from indirect users of a header. Obviously, we are not in this world today, but it's one we could work towards.</div><div><br></div><div>Once the codebase follows IWYU, then it shouldn't matter (barring bugs, of which I'm sure there will be many) what the jumbo factor is. Ignoring resource exhaustion, a build that succeeds with a jumbo factor of 50 should also succeed with a jumbo factor of 1. Devs can work locally with jumbo and not worry about forgetting includes that they happen to get transitively.</div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Apr 10, 2018 at 5:12 AM Mostyn Bramley-Moore via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><b style="font-weight:normal" id="m_7290354121272024733gmail-docs-internal-guid-a1e0adc9-af70-ec92-1330-18e90ab6309d"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Hi,</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">I am a member of a small group of Chromium developers who are working on adding a unity 
build[1] setup to Chromium[2], in order to reduce the project's long and ever-increasing compile times. We're calling these "jumbo" builds, because this term is not as overloaded as "unity".</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">We're slowly making progress, but find that a lot of our time is spent renaming things in anonymous namespaces- it would be much simpler if it was possible to automatically treat these as if they were file-local. Jens Widell has put together a proof-of-concept which appears to work reasonably well, it consists of a clang plugin and a small clang patch:</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><a href="https://github.com/jensl/llvm-project-20170507/tree/wip/jumbo-support/v1" target="_blank">https://github.com/jensl/llvm-project-20170507/tree/wip/jumbo-support/v1</a></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><a href="https://github.com/jensl/llvm-project-20170507/commit/a00d5ce3f20bf1c7a41145be8b7a3a478df9935f" target="_blank">https://github.com/jensl/llvm-project-20170507/commit/a00d5ce3f20bf1c7a41145be8b7a3a478df9935f</a></span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">After building clang and the plugin, you generate jumbo source files that look like:</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">jumbo_source_1.cc:</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">#pragma jumbo</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">#include "real_source_file_1.cc"</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">#include "real_source_file_2.cc"</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">#include "real_source_file_3.cc"</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Then, you compile something like this:</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">clang++ -c jumbo_source_1.cc -Xclang -load -Xclang lib/JumboSupport.so -Xclang -add-plugin -Xclang jumbo-support</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">The plugin gives unique names[3] to the anonymous namespaces without otherwise changing their semantics, and also #undef's macros defined in each top-level source file before processing the next top-level source file. That way header files can still define macros that are used in multiple source files in the jumbo translation unit. 
Collisions between macros defined in header files and names used in other headers and other source files are still possible, but less likely.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">To show how much these two changes help, here's a patch to make Chromium's network code build in jumbo mode:</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><a href="https://chromium-review.googlesource.com/c/chromium/src/+/966523" target="_blank">https://chromium-review.googlesource.com/c/chromium/src/+/966523</a> (+352/-377 lines)</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">And here's the corresponding patch using the proof-of-concept JumboSupport plugin:</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap"><a href="https://chromium-review.googlesource.com/c/chromium/src/+/962062" target="_blank">https://chromium-review.googlesource.com/c/chromium/src/+/962062</a> (+53/-52 lines)</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">It seems clear that the version using the JumboSupport plugin would require less effort to create, review and merge into the codebase. We have a few other feature ideas, but these two changes seem to do most of the work for us.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">So now we're trying to figure out the best way forward- would a feature like this be welcome to the Clang project? And if so, how would you recommend that we go about it? We would prefer to do this in a way that does not require a locally patched Clang and could live with building a custom plugin, although implementing this entirely in Clang would be even better.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">Thanks,</span></p><div><b style="font-weight:normal"><br></b></div><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">-Mostyn.</span></p><br><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span 
style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">[1] If you're not familiar with unity builds, the idea is to compile multiple source files per compiler invocation, reducing the overhead of processing header files (which can be surprisingly high). We do this by taking a list of the source files in a target and generating "jumbo" source files that #include multiple "real" source files, and then we feed these jumbo files to the compiler one at a time. This way, we don't prevent the usage of valuable build tools like ccache and icecc that only support a single source file on the command line.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">[2] Daniel Bratell has a summary of our progress jumbo-ifying the Chromium codebase here:</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><a href="https://docs.google.com/document/d/19jGsZxh7DX8jkAKbL1nYBa5rcByUL2EeidnYsoXfsYQ/edit#" style="text-decoration:none" target="_blank"><span style="font-size:11pt;font-family:Arial;color:rgb(17,85,204);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:underline;vertical-align:baseline;white-space:pre-wrap">https://docs.google.com/document/d/19jGsZxh7DX8jkAKbL1nYBa5rcByUL2EeidnYsoXfsYQ/edit#</span></a></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre-wrap">[3] The JumboSupport 
plugin assigns names to the anonymous namespaces in a given file: foo::(anonymous namespace)::bar is replaced with a symbol name of the form foo::__anonymous_&lt;number&gt;::bar where &lt;number&gt; is unique to the file within the jumbo translation unit. Due to the internal linkage of these symbols, &lt;number&gt; does not need to be unique across multiple object files/jumbo source files.</span></p></b><br></div>-- <br><div class="m_7290354121272024733gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr">Mostyn Bramley-Moore<div><div>Vewd Software</div><div><a href="mailto:mostynb@opera.com" target="_blank">mostynb@vewd.com</a></div></div></div></div></div></div></div></div></div></div>
</div>
</blockquote></div>
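<div><br></div><div>To make the mangling idea above a bit more concrete, here is a rough sketch of the suffix scheme. This is not Clang code: ModuleSuffixer, IdFor and Uniquify are hypothetical names made up for illustration, and the real change would live inside the name manglers. It only shows that giving each source file in a jumbo translation unit its own id, and appending that id to already-mangled internal linkage names, is enough to keep two copies of 'static int f()' apart.</div>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper (an assumption for illustration, not a Clang API):
// hands out one small integer id per source file in a jumbo translation
// unit and appends it to mangled names of internal-linkage symbols.
class ModuleSuffixer {
 public:
  // Returns the id for `source_file`, allocating 1, 2, ... on first use.
  int IdFor(const std::string& source_file) {
    auto it = ids_.find(source_file);
    if (it != ids_.end()) return it->second;
    int id = static_cast<int>(ids_.size()) + 1;
    ids_.emplace(source_file, id);
    return id;
  }

  // Appends ".<id>" to an already-mangled internal-linkage name,
  // e.g. "_ZL1fv" from the first source file becomes "_ZL1fv.1".
  std::string Uniquify(const std::string& mangled,
                       const std::string& source_file) {
    assert(!mangled.empty());
    return mangled + "." + std::to_string(IdFor(source_file));
  }

 private:
  std::map<std::string, int> ids_;  // source file -> module id
};
```

<div>With this, the same 'static int f()' defined in two different .cpp files of one jumbo file ends up as _ZL1fv.1 and _ZL1fv.2, while every internal linkage symbol from the same .cpp file shares that file's suffix. The ids only need to be unique within one jumbo translation unit, since the symbols never leave the object file.</div>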