<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 21, 2016, at 12:13 PM, Gábor Horváth <<a href="mailto:xazax.hun@gmail.com" class="">xazax.hun@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi!<br class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On 20 October 2016 at 18:12, Mehdi Amini via cfe-dev <span dir="ltr" class=""><<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br class="">
> On Oct 20, 2016, at 2:23 AM, Ilya Palachev via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:<br class="">
><br class="">
> Hi,<br class="">
><br class="">
> It seems that compressing AST files with simple "gzip --fast" makes them 30-40% smaller.<br class="">
> So the questions are:<br class="">
> 1. Is the current AST serialization format really uncompressed (only abbreviations in the bitstream format)?<br class="">
> 2. Is it worthwhile to compress AST by default (with -emit-ast)?<br class="">
> 3. Will this break things like PCH?<br class="">
> 4. What's the current trade-off between PCH compile time and disk usage? If AST compression makes compilation a bit slower, but reduces the disk usage significantly, will this be appropriate for users or not?<br class="">
<br class="">
</span>Is there a need for this disk usage? If the main use of AST files is C++ modules / PCH, what is a typical size for a module cache directory?<br class="">
(Compression is expensive)<br class=""></blockquote><div class=""><br class=""><br class=""></div><div class="">In some cases compression can actually improve performance: when I/O is the bottleneck, reading less data from disk and decompressing it quickly can benefit overall performance. </div></div></div></div></div></div></blockquote><div><br class=""></div><div>Are you speculating or do you have numbers on the actual AST writer/reader?</div><div><br class=""></div><div>Also, a good starting point would be to consider not storing the AST as a “blob” in the bitcode and using proper abbreviations instead.</div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""></div><div class="">In case someone wants to do a whole-project analysis on merged ASTs, this compression can be a very significant saving. Dumping all of the LLVM and Clang TUs to disk occupies about 45 GB of disk space at the moment.<br class=""></div></div></div></div></div></div></blockquote><div><br class=""></div><div>Sure, adding a compression layer on top for this particular application seems interesting, but you don’t need to have it on *by default* to support your use case. </div><div>Making it always-on would require, as a starting point, a close look at the impact on memory/time, for example when including modules.</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><div class="gmail_extra"><div class="gmail_quote"><div class=""><br class=""> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br class="">
—<br class="">
Mehdi<br class="">
<div class="HOEnZb"><div class="h5"><br class="">
<br class="">
><br class="">
> LLVM already has support for compression (functions compress/uncompress in include/llvm/Support/<wbr class="">Compression.h).<br class="">
><br class="">
> Best regards,<br class="">
> Ilya Palachev<br class="">
> ______________________________<wbr class="">_________________<br class="">
> cfe-dev mailing list<br class="">
> <a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><br class="">
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr class="">mailman/listinfo/cfe-dev</a><br class="">
<br class="">
</div></div></blockquote></div><br class=""></div></div></div>
</div></blockquote></div><br class=""></body></html>