[llvm] r212816 - raw_svector_ostream: grow and reserve atomically

Fri Jul 11 13:40:34 PDT 2014

On 11/07/2014 22:28, Sean Silva wrote:
>
>
>
> On Fri, Jul 11, 2014 at 12:31 PM, Alp Toker <alp at nuanti.com> wrote:
>
>
>     On 11/07/2014 19:45, Chandler Carruth wrote:
>
>
>         On Fri, Jul 11, 2014 at 7:02 AM, Alp Toker <alp at nuanti.com> wrote:
>
>             Author: alp
>             Date: Fri Jul 11 09:02:04 2014
>             New Revision: 212816
>
>             URL: http://llvm.org/viewvc/llvm-project?rev=212816&view=rev
>             Log:
>             raw_svector_ostream: grow and reserve atomically
>
>             Including the scratch buffer size in the initial reservation
>             eliminates the subsequent malloc+move operation and offers a
>             healthier constant growth with less memory wastage.
>
>
>         What benchmarks did you run to measure the memory waste, and
>         what were the numbers? I think it is really important to
>         provide these kinds of details with potentially
>         performance-impacting changes like this.
>
>
>     I've been getting some good timings with a synthetic clang
>     syntax-only diagnostics engine stress test (10 runs, Before:
>     0m3.317s After: 0m3.094s). I can't be sure this is a win for every
>     workload so I've boiled the commit down into a cleanup in r212837.
>
>
> FWIW, I think that's a fairly poor benchmark, even among synthetic 
> benchmarks. The diagnostics stuff is output for the human user,

We're collecting diagnostics across a large, changing source tree in 
real time, so the timing infrastructure I have here is all based around 
that.

> and so there is a low bar past which all speed improvements are 
> unnecessary: the input rate of the human brain, and I think we are 
> already well past that (I like to call this sort of code "human speed" 
> code).

That sounds like a different use case.

> I'm not sure that we really do much string manipulation in the 
> "machine speed" part of the code (i.e. stuff that doesn't have the 
> human brain's input speed as a limit past which there are basically no 
> returns) so I can't suggest a good workload. The only thing I can 
> immediately think of is name mangling in clang.

I agree macro-optimisations are more important. One big time-saver we're 
looking at is collecting all diagnostics (similar to -Weverything) and 
performing mapping after compilation based on interactive queries, 
rather than having to rebuild.

>
> -- Sean Silva
>
>
>
>     I'll post my thoughts shortly so we can discuss ideas to resolve
>     that old FIXME:
>
>     raw_svector_ostream::~raw_svector_ostream() {
>       // FIXME: Prevent resizing during this flush().
>       flush();
>     }
>
>     My secondary interest here is to cook up a seekable
>     raw_svector_ostream so we can run more of the compilation in
>     memory / via API without file descriptors, where raw_fd_ostream is
>     currently required. Beware, patches with even more new ostream
>     subclasses are getting proposed shortly ;-)
>
>     Alp.
>
>
>     -- 
>     http://www.nuanti.com
>     the browser experts
>
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
>     http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
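
For readers following the thread, the reservation idea in the r212816 log message quoted earlier can be illustrated with a minimal sketch. It is not the committed LLVM code: std::vector<char> stands in for the stream's backing SmallVector, and the helper names (GrowthStats, grow_then_reserve, reserve_once) are hypothetical. It only contrasts growing for pending data and then reserving scratch space separately against doing both in a single reservation.

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: std::vector<char> stands in for the stream's backing
// SmallVector, and these helper names are hypothetical, not LLVM API.
struct GrowthStats {
  std::size_t reallocations = 0;   // reallocations, counted via capacity changes
  std::size_t final_capacity = 0;  // capacity once room has been made
};

// Returns true (and updates `seen`) if the capacity differs from `seen`.
static bool capacity_changed(const std::vector<char> &buf, std::size_t &seen) {
  if (buf.capacity() == seen)
    return false;
  seen = buf.capacity();
  return true;
}

// Two steps: grow for the pending bytes, then reserve scratch space on top.
GrowthStats grow_then_reserve(std::vector<char> &buf, std::size_t pending,
                              std::size_t scratch) {
  GrowthStats stats;
  std::size_t seen = buf.capacity();
  buf.reserve(buf.size() + pending);
  if (capacity_changed(buf, seen))
    ++stats.reallocations;
  buf.reserve(buf.size() + pending + scratch);
  if (capacity_changed(buf, seen))
    ++stats.reallocations;
  stats.final_capacity = buf.capacity();
  return stats;
}

// One step: include the scratch size in the initial reservation.
GrowthStats reserve_once(std::vector<char> &buf, std::size_t pending,
                         std::size_t scratch) {
  GrowthStats stats;
  std::size_t seen = buf.capacity();
  buf.reserve(buf.size() + pending + scratch);
  if (capacity_changed(buf, seen))
    ++stats.reallocations;
  stats.final_capacity = buf.capacity();
  return stats;
}
```

Starting from an empty buffer with 100 pending bytes and a 128-byte scratch area, reserve_once reallocates once, while grow_then_reserve typically reallocates twice on common standard library implementations. Whether that difference matters for a real workload is exactly the benchmarking question raised in the thread.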
