<div dir="ltr">In r247018 I get a ~8x speedup on a synthetic benchmark I tried. Can you validate this fixes the regression?</div><br><div class="gmail_quote"><div dir="ltr">On Sat, Sep 5, 2015 at 12:56 AM Hans Wennborg <<a href="mailto:hans@chromium.org">hans@chromium.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Aug 14, 2015 at 2:55 AM, Manuel Klimek via cfe-commits<br>

<<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>> wrote:<br>

> Author: klimek<br>

> Date: Fri Aug 14 04:55:36 2015<br>

> New Revision: 245036<br>

><br>

> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=245036&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=245036&view=rev</a><br>

> Log:<br>

> Add structured way to express command line options in the compilation database.<br>

><br>

> Currently, arguments are passed via the string attribute 'command',<br>

> assuming a shell-escaped / quoted command line to extract the original<br>

> arguments. This works well enough on Unix systems, but turns out to be<br>

> problematic for Windows tools to generate.<br>

><br>

> This CL adds a new attribute 'arguments', an array of strings, which<br>

> specifies the exact command line arguments. If 'arguments' is available<br>

> in the compilation database, it is preferred to 'command'.<br>

><br>

> Currently there is no plan to retire 'command': there are enough<br>

> different use cases where users want to create their own mechanism for<br>

> creating compilation databases, that it doesn't make sense to force them<br>

> all to implement shell command line parsing.<br>

><br>

> Patch by Daniel Dilts.<br>

<br>

This seems to have caused a bad performance regression for a tool we<br>

use in Chromium. On the file I tried, run-time went from 0.4 s to 3.0<br>

s (7.5x slow-down), and peak memory usage from 42 MB to 366 MB (8.7x).<br>

<br>

I suspect what's happened is that JSONCompilationDatabase::parse()<br>

became significantly slower because it's now calling<br>

unescapeCommandLine() and allocating a std::vector on each "command"<br>

field during parsing, whereas previously the code wouldn't do that<br>

until JSONCompilationDatabase::getCommands() was called on a specific<br>

file or set of files.<br>

<br>

One idea for fixing this would be to make the second part of<br>

CompileCommandRef be a yaml::Node pointer, and look at the node type:<br>

if it's a ScalarNode, it contains "command" and needs to be<br>

unescaped; if it's a SequenceNode, it's "arguments".<br>

</blockquote></div>
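<div dir="ltr">For illustration, here is a minimal standalone sketch of the lazy approach suggested in the quoted message: keep the raw field as parsed and only split it when a command line is actually requested. It does not use the LLVM YAML classes. <code>RawCommand</code>, <code>LazyCommand</code> and <code>splitCommandLine</code> are hypothetical names, a <code>std::variant</code> (C++17) stands in for the ScalarNode/SequenceNode distinction, and the splitter is a simplified stand-in for unescapeCommandLine(), not the real implementation.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for the parsed node: either a raw shell-escaped
// "command" string or an already split "arguments" list.
using RawCommand = std::variant<std::string, std::vector<std::string>>;

// Simplified splitter (assumption): handles spaces, double quotes and
// backslash escapes only. Not the real unescapeCommandLine().
std::vector<std::string> splitCommandLine(const std::string &Cmd) {
  std::vector<std::string> Args;
  std::string Cur;
  bool InQuotes = false, HaveArg = false;
  for (std::size_t I = 0; I < Cmd.size(); ++I) {
    char C = Cmd[I];
    if (C == '\\' && I + 1 < Cmd.size()) {
      Cur += Cmd[++I]; // take the escaped character literally
      HaveArg = true;
    } else if (C == '"') {
      InQuotes = !InQuotes;
      HaveArg = true;
    } else if (C == ' ' && !InQuotes) {
      if (HaveArg) {
        Args.push_back(Cur);
        Cur.clear();
        HaveArg = false;
      }
    } else {
      Cur += C;
      HaveArg = true;
    }
  }
  if (HaveArg)
    Args.push_back(Cur);
  return Args;
}

// Hypothetical entry type: parsing only stores the raw value.
struct LazyCommand {
  std::string Directory;
  RawCommand Raw;

  // The split (and the vector allocation for "command" entries) happens
  // here, on lookup, instead of once per entry while parsing.
  std::vector<std::string> commandLine() const {
    if (const auto *S = std::get_if<std::string>(&Raw))
      return splitCommandLine(*S);
    return std::get<std::vector<std::string>>(Raw);
  }
};
```

<div dir="ltr">With this shape, parsing a large database costs one stored string or list per entry, and the unescaping cost is paid only for the files a tool actually asks about.</div>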