<div dir="ltr">Adding Reid and Nico, who have been struggling with lexer / preprocessor compile times recently.<br><br><div class="gmail_quote"><div dir="ltr">On Mon, May 16, 2016 at 10:46 AM Андрей Серебро <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>So, I have implemented a prototype. For anyone interested, I
have attached the patch (I used clang release 3.8 from GitHub as the
starting point).<br>
</div>
<div> </div>
<div>The prototype takes about 40% less time to preprocess all Boost
headers than the original clang (780 seconds vs. 1330 seconds). On real-world
source code, the patched clang usually seems to be faster than the
original.</div>
<div> </div>
<div>The big problem is that it currently doesn't generate
proper source locations for expanded tokens. This doesn't affect the
final code, but of course it makes diagnostics harder when there are
errors. On the other hand, the situation is similar to debugging inline
functions: without a special flag, debugging is not
that easy. </div>
<div> </div>
<div>I have also measured timings for the patched clang against a clean clang
with the information about expansion locations removed (since
all of the gain might have come from switching off this
info). It turned out that the patched version is still 28% faster (780
seconds vs. 1100 seconds). </div>
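<div>As a quick sanity check of the percentages quoted so far, the sketch below recomputes the relative savings from the rounded timings in this message. The helper name <code>time_saved</code> is made up for illustration. With these rounded inputs the second comparison comes out nearer 29% than the quoted 28%, presumably because the real timings were not round numbers.</div>

```python
def time_saved(baseline_s, patched_s):
    """Fraction of the baseline time that the patched build saves."""
    return (baseline_s - patched_s) / baseline_s


# Rounded timings from this message, in seconds.
full = time_saved(1330, 780)      # stock clang vs. patched clang
no_locs = time_saved(1100, 780)   # stock clang without expansion locations vs. patched

print(f"vs. stock clang: {full:.1%} less time")
print(f"vs. stock clang without expansion locations: {no_locs:.1%} less time")
```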
<div> </div>
<div>I think it would probably be useful to have a flag in clang
that enables fast preprocessing, since the speedup can sometimes reach
4x! <br>
<br>
I'm pretty sure there are still some bugs I haven't noticed yet,
so any feedback is highly appreciated. <br>
</div>
<div> </div>
<div>25.03.2016, 12:05, "mats petersson"
<a href="mailto:mats@planetcatfish.com" target="_blank"><mats@planetcatfish.com></a>:</div></div><div bgcolor="#FFFFFF" text="#000000">
<blockquote type="cite">
<div>
<div>But even then, how much of the total time is expanding
macros, and how much is "reading and finding the actual files"
(and writing the output)?<br>
<br>
</div>
<div>I'm not saying this is not worth doing, I'm just trying to
keep someone from spending time on something that doesn't provide
a benefit. I speak from experience: I've "optimized" code and
then found that it didn't make any improvement at all, and I've
also done work which gave 3-30x speedups through some simple
steps... So: measure, make an improvement, measure again. <br>
<br>
</div>
<div>Or, use `perf` on some typical use case, and figure out
where the time goes... </div>
<div><br>
--</div>
Mats</div>
<div><br>
<div>On 25 March 2016 at 08:29, Yaron Keren <span><<a href="mailto:yaron.keren@gmail.com" target="_blank">yaron.keren@gmail.com</a>></span>
wrote:<br>
<blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div>I once measured the times for one of the Boost library
examples: preprocessing (-E) was about 20% of the total
compilation time. This is not typical in general, but it is
quite common with Boost libraries, as hundreds to thousands of files
may be included, with tons of macros and nested macros.</div>
<div>
<div>
<div> </div>
<div> </div>
<div><br>
<div>
<div><span>2016-03-25 1</span>:29 GMT+02:00 Андрей
Серебро <span><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>:</div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:#cccccc;border-left-style:solid;padding-left:1ex">
<div>Hi Mats,</div>
<div> </div>
<div>Thanks for the reply. Yep, you are right,
the time should be measured, and I guess the
typical workflow would be:</div>
<div>
<ul>
<li>implement a prototype</li>
<li>take a bunch of big real-world projects</li>
<li>compare the preprocessing time of the original
and the modified clang</li>
<li>conclude whether the idea is sane
or not</li>
</ul>
</div>
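<div>The comparison step in the list above could be automated roughly as follows. This is only a sketch: the function names <code>time_command</code> and <code>compare</code> are invented here, and the placeholder command stands in for real invocations, such as the original and the modified clang each run with -E over the same input file.</div>

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Return the median wall-clock seconds over `runs` executions of cmd."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard stdout so that writing the preprocessed text to a
        # terminal does not dominate the measurement.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(baseline_cmd, patched_cmd, runs=3):
    """Time both commands and return (baseline_seconds, patched_seconds)."""
    return time_command(baseline_cmd, runs), time_command(patched_cmd, runs)


# Placeholder so the sketch runs anywhere. In practice each command would be
# something like ["/path/to/clang", "-E", "big.cpp"] (hypothetical paths),
# once for the original build and once for the modified one.
noop = [sys.executable, "-c", "pass"]
result = compare(noop, noop, runs=2)
print("baseline: %.3fs, patched: %.3fs" % result)
```

<div>Taking the median of several runs is one simple way to dampen noise from disk caching; it is a choice made for this sketch, not something prescribed in the thread.</div>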
<div>As for usage: some IDEs would probably
benefit, since they need to re-lex the source
repeatedly to provide correct autocompletion.</div>
<div> </div>
<div>What I'm also curious about is whether somebody
has already worked on this or thought
about it.</div>
<div>If the idea has already been considered (which I
guess is quite possible), it would be interesting to know
whether somebody has already shown it to be useless. </div>
<div> </div>
<div>25.03.2016, 01:59, "mats petersson" <<a href="mailto:mats@planetcatfish.com" target="_blank">mats@planetcatfish.com</a>>:</div>
<div>
<div>
<blockquote type="cite">
<div>
<div>
<div>First, surely the right place for
this discussion is the cfe-dev
mailing list?<br>
<br>
</div>
Second, have you determined that this
is a noticeable amount of time when
compiling? I have no idea - in my
Pascal compiler, parsing the code is
~0.1%, codegen to IR ~1.9% and LLVM
98%. But I'm sure Clang is more
complex in many ways, so the
proportion is probably a bit different
- a measurement of the time spent
expanding macros would probably help
determine if it's worth doing or not.
<br>
<br>
--</div>
Mats</div>
<div><br>
<div>On 24 March 2016 at 22:17, Andy via
llvm-dev <span><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
wrote:<br>
<blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF">Hello, folks!<br>
<br>
Another guy and I are currently
trying to play with clang. The
proposal may seem stupid, and excuse
me if it has already been discussed;
we just want to try to implement
something useful that seems
to be missing for now.<br>
<br>
OK, the idea. It seems interesting
to try to make the lexer a little bit
more efficient at macro
expansion by applying partial
expansion of macros. The idea is
that some libraries have rather
deeply nested macro definitions,
and each time the lexer sees such a macro in
code, it re-expands the definition
fully. This seems to be overkill
sometimes, since macros are quite often
not redefined in code, so the
expansion can be reused. <br>
<br>
Of course, the typical nesting depth is
rather low, but BOOST_PP_REPEAT, for
example, can cause such
situations. <br>
<br>
So, the question is: what do you
think about the possible usefulness of
such research, and why?<br>
</div>
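<div>To make the reuse idea concrete, here is a toy model of it. It is a sketch under strong assumptions: it handles only object-like macros, the class <code>ToyPreprocessor</code> and its methods are invented for illustration, and nothing here mirrors how clang's preprocessor is actually structured.</div>

```python
class ToyPreprocessor:
    """Toy object-like macro expander with a cache; not clang's design."""

    def __init__(self):
        self.macros = {}         # macro name to list of replacement tokens
        self.cache = {}          # macro name to fully expanded token list
        self.uncached_steps = 0  # how many macro bodies were actually walked

    def define(self, name, body):
        self.macros[name] = body.split()
        # Any cached expansion might mention this macro, so the simplest
        # safe policy is to drop everything on (re)definition.
        self.cache.clear()

    def expand(self, name):
        # Only whole top-level expansions are reused: a nested result can
        # depend on which macros are currently being expanded.
        if name not in self.cache:
            self.cache[name] = self._expand(name, frozenset())
        return list(self.cache[name])

    def _expand(self, name, active):
        # Non-macros, and macros already under expansion, are left as-is
        # (C does not re-expand a macro inside its own expansion).
        if name not in self.macros or name in active:
            return [name]
        self.uncached_steps += 1
        out = []
        for tok in self.macros[name]:
            out.extend(self._expand(tok, active | {name}))
        return out


pp = ToyPreprocessor()
pp.define("A", "1 + 1")
pp.define("B", "A * A")
pp.define("C", "B - B")

first = pp.expand("C")            # walks C, B twice, A four times
steps_after_first = pp.uncached_steps
second = pp.expand("C")           # served from the cache, no new walking
steps_after_second = pp.uncached_steps

pp.define("A", "2")               # redefinition invalidates the cache
third = pp.expand("C")
print(" ".join(first), "|", " ".join(third))
```

<div>Clearing the whole cache on every definition is the crudest possible invalidation policy; a real implementation would presumably need something finer-grained, and would also have to deal with function-like macros and source locations.</div>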
<br>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></blockquote>
</div>
</div>
</blockquote>
<div> </div>
<div> </div>
</div>
</div>
<div>-- </div>
<div>Regards,<br>
Andrei Serebro</div>
<div>tel. <a href="tel:%2B79111758381" target="_blank">+79111758381</a></div>
<div> </div>
<br>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<div> </div>
<div> </div>
<div>-- </div>
<div>Regards,<br>
Andrei Serebro</div>
<div>tel. +79111758381</div>
<div> </div>
</div><div bgcolor="#FFFFFF" text="#000000"></div>
_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote></div></div>