<div dir="ltr">Adding Reid and Nico who have been struggling with lexer / preprocessor compile times recently.<br><br><div class="gmail_quote"><div dir="ltr">On Mon, May 16, 2016 at 10:46 AM Андрей Серебро <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000">

    <div>So, I have implemented a prototype. For anyone interested, I
      have attached the patch (I used clang release 38 from GitHub as the
      starting point).<br>

    </div>

    <div> </div>

    <div>The prototype takes about 40% less time than the original clang
      to preprocess all Boost headers (780 seconds vs 1330). On real-world
      source code the patched clang also usually seems to be faster than
      the original.</div>

    <div> </div>

    <div>The big problem is that it currently doesn't generate
      proper source locations for expanded tokens. This doesn't affect the
      final code, but of course it makes diagnostics harder to read when
      there are errors. On the other hand, the situation is similar to
      debugging inline functions: without a special flag, debugging is
      not that easy either. </div>

    <div> </div>

    <div>I have also compared the patched clang against a clean clang
      with the expansion location information removed (since all of the
      gain could have come from switching that information off). It
      turned out that the patched version is still 28% faster (780
      seconds vs 1100 seconds). </div>
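    <div>For reference, the percentages can be rechecked from the rounded
      timings quoted in this message. The snippet below is only a sketch
      of that arithmetic; the constant and function names are made up for
      illustration. With the rounded numbers it gives roughly 41% and 29%,
      which is in line with the figures above.</div>

```python
# Rounded wall-clock times in seconds, as quoted in this message.
ORIGINAL = 1330.0           # original clang
NO_EXPANSION_LOCS = 1100.0  # clean clang without expansion-location info
PATCHED = 780.0             # patched clang (prototype)


def percent_saved(baseline, candidate):
    """Share of the baseline time that the candidate saves, in percent."""
    return 100.0 * (baseline - candidate) / baseline


def speedup(baseline, candidate):
    """How many times faster the candidate is than the baseline."""
    return baseline / candidate


saved_vs_original = percent_saved(ORIGINAL, PATCHED)
saved_vs_no_locs = percent_saved(NO_EXPANSION_LOCS, PATCHED)

print(f"vs original clang:         {saved_vs_original:.1f}% less time, "
      f"{speedup(ORIGINAL, PATCHED):.2f}x")
print(f"vs clang without loc info: {saved_vs_no_locs:.1f}% less time, "
      f"{speedup(NO_EXPANSION_LOCS, PATCHED):.2f}x")
```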

    <div> </div>

    <div>I think it would be useful to have a flag in clang
      that enables this kind of fast preprocessing, since the speedup can
      sometimes reach 4x! <br>
      <br>
      I'm pretty sure there are still some bugs I haven't found yet,
      so any feedback is highly appreciated. <br>

    </div>

    <div> </div>

    <div>25.03.2016, 12:05, "mats petersson"

      <a href="mailto:mats@planetcatfish.com" target="_blank">&lt;mats@planetcatfish.com&gt;</a>:</div></div><div bgcolor="#FFFFFF" text="#000000">

    <blockquote type="cite">

      <div>

        <div>But even then, how much of the total time is expanding

          macros, and how much is "reading and finding the actual files"

          (and writing the output)?<br>

          <br>

        </div>

        <div>I'm not saying this is not worth doing, I'm just trying to
          avoid someone spending time on something that doesn't provide
          any benefit. I speak from experience: I've "optimized" code and
          then found that it didn't make any improvement at all, and I've
          also done work which gave 3-30x speedups through some simple
          steps... So: measure, make an improvement, measure again. <br>

          <br>

        </div>

        <div>Or, use `perf` on some typical use case, and figure out
          where the time goes... </div>

        <div><br>

          --</div>

        Mats</div>

      <div><br>

        <div>On 25 March 2016 at 08:29, Yaron Keren <span><<a href="mailto:yaron.keren@gmail.com" target="_blank">yaron.keren@gmail.com</a>></span>

          wrote:<br>

          <blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div>

              <div>I once measured the times for an example from one of the
                Boost libraries: preprocessing (-E) was about 20% of the
                total compilation time. This is not typical in general, but
                it is quite common with Boost libraries, since hundreds to
                thousands of files may be included, with tons of macros and
                nested macros.</div>

              <div>

                <div>

                  <div> </div>

                  <div> </div>

                  <div><br>

                    <div>

                      <div><span>2016-03-25 1</span>:29 GMT+02:00 Андрей

                        Серебро <span><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>:</div>

                      <blockquote style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:#cccccc;border-left-style:solid;padding-left:1ex">

                        <div>Hi Mats,</div>

                        <div> </div>

                        <div>Thanks for the reply. Yep, you are right,
                          the time should be measured, and I guess I can
                          imagine the typical workflow:</div>

                        <div>

                          <ul>

                            <li>implement a prototype</li>
                            <li>take a bunch of big real-world projects</li>
                            <li>compare the preprocessing time of the original
                              and the modified clang</li>
                            <li>conclude whether the idea is sane
                              or not</li>

                          </ul>

                        </div>
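                        <div>The comparison step could be scripted roughly as
                          follows. This is only a sketch: the helper names
                          and the binary paths are hypothetical, and real
                          projects would also need their include paths and
                          defines passed along. The only clang option used
                          is -E, which stops after preprocessing.</div>

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # The preprocessed text goes to stdout; we only care about the time.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def preprocess_cmd(clang_binary, source):
    """Command line that only preprocesses `source` (clang's -E option)."""
    return [clang_binary, "-E", source]


def compare(baseline_clang, patched_clang, sources, runs=3):
    """Return (baseline_total, patched_total) summed over all sources."""
    baseline = sum(time_command(preprocess_cmd(baseline_clang, s), runs)
                   for s in sources)
    patched = sum(time_command(preprocess_cmd(patched_clang, s), runs)
                  for s in sources)
    return baseline, patched


# Example call (both paths are placeholders, not real locations):
#   base, patched = compare("/path/to/clang", "/path/to/patched-clang",
#                           ["test.cpp"])
#   print(f"saved {100.0 * (base - patched) / base:.1f}%")
```

                        <div>Taking the best of several runs is a deliberate
                          choice here: it reduces noise from disk caches
                          and other processes, which matters when the
                          differences are a few percent.</div>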

                        <div>As for usage: some IDEs would probably benefit,
                          since they need to re-lex the source repeatedly
                          to provide correct autocompletion.</div>

                        <div> </div>

                        <div>What I'm also curious about is whether somebody
                          has already done something along these lines or
                          has thought about it.</div>
                        <div>If the idea has already been considered (which I
                          guess is quite possible), it would be interesting
                          to know: has somebody already shown it to be useless? </div>

                        <div> </div>

                        <div>25.03.2016, 01:59, "mats petersson" <<a href="mailto:mats@planetcatfish.com" target="_blank">mats@planetcatfish.com</a>>:</div>

                        <div>

                          <div>

                            <blockquote type="cite">

                              <div>

                                <div>

                                  <div>First, surely the right place for

                                    this discussion is the cfe-dev

                                    mailing list?<br>

                                    <br>

                                  </div>

                                  Second, have you determined that this

                                  is a noticeable amount of time when

                                  compiling? I have no idea - in my

                                  Pascal compiler, parsing the code is

                                  ~0.1%, codegen to IR ~1.9% and LLVM

                                  98%. But I'm sure Clang is more

                                  complex in many ways, so the

                                  proportion is probably a bit different

                                  - a measurement of the time spent

                                  expanding macros would probably help

                                  determine if it's worth doing or not.

                                  <br>

                                  <br>

                                  --</div>

                                Mats</div>

                              <div><br>

                                <div>On 24 March 2016 at 22:17, Andy via

                                  llvm-dev <span><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>

                                  wrote:<br>

                                  <blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc solid;padding-left:1ex">

                                    <div bgcolor="#FFFFFF">Hello, folks!<br>

                                      <br>

                                      One other guy and I are currently
                                      trying to play with clang. The
                                      proposal may seem stupid, and excuse
                                      me if it has already been discussed;
                                      we just want to try to implement
                                      something useful which seems to be
                                      absent for now.<br>

                                      <br>

                                      OK, the idea. It seems interesting
                                      to try to make the lexer a little bit
                                      more efficient at macro expansion
                                      by applying partial expansion of
                                      macros. The idea is that some
                                      libraries have rather deeply nested
                                      macro definitions, and each time the
                                      lexer sees such a macro in the code,
                                      it re-expands the definition fully.
                                      This often seems to be overkill,
                                      since macros are usually not
                                      redefined in the code, so the
                                      expansion could be reused. <br>

                                      <br>

                                      Of course, the typical nesting depth
                                      is rather low, but BOOST_PP_REPEAT,
                                      for example, can cause such
                                      situations. <br>
                                      <br>
                                      So, the question is: what do you
                                      think about the possible usefulness
                                      of such research, and why?<br>

                                    </div>
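                                    <div>To make the idea concrete, here is a toy
                                      sketch of reusing expansions. It is not
                                      how clang's lexer works and not a
                                      proposed implementation: it assumes
                                      object-like macros only (no arguments,
                                      no # or ##), tokens separated by
                                      whitespace, and no self-referential
                                      macros. The class and method names are
                                      invented for the example.</div>

```python
class MacroTable:
    """Toy table of object-like macros with cached full expansions."""

    def __init__(self):
        self._defs = {}      # macro name -> list of replacement tokens
        self._cache = {}     # macro name -> fully expanded tuple of tokens
        self.expansions = 0  # how many times a macro body was actually walked

    def define(self, name, body):
        self._defs[name] = body.split()
        # Any cached expansion may depend on `name`, so drop the whole cache.
        self._cache.clear()

    def expand(self, name, _stack=()):
        cached = self._cache.get(name)
        if cached is not None:
            return cached  # reuse: the nested definitions are not re-expanded
        if name in _stack:
            raise ValueError(f"self-referential macro: {name}")
        self.expansions += 1
        out = []
        for tok in self._defs[name]:
            if tok in self._defs:
                out.extend(self.expand(tok, _stack + (name,)))
            else:
                out.append(tok)
        result = tuple(out)
        self._cache[name] = result
        return result


table = MacroTable()
table.define("A", "1 + 1")
table.define("B", "A * A")
table.define("C", "B - B")
print(" ".join(table.expand("C")))
print("macro bodies walked:", table.expansions)
```

                                    <div>In this example only three macro bodies
                                      are walked for C, where expanding
                                      without the cache would walk seven (C,
                                      then B twice, then A four times). A
                                      redefinition simply clears the cache,
                                      which is the crude part a real
                                      implementation would have to do better.</div>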

                                    <br>

_______________________________________________<br>

                                    LLVM Developers mailing list<br>

                                    <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
                                    <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></blockquote>

                                </div>

                              </div>

                            </blockquote>

                            <div> </div>

                            <div> </div>

                          </div>

                        </div>

                        <div>-- </div>

                        <div>Regards,<br>

                          Andrei Serebro</div>

                        <div>tel. <a href="tel:%2B79111758381" target="_blank">+79111758381</a></div>

                        <div> </div>

                        <br>


                      </blockquote>

                    </div>

                  </div>

                </div>

              </div>

            </div>

          </blockquote>

        </div>

      </div>

    </blockquote>

    <div> </div>

    <div> </div>

    <div>-- </div>

    <div>Regards,<br>

      Andrei Serebro</div>

    <div>tel. +79111758381</div>

    <div> </div>

  </div><div bgcolor="#FFFFFF" text="#000000"></div>

_______________________________________________<br>

cfe-dev mailing list<br>

<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

</blockquote></div></div>