<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">This thread is deep enough and the

      start of it confrontational enough, that I doubt enough people are

      reading this deep.  Please rephrase this as a separate RFC to

      ensure visibility.  <br>

      <br>

      For the record, the overall direction you're sketching seems

      entirely reasonable to me.  <br>

      <br>

      Philip<br>

      <br>

      On 08/18/2015 10:31 PM, deadal nix via llvm-dev wrote:<br>

    </div>

    <blockquote

cite="mid:CANGV3T1DeTU0Fj=Zu27Npw2qCG9-LB=SmZKSstdjgFWUSmefFQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div>

          <div>

            <div>

              <div>

                <div>

                  <div>

                    <div>

                      <div>

                        <div>It is pretty clear people need this. Let's

                          get this moving.<br>

                          <br>

                        </div>

                        I'll try to sum up the points that have been made

                        and I'll try to address them carefully.<br>

                      </div>

                      <br>

                      1/ There is no good solution for large aggregates.<br>

                    </div>

                    That is true. However, I don't think this is a

                    reason to not address smaller aggregates, as they

                    appear to be needed. Realistically, the proportion

                    of aggregates that are very large is small, and

                    there is no expectation that such a thing would map

                    nicely to the hardware anyway (the hardware won't

                    have enough registers to load it all anyway). I do

                    think it is reasonable to expect sensible

                    handling of relatively small aggregates like fat

                    pointers while accepting that large ones will be

                    inefficient.<br>

                    <br>

                  </div>

                  <div>This limitation is not unique to the current

                    discussion, as SROA suffers from the same limitation.<br>

                  </div>

                  <div>It is possible to disable the transformation for

                    aggregates that are too large if this is too big of

                    a concern. It should maybe also be done for SROA.<br>

                  </div>

                  <div><br>

                  </div>

                  2/ Slicing the aggregate breaks the semantics of

                  atomic/volatile.<br>

                </div>

                That is true. It means slicing the aggregate should not

                be done for atomic/volatile. It doesn't mean this should

                not be done for regular ones as it is reasonable to

                handle atomic/volatile differently. After all, they have

                different semantics.<br>

                <br>

              </div>

              3/ Not slicing can create scalars that aren't supported by

              the target. This is undesirable.<br>

            </div>

            Indeed. But as always, the important question is compared to

            what ?<br>

            <br>

          </div>

          The hardware has no notion of aggregates, so an aggregate and a

          large scalar both end up requiring legalization. Doing the

          transformation is still beneficial :<br>

        </div>

         - Some aggregates will generate valid scalars. For such

        aggregates, this is a 100% win.<br>

        <div>

          <div>

            <div>

              <div>

                <div>

                  <div>

                    <div>

                      <div> - For aggregates that won't, the situation is

                        still better as various optimization passes will

                        be able to handle the load in a sensible manner.<br>

                      </div>

                      <div> - The transformation never makes the

                        situation worse than it is to begin with.<br>

                        <br>

                      </div>

                      <div>In a previous discussion, Hal Finkel seemed to

                        think that the scalar solution is preferable to

                        the slicing one.<br>

                        <br>

                      </div>

                      <div>Is that a fair assessment of the situation ?

                        Considering all of this, I think the right path

                        forward is :<br>

                      </div>

                      <div> - Go for the scalar solution in the general

                        case.<br>

                      </div>

                      <div> - If that is a problem, the slicing approach

                        can be used for non atomic/volatile.<br>

                      </div>

                      <div> - If necessary, disable the transformation

                        for very large aggregates (and consider doing so

                        for SROA as well).<br>

                        <br>

                      </div>

                      <div>Do we have a plan ?<br>

                      </div>
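                      <div>To make the plan above concrete, here is a minimal Python sketch of the decision it describes. This is only an illustration: the function name, the flag names, and the 64-byte cutoff are assumptions made up for this example, not LLVM API and not a threshold anyone in the thread proposed.<br>
                      </div>
<pre>
```python
# Hypothetical labels for the three outcomes discussed in the thread.
SCALAR = "single wide scalar load/store"
SLICE = "slice into per-piece loads/stores"
KEEP = "leave the aggregate access untouched"


def choose_lowering(size_bytes, atomic=False, volatile=False,
                    scalar_is_a_problem=False, max_bytes=64):
    """Pick a strategy for one aggregate load or store.

    max_bytes is an assumed cutoff for "very large" aggregates.
    """
    # Very large aggregates: disable the transformation entirely.
    if size_bytes > max_bytes:
        return KEEP
    # Slicing is only allowed when the access is neither atomic nor
    # volatile, because slicing would break their semantics.
    if scalar_is_a_problem and not (atomic or volatile):
        return SLICE
    # General case: one wide scalar, legalized later if needed.
    return SCALAR


if __name__ == "__main__":
    print(choose_lowering(16))                            # fat pointer
    print(choose_lowering(16, scalar_is_a_problem=True))  # sliced
    print(choose_lowering(1024 * 1024))                   # left alone
```
</pre>
                      <div>Note that an atomic or volatile access never takes the slicing branch in this sketch, even when the wide scalar is awkward for the target.<br>
                      </div>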

                      <div><br>

                      </div>

                    </div>

                  </div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

      <div class="gmail_extra"><br>

        <div class="gmail_quote">2015-08-18 18:36 GMT-07:00 Nicholas

          Chapman via llvm-dev <span dir="ltr">&lt;<a

              moz-do-not-send="true"

              href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span>:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div bgcolor="#FFFFFF" text="#000000"> Oh,<br>

              and another potential reason for handling aggregate loads

              and stores directly is that it expresses the semantics of

              the program more clearly, which I think should allow LLVM

              to optimise more aggressively.<br>

              Here's a bug report showing a missed optimisation, which I

              think is due to the use of memcpy, which in turn is

              required to work around slow structure loads and stores:<br>

              <a moz-do-not-send="true"

                href="https://llvm.org/bugs/show_bug.cgi?id=23226"

                target="_blank">https://llvm.org/bugs/show_bug.cgi?id=23226</a><br>

              <br>

              Cheers,<br>

                Nick<span class=""><br>

                <div>On 17/08/2015 22:02, mats petersson via llvm-dev

                  wrote:<br>

                </div>

              </span>

              <div>

                <div class="h5">

                  <blockquote type="cite">

                    <div dir="ltr">

                      <div>

                        <div>

                          <div>

                            <div>

                              <div>

                                <div>I've definitely "run into this

                                  problem", and I would very much love

                                  to remove my kludges [that are

                                  incomplete, because I keep finding

                                  places where I need to modify the

                                  code-gen to "fix" the same problem -

                                  this is probably par for the course

                                  from a complete amateur compiler

                                  writer and someone that has only spent

                                  the last 14 months working (as a

                                  hobby) with LLVM]. <br>

                                  <br>

                                </div>

                                So whilst I can't contribute much on the

                                "what is the right solution" and "how do

                                we solve this", I would very much like

                                to see something that allows the user of

                                LLVM to use load/store without things

                                like "is my thing that I'm storing big,

                                if so don't generate a load, use a

                                memcpy instead". Not only does this make

                                the usage of LLVM harder, it also causes

                                slow compilation [perhaps this is a

                                separate problem, but I have a simple

                                program that copies a large struct a few

                                times, and if I turn off my "use memcpy

                                for large things", the compile time gets

                                quite a lot longer - approx 1000x, and

                                48 seconds is a long time to compile 37

                                lines of relatively straightforward

                                code - even the Pascal compiler on

                                PDP-11/70 that I used at my school in

                                the 1980s was capable of doing more than 1

                                line per second, and it didn't run

                                anywhere near 2.5GHz and had 20-30 users

                                anytime I could use it...]<br>

                                <br>

                                ../lacsap -no-memcpy -tt longcompile.pas

                                <br>

                                Time for Parse 0.657 ms<br>

                                Time for Analyse 0.018 ms<br>

                                Time for Compile 1.248 ms<br>

                                Time for CreateObject 48803.263 ms<br>

                                Time for CreateBinary 48847.631 ms<br>

                                Time for Compile 48854.064 ms<br>

                                <br>

                              </div>

                              compared with:<br>

                              ../lacsap -tt longcompile.pas <br>

                              Time for Parse 0.455 ms<br>

                              Time for Analyse 0.013 ms<br>

                              Time for Compile 1.138 ms<br>

                              Time for CreateObject 44.627 ms<br>

                              Time for CreateBinary 82.758 ms<br>

                              Time for Compile 95.797 ms<br>

                              <br>

                            </div>

                            wc longcompile.pas <br>

                             37  84 410 longcompile.pas<br>

                            <br>

                          </div>

                          Source here:<br>

                          <a moz-do-not-send="true"

href="https://github.com/Leporacanthicus/lacsap/blob/master/test/longcompile.pas"

                            target="_blank">https://github.com/Leporacanthicus/lacsap/blob/master/test/longcompile.pas</a><br>

                          <br>

                        </div>

                        <br>

                        --<br>

                      </div>

                      Mats<br>

                    </div>

                    <div class="gmail_extra"><br>

                      <div class="gmail_quote">On 17 August 2015 at

                        21:18, deadal nix via llvm-dev <span dir="ltr">&lt;<a

                            moz-do-not-send="true"

                            href="mailto:llvm-dev@lists.llvm.org"

                            target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span>

                        wrote:<br>

                        <blockquote class="gmail_quote" style="margin:0

                          0 0 .8ex;border-left:1px #ccc

                          solid;padding-left:1ex">

                          <div dir="ltr">

                            <div>

                              <div>

                                <div>OK, what about that plan :<br>

                                  <br>

                                </div>

                                Slice the aggregate into a series of

                                valid loads/stores for non-atomic ones.<br>

                              </div>

                              Use a big scalar for atomic/volatile ones.<br>

                            </div>

                            Try to generate memcpy or memmove when

                            possible ?<br>

                            <div><br>

                            </div>

                          </div>

                          <div>

                            <div>

                              <div class="gmail_extra"><br>

                                <div class="gmail_quote">2015-08-17

                                  12:16 GMT-07:00 deadal nix <span

                                    dir="ltr">&lt;<a

                                      moz-do-not-send="true"

                                      href="mailto:deadalnix@gmail.com"

                                      target="_blank">deadalnix@gmail.com</a>&gt;</span>:<br>

                                  <blockquote class="gmail_quote"

                                    style="margin:0 0 0

                                    .8ex;border-left:1px #ccc

                                    solid;padding-left:1ex">

                                    <div dir="ltr"><br>

                                      <div class="gmail_extra"><br>

                                        <div class="gmail_quote"><span>2015-08-17

                                            11:26 GMT-07:00 Mehdi Amini

                                            <span dir="ltr">&lt;<a

                                                moz-do-not-send="true"

                                                href="mailto:mehdi.amini@apple.com"

                                                target="_blank">mehdi.amini@apple.com</a>&gt;</span>:<br>

                                            <blockquote

                                              class="gmail_quote"

                                              style="margin:0 0 0

                                              .8ex;border-left:1px #ccc

                                              solid;padding-left:1ex">

                                              <div

                                                style="word-wrap:break-word">Hi,

                                                <div><br>

                                                  <div><span>

                                                      <blockquote

                                                        type="cite">

                                                        <div>On Aug 17,

                                                          2015, at 12:13

                                                          AM, deadal nix

                                                          via llvm-dev

                                                          &lt;<a

                                                          moz-do-not-send="true"

href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;

                                                          wrote:</div>

                                                        <br>

                                                        <div>

                                                          <div dir="ltr"><br>

                                                          <div

                                                          class="gmail_extra"><br>

                                                          <div

                                                          class="gmail_quote">2015-08-16

                                                          23:21

                                                          GMT-07:00

                                                          David Majnemer

                                                          <span

                                                          dir="ltr">&lt;<a

moz-do-not-send="true" href="mailto:david.majnemer@gmail.com"

                                                          target="_blank">david.majnemer@gmail.com</a>&gt;</span>:<br>

                                                          <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0

                                                          0 0

                                                          .8ex;border-left:1px

                                                          #ccc

                                                          solid;padding-left:1ex">

                                                          <div dir="ltr"><br>

                                                          <div

                                                          class="gmail_extra"><br>

                                                          <div

                                                          class="gmail_quote"><span></span>

                                                          <div>Because a

                                                          solution which

                                                          doesn't

                                                          generalize is

                                                          not a very

                                                          powerful

                                                          solution. 

                                                          What happens

                                                          when somebody

                                                          says that they

                                                          want to use

                                                          atomics +

                                                          large

                                                          aggregate

                                                          loads and

                                                          stores? Give

                                                          them yet

                                                          another,

                                                          different

                                                          answer? That

                                                          would mean our

                                                          earlier, less

                                                          general

                                                          answer,

                                                          approach was

                                                          either a

                                                          bandaid (bad)

                                                          or the new

                                                          answer

                                                          requires a

                                                          parallel code

                                                          path in their

                                                          frontend

                                                          (worse).</div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </blockquote>

                                                          </div>

                                                          </div>

                                                          </div>

                                                        </div>

                                                      </blockquote>

                                                      <div><br>

                                                      </div>

                                                      <div><br>

                                                      </div>

                                                    </span>

                                                    <div>+1 with David’s

                                                      approach: making

                                                      things

                                                      incrementally

                                                      better is fine *as

                                                      long as* the long

                                                      term direction is

                                                      identified. Small

                                                      incremental

                                                      changes that make

                                                      things slightly

                                                      better in the

                                                      short term but

                                                      drive us away from

                                                      the long term

                                                      direction are not

                                                      good.</div>

                                                    <div><br>

                                                    </div>

                                                    <div>Don’t get me

                                                      wrong, I’m not

                                                      saying that the

                                                      current patch is

                                                      not good, just

                                                      that it does not

                                                      seem clear to me

                                                      that the long term

                                                      direction has been

                                                      identified, which

                                                      explains why some

                                                      can be nervous

                                                      about adding stuff

                                                      prematurely. </div>

                                                    <div>And I’m not for

                                                      the status quo,

                                                      while I can’t

                                                      judge it

                                                      definitively

                                                      myself, I even

                                                      bugged David last

                                                      month to look at

                                                      this revision and

                                                      try to identify

                                                      what is really the

                                                      long term

                                                      direction and how

                                                      to make your (and

                                                      other) frontends’

                                                      life easier. </div>

                                                    <span>

                                                      <div><br>

                                                      </div>

                                                      <div><br>

                                                      </div>

                                                    </span></div>

                                                </div>

                                              </div>

                                            </blockquote>

                                            <div><br>

                                            </div>

                                          </span>

                                          <div>As long as there is

                                            something to be done.

                                            Concern has been raised for

                                            very large aggregates (64K,

                                            1Mb) but there is no way a

                                            good codegen can come out of

                                            these anyway. I don't know

                                            of any machine that has 1Mb

                                            of registers available to

                                            tank the load. Even if we had

                                            a good way to handle it in

                                            InstCombine, the backend

                                            would have no capability to

                                            generate something nice for

                                            it anyway. Most aggregates

                                            are small and there is no

                                            good excuse to not do

                                            anything to handle them

                                            because someone could

                                            generate gigantic ones that

                                            won't map nicely to the

                                            hardware anyway.<br>

                                            <br>

                                          </div>

                                          <div>By that logic, SROA

                                            should not exist as one

                                            could generate gigantic

                                            aggregates as well (in fact,

                                            SROA fails pretty badly on

                                            large aggregates).<br>

                                            <br>

                                          </div>

                                          <div>The second concern raised

                                            is for atomic/volatile,

                                            which needs to be handled by

                                            the optimizer differently

                                            anyway, so is mostly

                                            irrelevant here.<br>

                                          </div>

                                        </div>

                                        <span>

                                          <div class="gmail_quote">

                                            <div> </div>

                                            <blockquote

                                              class="gmail_quote"

                                              style="margin:0 0 0

                                              .8ex;border-left:1px #ccc

                                              solid;padding-left:1ex">

                                              <div

                                                style="word-wrap:break-word">

                                                <div>

                                                  <div><span>

                                                      <blockquote

                                                        type="cite">

                                                        <div>

                                                          <div dir="ltr">

                                                          <div

                                                          class="gmail_extra">

                                                          <div

                                                          class="gmail_quote">

                                                          <blockquote

                                                          class="gmail_quote"

                                                          style="margin:0

                                                          0 0

                                                          .8ex;border-left:1px

                                                          #ccc

                                                          solid;padding-left:1ex">

                                                          <div dir="ltr">

                                                          <div

                                                          class="gmail_extra">

                                                          <div

                                                          class="gmail_quote"><span>

                                                          <div> </div>

                                                          </span></div>

                                                          </div>

                                                          </div>

                                                          </blockquote>

                                                          <br>

                                                          </div>

                                                          <br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra">clang

                                                          has many

                                                          developers

                                                          behind it,

                                                          some of them

                                                          paid to work

                                                          on it. That's

                                                          simply not the

                                                          case for many

                                                          others.<br>

                                                          <br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra">But

                                                          to answer your

                                                          questions:<br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra"> -

                                                          Per-field load/store
                                                          generates more loads/stores
                                                          than necessary in many cases.
                                                          These can't be aggregated
                                                          back because of padding.<br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra"> -

                                                          memcpy only works memory to
                                                          memory. It is certainly
                                                          usable in some cases, but it
                                                          does not cover all uses.<br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra"><br>

                                                          </div>

                                                          <div

                                                          class="gmail_extra">I'm

                                                          willing to do

                                                          the memcpy

                                                          optimization

                                                          in InstCombine

                                                          (in fact, if things did
                                                          not degenerate into so much
                                                          bikeshedding, that would
                                                          already be done).<br>

                                                          </div>

                                                          </div>

                                                        </div>

                                                      </blockquote>

                                                      <div><br>

                                                      </div>

                                                    </span></div>

                                                </div>

                                                <div>Calling “bikeshedding” what
                                                  other devs think is what keeps
                                                  the quality of the project high
                                                  is unlikely to help your patch
                                                  go through; it’s probably quite
                                                  the opposite, actually.</div>

                                                <div><br>

                                                </div>

                                                <div><br>

                                                </div>

                                              </div>

                                            </blockquote>

                                            <br>

                                          </div>

                                        </span>I understand the desire

                                        to keep quality high. That is not
                                        where the problem is. The problem
                                        lies in comparing actual proposals
                                        against hypothetical perfect ones
                                        that do not exist.<br>

                                      </div>

                                      <div class="gmail_extra"><br>

                                      </div>

                                    </div>

                                  </blockquote>

                                </div>

                                <br>

                              </div>

                            </div>

                          </div>

                          <br>

_______________________________________________<br>

                          LLVM Developers mailing list<br>

                          <a moz-do-not-send="true"

                            href="mailto:llvm-dev@lists.llvm.org"

                            target="_blank">llvm-dev@lists.llvm.org</a> 

                                 <a moz-do-not-send="true"

                            href="http://llvm.cs.uiuc.edu"

                            rel="noreferrer" target="_blank">http://llvm.cs.uiuc.edu</a><br>

                          <a moz-do-not-send="true"

                            href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"

                            rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

                          <br>

                        </blockquote>

                      </div>

                      <br>

                    </div>

                    <br>

                    <fieldset></fieldset>

                    <br>


                  </blockquote>

                  <br>

                </div>

              </div>

            </div>

            <br>


            <br>

          </blockquote>

        </div>

        <br>

      </div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>


    </blockquote>

    <br>

  </body>

</html>