<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    I'm generally in support of (1), but you should definitely get
    active buy-in from Chandler before moving forward.  He's the most
    knowledgeable on the tradeoffs.<br>
    <br>
    Also, are you willing to commit to the fairly large amount of work
    (1) implies?  I strongly suspect that doing (1) right will be *far*
    more work in the short term than (2).  It may still be the right
    long term answer, but are you ready to see the entire thing
    through?  Getting half way through and stopping would be much worse
    than (2).  <br>
    <br>
    Philip<br>
    <br>
    <div class="moz-cite-prefix">On 03/22/2016 11:32 AM, Ehsan Amiri via
      llvm-dev wrote:<br>
    </div>
    <blockquote
cite="mid:CAJ6ibVc9izT2U1iUe9PNrBDWBmDHhj8YT6_iaOQbC_ZOKAG4Cg@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>
            <div>Back to the discussion on the RFC, I still see some
              advantage in following the proposed solution. I see two
              paths forward:<br>
              <br>
            </div>
            1- Change canonical form, possibly lower memcpy to
            non-integer load and store in InstCombine. Then teach the
            backends to convert that to integer load and store if that
            is more profitable. Notice that we are talking about loads
            that have no use other than store. So it is a fairly simple
            change in the backends.<br>
            <br>
          </div>
          2- Do not change the canonical form. Then for this bug, we
          need to teach select speculation to see through bitcasts. We
          will probably need to teach other optimizations to see though
          bitcasts in the future as problems are uncovered. That is
          until typeless pointer work is complete. Once the typeless
          pointer work is complete, we have some extra code in each
          optimization for seeing through bitcasts which is possibly no
          longer needed.<br>
          <br>
        </div>
        <div>Based on this I think (1) is the right thing to do. But
          there might be other reasons for the current canonical form
          that I am not aware of. Please let me know what you think.<br>
          <br>
        </div>
        <div>Thanks<br>
        </div>
        Ehsan<br>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Wed, Mar 16, 2016 at 2:13 PM, David
          Blaikie <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr"><br>
              <div class="gmail_extra"><br>
                <div class="gmail_quote"><span class="">On Wed, Mar 16,
                    2016 at 11:00 AM, Ehsan Amiri <span dir="ltr"><<a
                        moz-do-not-send="true"
                        href="mailto:ehsanamiri@gmail.com"
                        target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:ehsanamiri@gmail.com">ehsanamiri@gmail.com</a></a>></span>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div dir="ltr">
                        <div>
                          <div>
                            <div>David,<br>
                              <br>
                            </div>
                            Could you give us an update on the status of
                            typeless pointer work? How much work is left
                            and when you think it might be ready?<br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                    <div><br>
                    </div>
                  </span>
                  <div>It's a bit of an onion peel, really - since it
                    will eventually involve generalizing/fixing every
                    optimization that's currently leaning on typed
                    pointers to keep the performance while removing the
                    crutch they're currently leaning on. (in cases where
                    bitcasts are literally just getting in the way,
                    those won't require cleaning up & should just
                    become "free performance wins" once we remove them,
                    though)<br>
                    <br>
                    At the moment we can roundtrip every LLVM IR test
                    case through bitcode and textual IR (reading/writing
                    both formats) while using only a narrow whitelist of
                    places that request the type of a pointer (things
                    like the verifier, the parser/printer where it
                    actually needs the typed pointer to verify it
                    matches the explicit type, etc).<br>
                    <br>
                    The next thing on the list is probably figuring out
                    the byval/inalloca representation (add an explicit
                    pointee type? just make the number of bytes explicit
                    with no type information?).<br>
                    <br>
                    Then start migrating optimizations over - doing the
                    same sort of testing I did for the IR/bitcode
                    roundtripping - assert that the pointee type is not
                    accessed, whitelist places that need it until the
                    bitcasts go away, fix anything else... it'll still
                    be a fair bit of work & I don't really know how
                    much. It should parallelize pretty well (doing any
                    of this work is really helpful, each optimization is
                    indepednent, etc) if anyone wants to/is able to
                    help.<br>
                    <br>
                    - Dave</div>
                  <div>
                    <div class="h5">
                      <div> </div>
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">
                        <div dir="ltr">
                          <div>
                            <div><br>
                            </div>
                            Thanks<span><font color="#888888"><br>
                              </font></span></div>
                          <span><font color="#888888">Ehsan<br>
                              <br>
                            </font></span></div>
                        <div>
                          <div>
                            <div class="gmail_extra"><br>
                              <div class="gmail_quote">On Wed, Mar 16,
                                2016 at 1:12 PM, David Blaikie <span
                                  dir="ltr"><<a
                                    moz-do-not-send="true"
                                    href="mailto:dblaikie@gmail.com"
                                    target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a></a>></span>
                                wrote:<br>
                                <blockquote class="gmail_quote"
                                  style="margin:0 0 0
                                  .8ex;border-left:1px #ccc
                                  solid;padding-left:1ex">
                                  <div dir="ltr"><br>
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote"><span>On
                                          Wed, Mar 16, 2016 at 8:34 AM,
                                          Mehdi Amini via llvm-dev <span
                                            dir="ltr"><<a
                                              moz-do-not-send="true"
                                              href="mailto:llvm-dev@lists.llvm.org"
                                              target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a></a>></span>
                                          wrote:<br>
                                          <blockquote
                                            class="gmail_quote"
                                            style="margin:0 0 0
                                            .8ex;border-left:1px #ccc
                                            solid;padding-left:1ex">
                                            <div
                                              style="word-wrap:break-word">Hi,
                                              <div><br>
                                              </div>
                                              <div>How do it interact
                                                with the "typeless
                                                pointers" work?</div>
                                            </div>
                                          </blockquote>
                                          <div><br>
                                          </div>
                                        </span>
                                        <div>Right - the goal of the
                                          typeless pointer work is to
                                          fix all these bugs related to
                                          "didn't look through bitcasts"
                                          in optimizations. Sometimes
                                          that's going to mean more work
                                          (because the code is leaning
                                          on the absence of bitcasts
                                          & the presence of
                                          convenient (but not
                                          guaranteed) type information
                                          to inform optimization
                                          decisions) but if we remove
                                          typed pointers while keeping
                                          optimization quality in the
                                          cases we have today, then we
                                          should've also fixed the cases
                                          that were broken because the
                                          type information didn't end up
                                          aligning to produce the
                                          optimal output.<br>
                                          <br>
                                          & I know I've been off the
                                          typeless pointer stuff for a
                                          few months working on llvm-dwp
                                          - but happy for any help (the
                                          next immediate piece is
                                          probably figuring out teh
                                          right representation for byval
                                          and inalloca - there were some
                                          code reviews sent out for that
                                          that I'll need to come back
                                          around to - but also any
                                          optimizations people want to
                                          help rework/improve would be
                                          great too & I can provide
                                          some techniques/tools to help
                                          people approach those)<br>
                                          <br>
                                          - Dave</div>
                                        <div>
                                          <div>
                                            <div> </div>
                                            <blockquote
                                              class="gmail_quote"
                                              style="margin:0 0 0
                                              .8ex;border-left:1px #ccc
                                              solid;padding-left:1ex">
                                              <div
                                                style="word-wrap:break-word">
                                                <div><br>
                                                </div>
                                                <div>Thanks,</div>
                                                <div><br>
                                                </div>
                                                <div>-- </div>
                                                <div>Mehdi</div>
                                                <div><br>
                                                  <div>
                                                    <blockquote
                                                      type="cite">
                                                      <div>
                                                        <div>
                                                          <div>On Mar
                                                          16, 2016, at
                                                          6:41 AM, Ehsan
                                                          Amiri via
                                                          llvm-dev <<a
moz-do-not-send="true" href="mailto:llvm-dev@lists.llvm.org"
                                                          target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a></a>>
                                                          wrote:</div>
                                                          <br>
                                                        </div>
                                                      </div>
                                                      <div>
                                                        <div>
                                                          <div>
                                                          <div dir="ltr">===
                                                          PROBLEM ===
                                                          (See this bug
                                                          <a
                                                          moz-do-not-send="true"
href="https://llvm.org/bugs/show_bug.cgi?id=26445" target="_blank"><a class="moz-txt-link-freetext" href="https://llvm.org/bugs/show_bug.cgi?id=26445">https://llvm.org/bugs/show_bug.cgi?id=26445</a></a>)<br>
                                                          <br>
                                                          IR contains
                                                          code for
                                                          loading a
                                                          float from
                                                          float * and
                                                          storing it to
                                                          a float *
                                                          address. After
                                                          canonicalization
                                                          of load in
                                                          InstCombine
                                                          [1], new
                                                          bitcasts are
                                                          added to the
                                                          IR (see bottom
                                                          of the email
                                                          for code
                                                          samples). This
                                                          prevents
                                                          select
                                                          speculation in
                                                          SROA to work.
                                                          Also after
                                                          SROA we have
                                                          bitcasts from
                                                          int32 to
                                                          float.
                                                          (Whereas
                                                          originally
                                                          after
                                                          instCombine,
                                                          bitcasts are
                                                          only done on
                                                          pointer
                                                          types).<br>
                                                          <br>
                                                          === PROPOSED
                                                          SOLUTION===<br>
                                                          <br>
                                                          [1] implies
                                                          that we need
                                                          load
                                                          canonicalization
                                                          when we load a
                                                          value only to
                                                          store it
                                                          again. The
                                                          reason is to
                                                          avoid
                                                          generating
                                                          slightly
                                                          different code
                                                          (due to
                                                          different ways
                                                          of adding
                                                          bitcasts), in
                                                          different
                                                          situations. In
                                                          all examples
                                                          presented in
                                                          [1] there is a
                                                          non-zero
                                                          number of
                                                          bitcasts. I
                                                          think when we
                                                          load a value
                                                          of type T from
                                                          a T* address
                                                          and store it
                                                          as a type T
                                                          value to one
                                                          or more T*
                                                          address (and
                                                          there is no
                                                          other use or
                                                          store), we can
                                                          redefine
                                                          canonical form
                                                          to mean there
                                                          should not be
                                                          any bitcasts.
                                                          So we still
                                                          have a
                                                          canonical
                                                          form, but its
                                                          definition is
                                                          slightly
                                                          different.<br>
                                                          <br>
                                                          === REASONS
                                                          FOR /
                                                          AGAINST===<br>
                                                          <br>
                                                          Hal Finkel
                                                          warns that
                                                          while this may
                                                          be useful for
                                                          power pc, this
                                                          may hurt more
                                                          than one other
                                                          platform and
                                                          become a very
                                                          large project.
                                                          Despite this
                                                          he is fine
                                                          with bringing
                                                          up the issue
                                                          to the mailing
                                                          list to get
                                                          feedback,
                                                          mostly because
                                                          this seems
                                                          inline with
                                                          our future
                                                          direction of
                                                          having a
                                                          unique type
                                                          for all
                                                          pointers. 
                                                          (Hal please
                                                          correct me if
                                                          I
                                                          misunderstood
                                                          your comment)<br>
                                                          <br>
                                                          This is a much
                                                          simpler fix
                                                          compared to
                                                          alternatives.
                                                          (ignoring
                                                          potential
                                                          regressions)<br>
                                                          <br>
                                                          ===
                                                          ALTERNATIVE
                                                          SOLUTION ===<br>
                                                          <br>
                                                          Fix select
                                                          speculation in
                                                          SROA to see
                                                          through
                                                          bitcasts.
                                                          Handle
                                                          remaining
                                                          bitcasts
                                                          during code
                                                          gen. Other
                                                          alternative
                                                          solutions are
                                                          welcome.<br>
                                                          <br>
                                                          Should I
                                                          implement the
                                                          proposed
                                                          solution or is
                                                          it too risky?
                                                          I understand
                                                          that we may
                                                          need to undo
                                                          it if it
                                                          breaks too
                                                          many things.
                                                          Comments are
                                                          welcome.<br>
                                                          <br>
                                                          <br>
                                                          [1] <a
                                                          moz-do-not-send="true"
href="http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html"
target="_blank"><a class="moz-txt-link-freetext" href="http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html">http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html</a></a> 
                                                          r226781  git
                                                          commit id:
                                                          b778cbc0c8<br>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          Code Samples
                                                          (only relevant
                                                          part is
                                                          copied):<br>
                                                          <br>
                                                          -------------------- 
                                                          Before
                                                          Canonicalization
                                                          (contains call
                                                          to std::max):
                                                          -------------------- 
                                                          <br>
                                                          entry:<br>
                                                            %max_value =
                                                          alloca float,
                                                          align 4<br>
                                                            %1 = load
                                                          float, float*
                                                          %input, align
                                                          4, !tbaa !1<br>
                                                            store float
                                                          %1, float*
                                                          %max_value,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          for.body:<br>
                                                            %call = call
                                                          dereferenceable(4)
                                                          float*
                                                          @_ZSt3maxIfERKT_S2_S2_(float*
                                                          dereferenceable(4)
                                                          %max_value,
                                                          float*
                                                          dereferenceable(4)
                                                          %arrayidx1)<br>
                                                            %3 = load
                                                          float, float*
                                                          %call, align
                                                          4, !tbaa !1<br>
                                                            store float
                                                          %3, float*
                                                          %max_value,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          -------------------- 
                                                          After
                                                          Canonicalization
                                                          (contains call
                                                          to
                                                          std::max):-------------------- 
                                                          <br>
                                                          <br>
                                                          entry:<br>
                                                            %max_value =
                                                          alloca float,
                                                          align 4<br>
                                                            %1 = bitcast
                                                          float* %input
                                                          to i32*<br>
                                                            %2 = load
                                                          i32, i32* %1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %3 = bitcast
                                                          float*
                                                          %max_value to
                                                          i32*<br>
                                                            store i32
                                                          %2, i32* %3,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          for.body:<br>
                                                            %call = call
                                                          dereferenceable(4)
                                                          float*
                                                          @_ZSt3maxIfERKT_S2_S2_(float*
                                                          nonnull
                                                          dereferenceable(4)
                                                          %max_value,
                                                          float*
                                                          dereferenceable(4)
                                                          %arrayidx1)<br>
                                                            %5 = bitcast
                                                          float* %call
                                                          to i32*<br>
                                                            %6 = load
                                                          i32, i32* %5,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %7 = bitcast
                                                          float*
                                                          %max_value to
                                                          i32*<br>
                                                            store i32
                                                          %6, i32* %7,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          --------------------
                                                          After SROA
                                                          (the call to
                                                          std::max is
                                                          inlined
                                                          now):-------------------- 
                                                          <br>
                                                          entry:<br>
                                                           
                                                          %max_value.sroa.0
                                                          = alloca i32<br>
                                                            %0 = bitcast
                                                          float* %input
                                                          to i32*<br>
                                                            %1 = load
                                                          i32, i32* %0,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            store i32
                                                          %1, i32*
                                                          %max_value.sroa.0<br>
                                                          <br>
                                                          for.body: <br>
                                                           
                                                          %max_value.sroa.0.0.max_value.sroa.0.0.6
                                                          = load i32,
                                                          i32*
                                                          %max_value.sroa.0<br>
                                                            %3 = bitcast
                                                          i32
                                                          %max_value.sroa.0.0.max_value.sroa.0.0.6
                                                          to float<br>
                                                           
                                                          %max_value.sroa.0.0.max_value.sroa_cast8
                                                          = bitcast i32*
                                                          %max_value.sroa.0
                                                          to float*<br>
                                                            %__b.__a.i =
                                                          select i1
                                                          %cmp.i, float*
                                                          %arrayidx1,
                                                          float*
                                                          %max_value.sroa.0.0.max_value.sroa_cast8<br>
                                                            %5 = bitcast
                                                          float*
                                                          %__b.__a.i to
                                                          i32*<br>
                                                            %6 = load
                                                          i32, i32* %5,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            store i32
                                                          %6, i32*
                                                          %max_value.sroa.0<br>
                                                          <br>
                                                          --------------------
                                                          After SROA
                                                          when
                                                          Canonicalization
                                                          is turned
                                                          off-------------------- 
                                                          <br>
                                                          entry:<br>
                                                            %0 = load
                                                          float, float*
                                                          %input, align
                                                          4, !tbaa !1<br>
                                                          <br>
                                                          for.cond:                                        
                                                          ; preds =
                                                          %for.body,
                                                          %entry<br>
                                                            %max_value.0
                                                          = phi float [
                                                          %0, %entry ],
                                                          [
                                                          %.sroa.speculated,
                                                          %for.body ]<br>
                                                          <br>
                                                          for.body:<br>
                                                            %1 = load
                                                          float, float*
                                                          %arrayidx1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %cmp.i =
                                                          fcmp olt float
                                                          %max_value.0,
                                                          %1<br>
                                                           
                                                          %.sroa.speculate.load.true
                                                          = load float,
                                                          float*
                                                          %arrayidx1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                           
                                                          %.sroa.speculated
                                                          = select i1
                                                          %cmp.i, float
                                                          %.sroa.speculate.load.true,
                                                          float
                                                          %max_value.0<br>
                                                          </div>
                                                          </div>
                                                        </div>
_______________________________________________<br>
                                                        LLVM Developers
                                                        mailing list<br>
                                                        <a
                                                          moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a></a><br>
                                                        <a
                                                          moz-do-not-send="true"
href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                                                          target="_blank"><a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a></a><br>
                                                      </div>
                                                    </blockquote>
                                                  </div>
                                                  <br>
                                                </div>
                                              </div>
                                              <br>
_______________________________________________<br>
                                              LLVM Developers mailing
                                              list<br>
                                              <a moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
                                              <a moz-do-not-send="true"
href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"
                                                rel="noreferrer"
                                                target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
                                              <br>
                                            </blockquote>
                                          </div>
                                        </div>
                                      </div>
                                      <br>
                                    </div>
                                  </div>
                                </blockquote>
                              </div>
                              <br>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                  </div>
                </div>
                <br>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>