<div dir="ltr">Ultimately everything is going to be made to not rely on the types of pointers - that's nearly equivalent to bitcast-ignorant (the difference being that the presence of an extra instruction (the bitcast) might trip up some optimizations - but the presence of the /type/ information implied by the bitcast should not trip up or be necessary for optimizations (two sides of the same coin))<br><br>If we're talking about making an optimization able to ignore the bitcast instructions - yes, that work is unnecessary & perhaps questionable given the typeless pointer work. Not outright off limits, but the same time might be better invested in moving typeless pointers forward if the contributor is so inclined/able to shift in that direction.<br><br>But if we're talking about the work to make the optimization not use the type of pointers as a crutch - that work is a necessary precursor to the typeless pointer work and would be most welcome.<br><br>- David</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 22, 2016 at 12:39 PM, Mehdi Amini <span dir="ltr"><<a href="mailto:mehdi.amini@apple.com" target="_blank">mehdi.amini@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">I don't really mind, but the intermediate stage will not be very nice: that a lot of code / tests that needs to be written with bitcast, and all of that while they are deemed to disappear. The added value isn't clear to me considering the added work. I'm not sure it wouldn't add more work for all the cleanup required by the "typeless pointer", but I'm not sure what's involved here and if David thinks the intermediate steps of handling bit casts everywhere is not making it harder I'm fine with it.<span class="HOEnZb"><font color="#888888"><div><br></div><div>-- </div><div>Mehdi</div></font></span><div><div class="h5"><div><br><div><blockquote type="cite"><div>On Mar 22, 2016, at 12:36 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>> wrote:</div><br><div>
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    I'd phrase this differently: being pointer-bitcast agnostic is a
    step towards support typeless pointers.  :)  We can either become
    bitcast agnostic all in one big change or incrementally. 
    Personally, I'd prefer the later since it reduces the risk
    associated with enabling typeless pointers in the end.<br>
    <br>
    Philip<br>
    <br>
    <div>On 03/22/2016 12:16 PM, Mehdi Amini via
      llvm-dev wrote:<br>
    </div>
    <blockquote type="cite">
      
      I don't know enough about the tradeoff for 1, but 2 seems like a
      bandaid for something that is not a correctness issue neither a
      regression. I'm not sure it justifies "bandaid patches" while
      there is a clear path forward, i.e. typeless pointers, unless
      there is an acknowledgement that typeless pointers won't be there
      before a couple of years.
      <div>
        <div><br>
        </div>
        <div>-- </div>
        <div>Mehdi</div>
        <div><br>
          <div>
            <div>
              <blockquote type="cite">
                <div>On Mar 22, 2016, at 11:32 AM, Ehsan Amiri
                  <<a href="mailto:ehsanamiri@gmail.com" target="_blank">ehsanamiri@gmail.com</a>>
                  wrote:</div>
                <br>
                <div>
                  <div dir="ltr">
                    <div>
                      <div>
                        <div>Back to the discussion on the RFC,
                          I still see some advantage in following the
                          proposed solution. I see two paths forward:<br>
                          <br>
                        </div>
                        1- Change canonical form, possibly lower memcpy
                        to non-integer load and store in InstCombine.
                        Then teach the backends to convert that to
                        integer load and store if that is more
                        profitable. Notice that we are talking about
                        loads that have no use other than store. So it
                        is a fairly simple change in the backends.<br>
                        <br>
                      </div>
                      2- Do not change the canonical form. Then for this
                      bug, we need to teach select speculation to see
                      through bitcasts. We will probably need to teach
                      other optimizations to see though bitcasts in the
                      future as problems are uncovered. That is until
                      typeless pointer work is complete. Once the
                      typeless pointer work is complete, we have some
                      extra code in each optimization for seeing through
                      bitcasts which is possibly no longer needed.<br>
                      <br>
                    </div>
                    <div>Based on this I think (1) is the right
                      thing to do. But there might be other reasons for
                      the current canonical form that I am not aware of.
                      Please let me know what you think.<br>
                      <br>
                    </div>
                    <div>Thanks<br>
                    </div>
                    Ehsan<br>
                  </div>
                  <div class="gmail_extra"><br>
                    <div class="gmail_quote">On Wed, Mar 16, 2016 at
                      2:13 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank"></a><a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span>
                      wrote:<br>
                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                        <div dir="ltr"><br>
                          <div class="gmail_extra"><br>
                            <div class="gmail_quote"><span>On
                                Wed, Mar 16, 2016 at 11:00 AM, Ehsan
                                Amiri <span dir="ltr"><<a href="mailto:ehsanamiri@gmail.com" target="_blank"></a><a href="mailto:ehsanamiri@gmail.com" target="_blank">ehsanamiri@gmail.com</a>></span>
                                wrote:<br>
                                <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                  <div dir="ltr">
                                    <div>
                                      <div>
                                        <div>David,<br>
                                          <br>
                                        </div>
                                        Could you give us an update on
                                        the status of typeless pointer
                                        work? How much work is left and
                                        when you think it might be
                                        ready?<br>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <div><br>
                                </div>
                              </span>
                              <div>It's a bit of an onion peel,
                                really - since it will eventually
                                involve generalizing/fixing every
                                optimization that's currently leaning on
                                typed pointers to keep the performance
                                while removing the crutch they're
                                currently leaning on. (in cases where
                                bitcasts are literally just getting in
                                the way, those won't require cleaning up
                                & should just become "free
                                performance wins" once we remove them,
                                though)<br>
                                <br>
                                At the moment we can roundtrip every
                                LLVM IR test case through bitcode and
                                textual IR (reading/writing both
                                formats) while using only a narrow
                                whitelist of places that request the
                                type of a pointer (things like the
                                verifier, the parser/printer where it
                                actually needs the typed pointer to
                                verify it matches the explicit type,
                                etc).<br>
                                <br>
                                The next thing on the list is probably
                                figuring out the byval/inalloca
                                representation (add an explicit pointee
                                type? just make the number of bytes
                                explicit with no type information?).<br>
                                <br>
                                Then start migrating optimizations over
                                - doing the same sort of testing I did
                                for the IR/bitcode roundtripping -
                                assert that the pointee type is not
                                accessed, whitelist places that need it
                                until the bitcasts go away, fix anything
                                else... it'll still be a fair bit of
                                work & I don't really know how much.
                                It should parallelize pretty well (doing
                                any of this work is really helpful, each
                                optimization is indepednent, etc) if
                                anyone wants to/is able to help.<br>
                                <br>
                                - Dave</div>
                              <div>
                                <div>
                                  <div> </div>
                                  <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                    <div dir="ltr">
                                      <div>
                                        <div><br>
                                        </div>
                                        Thanks<span><font color="#888888"><br>
                                          </font></span></div>
                                      <span><font color="#888888">Ehsan<br>
                                          <br>
                                        </font></span></div>
                                    <div>
                                      <div>
                                        <div class="gmail_extra"><br>
                                          <div class="gmail_quote">On
                                            Wed, Mar 16, 2016 at 1:12
                                            PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank"></a><a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span>
                                            wrote:<br>
                                            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                              <div dir="ltr"><br>
                                                <div class="gmail_extra"><br>
                                                  <div class="gmail_quote"><span>On Wed,
                                                      Mar 16, 2016 at
                                                      8:34 AM, Mehdi
                                                      Amini via llvm-dev
                                                      <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank"></a><a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
                                                      wrote:<br>
                                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                        <div style="word-wrap:break-word">Hi,
                                                          <div><br>
                                                          </div>
                                                          <div>How
                                                          do it interact
                                                          with the
                                                          "typeless
                                                          pointers"
                                                          work?</div>
                                                        </div>
                                                      </blockquote>
                                                      <div><br>
                                                      </div>
                                                    </span>
                                                    <div>Right
                                                      - the goal of the
                                                      typeless pointer
                                                      work is to fix all
                                                      these bugs related
                                                      to "didn't look
                                                      through bitcasts"
                                                      in optimizations.
                                                      Sometimes that's
                                                      going to mean more
                                                      work (because the
                                                      code is leaning on
                                                      the absence of
                                                      bitcasts & the
                                                      presence of
                                                      convenient (but
                                                      not guaranteed)
                                                      type information
                                                      to inform
                                                      optimization
                                                      decisions) but if
                                                      we remove typed
                                                      pointers while
                                                      keeping
                                                      optimization
                                                      quality in the
                                                      cases we have
                                                      today, then we
                                                      should've also
                                                      fixed the cases
                                                      that were broken
                                                      because the type
                                                      information didn't
                                                      end up aligning to
                                                      produce the
                                                      optimal output.<br>
                                                      <br>
                                                      & I know I've
                                                      been off the
                                                      typeless pointer
                                                      stuff for a few
                                                      months working on
                                                      llvm-dwp - but
                                                      happy for any help
                                                      (the next
                                                      immediate piece is
                                                      probably figuring
                                                      out teh right
                                                      representation for
                                                      byval and inalloca
                                                      - there were some
                                                      code reviews sent
                                                      out for that that
                                                      I'll need to come
                                                      back around to -
                                                      but also any
                                                      optimizations
                                                      people want to
                                                      help
                                                      rework/improve
                                                      would be great too
                                                      & I can
                                                      provide some
                                                      techniques/tools
                                                      to help people
                                                      approach those)<br>
                                                      <br>
                                                      - Dave</div>
                                                    <div>
                                                      <div>
                                                        <div> </div>
                                                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div style="word-wrap:break-word">
                                                          <div><br>
                                                          </div>
                                                          <div>Thanks,</div>
                                                          <div><br>
                                                          </div>
                                                          <div>-- </div>
                                                          <div>Mehdi</div>
                                                          <div><br>
                                                          <div>
                                                          <blockquote type="cite">
                                                          <div>
                                                          <div>
                                                          <div>On
                                                          Mar 16, 2016,
                                                          at 6:41 AM,
                                                          Ehsan Amiri
                                                          via llvm-dev
                                                          <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank"></a><a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
                                                          wrote:</div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div dir="ltr">===
                                                          PROBLEM ===
                                                          (See this bug
                                                          <a href="https://llvm.org/bugs/show_bug.cgi?id=26445" target="_blank"></a><a href="https://llvm.org/bugs/show_bug.cgi?id=26445" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=26445</a>)<br>
                                                          <br>
                                                          IR contains
                                                          code for
                                                          loading a
                                                          float from
                                                          float * and
                                                          storing it to
                                                          a float *
                                                          address. After
                                                          canonicalization
                                                          of load in
                                                          InstCombine
                                                          [1], new
                                                          bitcasts are
                                                          added to the
                                                          IR (see bottom
                                                          of the email
                                                          for code
                                                          samples). This
                                                          prevents
                                                          select
                                                          speculation in
                                                          SROA to work.
                                                          Also after
                                                          SROA we have
                                                          bitcasts from
                                                          int32 to
                                                          float.
                                                          (Whereas
                                                          originally
                                                          after
                                                          instCombine,
                                                          bitcasts are
                                                          only done on
                                                          pointer
                                                          types).<br>
                                                          <br>
                                                          === PROPOSED
                                                          SOLUTION===<br>
                                                          <br>
                                                          [1] implies
                                                          that we need
                                                          load
                                                          canonicalization
                                                          when we load a
                                                          value only to
                                                          store it
                                                          again. The
                                                          reason is to
                                                          avoid
                                                          generating
                                                          slightly
                                                          different code
                                                          (due to
                                                          different ways
                                                          of adding
                                                          bitcasts), in
                                                          different
                                                          situations. In
                                                          all examples
                                                          presented in
                                                          [1] there is a
                                                          non-zero
                                                          number of
                                                          bitcasts. I
                                                          think when we
                                                          load a value
                                                          of type T from
                                                          a T* address
                                                          and store it
                                                          as a type T
                                                          value to one
                                                          or more T*
                                                          address (and
                                                          there is no
                                                          other use or
                                                          store), we can
                                                          redefine
                                                          canonical form
                                                          to mean there
                                                          should not be
                                                          any bitcasts.
                                                          So we still
                                                          have a
                                                          canonical
                                                          form, but its
                                                          definition is
                                                          slightly
                                                          different.<br>
                                                          <br>
                                                          === REASONS
                                                          FOR /
                                                          AGAINST===<br>
                                                          <br>
                                                          Hal Finkel
                                                          warns that
                                                          while this may
                                                          be useful for
                                                          power pc, this
                                                          may hurt more
                                                          than one other
                                                          platform and
                                                          become a very
                                                          large project.
                                                          Despite this
                                                          he is fine
                                                          with bringing
                                                          up the issue
                                                          to the mailing
                                                          list to get
                                                          feedback,
                                                          mostly because
                                                          this seems
                                                          inline with
                                                          our future
                                                          direction of
                                                          having a
                                                          unique type
                                                          for all
                                                          pointers. 
                                                          (Hal please
                                                          correct me if
                                                          I
                                                          misunderstood
                                                          your comment)<br>
                                                          <br>
                                                          This is a much
                                                          simpler fix
                                                          compared to
                                                          alternatives.
                                                          (ignoring
                                                          potential
                                                          regressions)<br>
                                                          <br>
                                                          ===
                                                          ALTERNATIVE
                                                          SOLUTION ===<br>
                                                          <br>
                                                          Fix select
                                                          speculation in
                                                          SROA to see
                                                          through
                                                          bitcasts.
                                                          Handle
                                                          remaining
                                                          bitcasts
                                                          during code
                                                          gen. Other
                                                          alternative
                                                          solutions are
                                                          welcome.<br>
                                                          <br>
                                                          Should I
                                                          implement the
                                                          proposed
                                                          solution or is
                                                          it too risky?
                                                          I understand
                                                          that we may
                                                          need to undo
                                                          it if it
                                                          breaks too
                                                          many things.
                                                          Comments are
                                                          welcome.<br>
                                                          <br>
                                                          <br>
                                                          [1] <a href="http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html" target="_blank"></a><a href="http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html" target="_blank">http://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html</a> 
                                                          r226781  git
                                                          commit id:
                                                          b778cbc0c8<br>
                                                          <br>
                                                          <br>
                                                          <br>
                                                          Code Samples
                                                          (only relevant
                                                          part is
                                                          copied):<br>
                                                          <br>
                                                          -------------------- 
                                                          Before
                                                          Canonicalization
                                                          (contains call
                                                          to std::max):
                                                          -------------------- 
                                                          <br>
                                                          entry:<br>
                                                            %max_value =
                                                          alloca float,
                                                          align 4<br>
                                                            %1 = load
                                                          float, float*
                                                          %input, align
                                                          4, !tbaa !1<br>
                                                            store float
                                                          %1, float*
                                                          %max_value,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          for.body:<br>
                                                            %call = call
                                                          dereferenceable(4)
                                                          float*
                                                          @_ZSt3maxIfERKT_S2_S2_(float*
                                                          dereferenceable(4)
                                                          %max_value,
                                                          float*
                                                          dereferenceable(4)
                                                          %arrayidx1)<br>
                                                            %3 = load
                                                          float, float*
                                                          %call, align
                                                          4, !tbaa !1<br>
                                                            store float
                                                          %3, float*
                                                          %max_value,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          -------------------- 
                                                          After
                                                          Canonicalization
                                                          (contains call
                                                          to
                                                          std::max):-------------------- 
                                                          <br>
                                                          <br>
                                                          entry:<br>
                                                            %max_value =
                                                          alloca float,
                                                          align 4<br>
                                                            %1 = bitcast
                                                          float* %input
                                                          to i32*<br>
                                                            %2 = load
                                                          i32, i32* %1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %3 = bitcast
                                                          float*
                                                          %max_value to
                                                          i32*<br>
                                                            store i32
                                                          %2, i32* %3,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          for.body:<br>
                                                            %call = call
                                                          dereferenceable(4)
                                                          float*
                                                          @_ZSt3maxIfERKT_S2_S2_(float*
                                                          nonnull
                                                          dereferenceable(4)
                                                          %max_value,
                                                          float*
                                                          dereferenceable(4)
                                                          %arrayidx1)<br>
                                                            %5 = bitcast
                                                          float* %call
                                                          to i32*<br>
                                                            %6 = load
                                                          i32, i32* %5,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %7 = bitcast
                                                          float*
                                                          %max_value to
                                                          i32*<br>
                                                            store i32
                                                          %6, i32* %7,
                                                          align 4, !tbaa
                                                          !1<br>
                                                          <br>
                                                          --------------------
                                                          After SROA
                                                          (the call to
                                                          std::max is
                                                          inlined
                                                          now):-------------------- 
                                                          <br>
                                                          entry:<br>
                                                           
                                                          %max_value.sroa.0
                                                          = alloca i32<br>
                                                            %0 = bitcast
                                                          float* %input
                                                          to i32*<br>
                                                            %1 = load
                                                          i32, i32* %0,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            store i32
                                                          %1, i32*
                                                          %max_value.sroa.0<br>
                                                          <br>
                                                          for.body: <br>
                                                           
                                                          %max_value.sroa.0.0.max_value.sroa.0.0.6
                                                          = load i32,
                                                          i32*
                                                          %max_value.sroa.0<br>
                                                            %3 = bitcast
                                                          i32
                                                          %max_value.sroa.0.0.max_value.sroa.0.0.6
                                                          to float<br>
                                                           
                                                          %max_value.sroa.0.0.max_value.sroa_cast8
                                                          = bitcast i32*
                                                          %max_value.sroa.0
                                                          to float*<br>
                                                            %__b.__a.i =
                                                          select i1
                                                          %cmp.i, float*
                                                          %arrayidx1,
                                                          float*
                                                          %max_value.sroa.0.0.max_value.sroa_cast8<br>
                                                            %5 = bitcast
                                                          float*
                                                          %__b.__a.i to
                                                          i32*<br>
                                                            %6 = load
                                                          i32, i32* %5,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            store i32
                                                          %6, i32*
                                                          %max_value.sroa.0<br>
                                                          <br>
                                                          --------------------
                                                          After SROA
                                                          when
                                                          Canonicalization
                                                          is turned
                                                          off-------------------- 
                                                          <br>
                                                          entry:<br>
                                                            %0 = load
                                                          float, float*
                                                          %input, align
                                                          4, !tbaa !1<br>
                                                          <br>
                                                          for.cond:                                        
                                                          ; preds =
                                                          %for.body,
                                                          %entry<br>
                                                            %max_value.0
                                                          = phi float [
                                                          %0, %entry ],
                                                          [
                                                          %.sroa.speculated,
                                                          %for.body ]<br>
                                                          <br>
                                                          for.body:<br>
                                                            %1 = load
                                                          float, float*
                                                          %arrayidx1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                            %cmp.i =
                                                          fcmp olt float
                                                          %max_value.0,
                                                          %1<br>
                                                           
                                                          %.sroa.speculate.load.true
                                                          = load float,
                                                          float*
                                                          %arrayidx1,
                                                          align 4, !tbaa
                                                          !1<br>
                                                           
                                                          %.sroa.speculated
                                                          = select i1
                                                          %cmp.i, float
                                                          %.sroa.speculate.load.true,
                                                          float
                                                          %max_value.0<br>
                                                          </div>
                                                          </div>
                                                          </div>
_______________________________________________<br>
                                                          LLVM
                                                          Developers
                                                          mailing list<br>
                                                          <a href="mailto:llvm-dev@lists.llvm.org" target="_blank"></a><a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
                                                          <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank"></a><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          <br>
_______________________________________________<br>
                                                          LLVM
                                                          Developers
                                                          mailing list<br>
                                                          <a href="mailto:llvm-dev@lists.llvm.org" target="_blank"></a><a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
                                                          <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank"></a><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
                                                          <br>
                                                        </blockquote>
                                                      </div>
                                                    </div>
                                                  </div>
                                                  <br>
                                                </div>
                                              </div>
                                            </blockquote>
                                          </div>
                                          <br>
                                        </div>
                                      </div>
                                    </div>
                                  </blockquote>
                                </div>
                              </div>
                            </div>
                            <br>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                    <br>
                  </div>
                </div>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
      <br>
      <fieldset></fieldset>
      <br>
      <pre>_______________________________________________
LLVM Developers mailing list
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
    </blockquote>
    <br>
  </div>

</div></blockquote></div><br></div></div></div></div></blockquote></div><br></div>