<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-cite-prefix">On 03/25/2014 06:21 PM, Matt Arsenault

      wrote:<br>

    </div>

    <blockquote cite="mid:53320153.9080106@amd.com" type="cite">


      <div class="moz-cite-prefix">On 03/25/2014 02:31 PM, Jingyue Wu

        wrote:<br>

      </div>

      <blockquote

cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"

        type="cite">


        <div dir="ltr">

          <div class="gmail_quote">

            <div dir="ltr"><br>

              <div>However, we have three concerns about this:</div>

              <div>a) I doubt this optimization is valid for all

                targets, because the LLVM language reference (<a

                  moz-do-not-send="true"

                  href="http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction"

                  target="_blank">http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction</a>)

                says addrspacecast "can be a no-op cast or a complex

                value modification, depending on the target and the

                address space pair." <br>

              </div>

            </div>

          </div>

        </div>

      </blockquote>

      I think most of the simple cast optimizations would be acceptable.

      The addrspacecasted pointer still needs to point to the same

      memory location, so changing an access to use a different address

      space would be OK. I think canonicalizing accesses to use the

      original address space of a casted pointer when possible would

      make sense.<br>

      <br>

      <blockquote

cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_quote">

            <div dir="ltr">

              <div>b) NVPTX and R600 have different address numbering

                for the generic address space, which makes things more

                complicated. </div>

              <div>c) We don't have a good understanding of the R600

                backend. </div>

              <br>

            </div>

          </div>

        </div>

      </blockquote>

      <br>

      R600 currently does not support the flat address space

      instructions intended for use with the generic address space. I

      posted a patch a while ago that half added it, which I can try to

      work on finishing if it would help.<br>

      <br>

      I also do not understand how NVPTX uses address spaces,

      particularly how it can use 0 as the generic address space.<br>

    </blockquote>

    <br>

    We handle alloca by expanding it to a local stack reservation plus a

    pointer conversion to the generic address space.  So if we have IR

    like the following:<br>

    <tt><br>

    </tt><tt>%ptr = alloca i32</tt><tt><br>

    </tt><tt>store i32 0, i32* %ptr</tt><br>

    <br>

    This actually gets expanded to something like the following at the

    MachineInstr level (in pseudo-code):<br>

    <br>

    <tt>%local_ptr = %SP+offset    ; Stack pointer (in thread-local

      [private] address space)</tt><tt><br>

    </tt><tt>%ptr = convert %local_ptr to generic address</tt><tt><br>

    </tt><tt>store.generic.i32 [%ptr], 0</tt><br>

    <br>

    With the proposed optimization, this would be optimized back to a

    non-generic store:<br>

    <tt><br>

    </tt><tt>%local_ptr = %SP+offset</tt><tt><br>

    </tt><tt>%ptr = convert %local_ptr to generic address</tt><tt><br>

    </tt><tt>%ptr.0 = convert %ptr to thread-local address space</tt><tt><br>

    </tt><tt>store.local.i32 [%ptr.0], 0</tt><br>

    <br>

    This turns the address space conversion sequence into a no-op

    (assuming no other users) that can be eliminated, and a non-generic

    store is likely to be more efficient than a generic store.<br>
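    <br>
    Assuming <tt>%ptr</tt> has no other users, the sequence left after
    eliminating the conversion pair would look roughly like this (same
    pseudo-code as above, not actual backend output):<br>
    <br>
    <tt>%local_ptr = %SP+offset</tt><tt><br>
    </tt><tt>store.local.i32 [%local_ptr], 0</tt><br>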

    <br>

    <blockquote cite="mid:53320153.9080106@amd.com" type="cite"> <br>

      <blockquote

cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div class="gmail_quote">

            <div dir="ltr">

              <div>2. How effective do we want this optimization to be? </div>

              <div><br>

              </div>

              <div>In the short term, I want it to be able to eliminate

                unnecessary non-generic-to-generic addrspacecasts the

                front-end generates for the NVPTX target. For example, <br>

              </div>

              <div><br>

              </div>

              <div>%p1 = addrspacecast i32 addrspace(3)* %p0 to i32*</div>

              <div>%v = load i32* %p1</div>

              <div><br>

              </div>

              <div>=></div>

              <div><br>

              </div>

              <div>%v = load i32 addrspace(3)* %p0</div>

              <div><br>

              </div>

              <div>We want similar optimization for store+addrspacecast

                and gep+addrspacecast as well. </div>

              <div><br>

              </div>

              <div>In the long term, we could certainly improve this

                optimization to handle more instructions and more

                patterns. </div>

              <span></span><br>

            </div>

          </div>

        </div>

      </blockquote>

      I believe most of the cast simplifications that apply to bitcasts

      of pointers also apply to addrspacecast. I have some patches

      waiting that extend some of the more basic ones to understand

      addrspacecast (e.g. <a moz-do-not-send="true"

        class="moz-txt-link-freetext"

href="http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html">http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html</a>),

      plus a few more that I haven't posted yet. Mostly they are little

      cast simplifications like your example in instcombine, but also

      SROA to eliminate allocas that are addrspacecasted.<br>

      <br>

      -Matt<br>

    </blockquote>
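    <br>
    For the store+addrspacecast and gep+addrspacecast cases mentioned in
    the quoted text, the rewrite might look like the sketch below. It
    reuses the typed-pointer IR syntax and address space 3 from the
    quoted load example; the value names (<tt>%v</tt>, <tt>%i</tt>,
    <tt>%q</tt>, <tt>%q0</tt>) are only illustrative, and the GEP case
    assumes the target lets the cast commute with pointer arithmetic for
    this address space pair:<br>
    <br>
    <tt>; store through a casted pointer</tt><br>
    <tt>%p1 = addrspacecast i32 addrspace(3)* %p0 to i32*</tt><br>
    <tt>store i32 %v, i32* %p1</tt><br>
    <tt>; =&gt;</tt><br>
    <tt>store i32 %v, i32 addrspace(3)* %p0</tt><br>
    <br>
    <tt>; GEP on a casted pointer, followed by a load</tt><br>
    <tt>%p1 = addrspacecast i32 addrspace(3)* %p0 to i32*</tt><br>
    <tt>%q = getelementptr i32* %p1, i64 %i</tt><br>
    <tt>%v = load i32* %q</tt><br>
    <tt>; =&gt;</tt><br>
    <tt>%q0 = getelementptr i32 addrspace(3)* %p0, i64 %i</tt><br>
    <tt>%v = load i32 addrspace(3)* %q0</tt><br>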

    <br>


</body>

</html>