<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    I think after reading your link I'm actually more confused.  This

    might just be a wording problem, but let me ask a couple of

    clarifying questions.<br>

    <br>

    1) After compiling the code sequence below (from that page), does

    the in-memory bit pattern at %y differ from the one at %x?  The page

    seemed to contradict itself.  <br>

    <pre>%0 = load <4 x i32> %x

%1 = bitcast <4 x i32> %0 to <2 x i64>

     store <2 x i64> %1, <2 x i64>* %y

</pre>

    2) If so, does this mean that performing dead-store-elimination is

    illegal for ARM?<br>

    <br>

    3) Are loads and stores ever allowed to fault based on the in-memory

    representation?  <br>

    <br>

    4) What happens if we have a load of <2 x i64> following the

    store above and we DSE the store before forwarding its value?<br>

    <br>

    Philip<br>

    <br>

    <br>

    <div class="moz-cite-prefix">On 01/12/2016 05:55 AM, James Molloy

      via llvm-dev wrote:<br>

    </div>

    <blockquote

cite="mid:CALCTSA0Anj6jmk9Am+aKeEFPRHGeJoqkiinHO5eNnPDWZPpuVg@mail.gmail.com"

      type="cite">

      <div dir="ltr">Hi,

        <div><br>

        </div>

        <div>

          <div class="uyb8Gf">

            <div class="F3hlO">

              <div link="blue" vlink="purple" lang="EN-GB">

                <p class="MsoNormal"><span

                    style="font-size:11pt;font-family:Calibri,sans-serif">>

                    I found this thinking quite difficult to explain.

                    Does it make sense?</span></p>

                <div><span

                    style="font-size:11pt;font-family:Calibri,sans-serif">It

                    might help to link to the documentation on why

                    bitcasts are weird on big-endian NEON: </span><font

                    face="Calibri, sans-serif"><span

                      style="font-size:14.6667px;line-height:22px"><a

                        moz-do-not-send="true"

                        href="http://llvm.org/docs/BigEndianNEON.html#bitconverts">http://llvm.org/docs/BigEndianNEON.html#bitconverts</a></span></font></div>

                <div><font face="Calibri, sans-serif"><span

                      style="font-size:14.6667px;line-height:22px"><br>

                    </span></font></div>

                <div><font face="Calibri, sans-serif"><span

                      style="font-size:14.6667px;line-height:22px">Cheers,</span></font></div>

                <div><font face="Calibri, sans-serif"><span

                      style="font-size:14.6667px;line-height:22px"><br>

                    </span></font></div>

                <div><font face="Calibri, sans-serif"><span

                      style="font-size:14.6667px;line-height:22px">James</span></font></div>

              </div>

            </div>

          </div>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr">On Tue, 12 Jan 2016 at 13:23 Daniel Sanders via

          llvm-dev <<a moz-do-not-send="true"

            href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0 0 0

          .8ex;border-left:1px #ccc solid;padding-left:1ex">

          <div link="blue" vlink="purple" lang="EN-GB">

            <div>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Hi,</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                  haven't found much time to look into the LLVM-IR-level

                  optimizations yet so I'm not sure how they handle

                  bitcasts. With that disclaimer in mind, I expect it's

                  fine for the LLVM-IR level optimizations to handle

                  them using either definition since they are equivalent

                  at the LLVM-IR level. My thinking is that LLVM-IR is

                  consistent about how virtual bits are assigned to

                  types, and that no-ops requiring a non-zero number of

                  instructions arise only where there is inconsistency.</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">At

                  the LLVM-IR level, bits 0-127 of <4 x i32> map

                  directly onto bits 0-127 of <2 x i64> using the

                  identity map. It's therefore ok to interpret such

                  bitcasts as zero-instruction no-ops. As far as I can

                  tell, LLVM-IR has been defined such that the identity

                  map can be used for bitcasts between all same-sized

                  types, and also such that bitcasting between

                  different-sized types is invalid.</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Similarly,

                  most targets have a single mapping of virtual bit

                  numbers to physical bit numbers for each size that is

                  applied consistently when mapping a type to memory.

                  For example 32-bits map like so:</span></p>

              <p class="MsoNormal" style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Little

                  Endian Targets: virtual register bits

                  {0..7,8..15,16..23,24..31} map to physical memory bits

                  {0..7,8..15,16..23,24..31}</span></p>

              <p class="MsoNormal" style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Big

                  Endian Targets: virtual register bits

                  {0..7,8..15,16..23,24..31} map to physical memory bits

                  {24..31,16..23,8..15,0..7}</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">regardless

                  of whether it's a float, or an i32. We therefore need

                  zero instructions to re-map physical memory bits for

                  one type onto another type.</span></p>
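              <p class="MsoNormal">A minimal sketch of the point above, assuming 1.0f as an example value (its IEEE-754 single-precision bit pattern is 0x3F800000). The helper name <code>memory_image</code> is hypothetical; it only models how a 32-bit value is laid out in memory and is not an LLVM API.</p>
<pre>
```python
import struct

# Hypothetical example value: 1.0f has the IEEE-754 single-precision
# bit pattern 0x3F800000.
BITS = 0x3F800000


def memory_image(fmt_char, value, big_endian):
    """Return the four bytes a 32-bit value occupies in memory."""
    prefix = ">" if big_endian else "<"
    return struct.pack(prefix + fmt_char, value)


for big_endian in (False, True):
    as_i32 = memory_image("I", BITS, big_endian)
    as_f32 = memory_image("f", 1.0, big_endian)
    # Both types produce the same bytes for a given endianness, so an
    # i32 to float bitcast needs no instructions on either target.
    print("big" if big_endian else "little", as_i32.hex(), as_f32.hex())
```
</pre>
              <p class="MsoNormal">The byte order changes with endianness, but it changes the same way for both types, which is why the two definitions of bitcast agree here.</p>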

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">The

                  same idea holds for physical register classes. There's

                  a single consistent mapping from physical memory bits

                  to physical register bits that applies for all types

                  that can be stored in that class. As long as this is

                  the case the load/store and zero-instruction

                  interpretation of bitcasts are equivalent.</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">In

                  the case of big-endian MSA and NEON, there isn't a

                  single consistent mapping from physical memory bits to

                  physical register bits so the equivalence in the two

                  definitions breaks down:</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">               

                  i128: virtual register bits {0..31, 32..63, 64..95,

                  96..127} map to physical memory bits {96..127,

                  64..95, 32..63, 0..31}</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">               

                  <4 x i32>: virtual register bits {0..31, 32..63,

                  64..95, 96..127} map to physical memory bits {0..31,

                  32..63, 64..95, 96..127}</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">               

                  <2 x i64>: virtual register bits {0..31, 32..63,

                  64..95, 96..127} map to physical memory bits {32..63,

                  0..31, 96..127, 64..95}</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">with

                  these inconsistent mappings we require instructions to

                  bitcast between the types.</span></p>
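              <p class="MsoNormal">A rough sketch of how the three mappings listed above can be derived, assuming that lane 0 holds the lowest-numbered virtual bits, that lanes are stored in ascending address order, and that memory bits are numbered from the lowest address. The names <code>big_endian_chunk_map</code> and <code>describe</code> are hypothetical, and the byte order inside each 32-bit chunk is ignored, just as it is in the lists above.</p>
<pre>
```python
CHUNK_BITS = 32
TOTAL_BITS = 128


def big_endian_chunk_map(lane_bits):
    """For each 32-bit chunk of virtual register bits (lowest first),
    return the index of the 32-bit chunk of memory it lands in on a
    big-endian target."""
    chunks_per_lane = lane_bits // CHUNK_BITS
    mapping = []
    for chunk in range(TOTAL_BITS // CHUNK_BITS):
        lane, within = divmod(chunk, chunks_per_lane)
        # Lanes keep their order; chunks inside a lane are reversed.
        mapping.append(lane * chunks_per_lane + (chunks_per_lane - 1 - within))
    return mapping


def describe(mapping):
    """Render chunk indices as bit ranges such as '32..63'."""
    return [f"{c * CHUNK_BITS}..{c * CHUNK_BITS + CHUNK_BITS - 1}" for c in mapping]


for name, lane_bits in (("i128", 128), ("4 x i32", 32), ("2 x i64", 64)):
    print(name, describe(big_endian_chunk_map(lane_bits)))
```
</pre>
              <p class="MsoNormal">Because the three lists differ, moving a value from one type's layout to another means permuting 32-bit chunks, which is where the extra instructions come from.</p>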

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                  found this thinking quite difficult to explain. Does

                  it make sense?</span></p>

            </div>

          </div>

          <div link="blue" vlink="purple" lang="EN-GB">

            <div>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">>

                </span>I am fine with treating bit casts as equivalent

                store/load pairs in GISel, I just want to be sure we do

                not have a semantic gap between the LLVM-IR and the

                backend if we do.</p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

            </div>

          </div>

          <div link="blue" vlink="purple" lang="EN-GB">

            <div>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                  think a gap would arise from not having a GISel

                  equivalent to ISD::BITCAST (gBITCAST?) available when

                  it's necessary for correctness. However, I agree that

                  GISel should delete bitcasts for the common case where

                  the store/load and zero-instruction definitions are

                  equivalent.</span></p>

              <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

              <div style="border:none;border-left:solid blue

                1.5pt;padding:0cm 0cm 0cm 4.0pt">

                <div>

                  <div style="border:none;border-top:solid #b5c4df

                    1.0pt;padding:3.0pt 0cm 0cm 0cm">

                    <p class="MsoNormal"><b><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                          lang="EN-US">From:</span></b><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                        lang="EN-US"> Quentin Colombet [mailto:<a

                          moz-do-not-send="true"

                          href="mailto:qcolombet@apple.com"

                          target="_blank">qcolombet@apple.com</a>]

                        <br>

                        <b>Sent:</b> 11 January 2016 17:23<br>

                        <b>To:</b> Daniel Sanders<br>

                        <b>Cc:</b> Tim Northover (<a

                          moz-do-not-send="true"

                          href="mailto:t.p.northover@gmail.com"

                          target="_blank">t.p.northover@gmail.com</a>);

                        llvm-dev</span></p>

                  </div>

                </div>

              </div>

            </div>

          </div>

          <div link="blue" vlink="purple" lang="EN-GB">

            <div>

              <div style="border:none;border-left:solid blue

                1.5pt;padding:0cm 0cm 0cm 4.0pt">

                <div>

                  <div style="border:none;border-top:solid #b5c4df

                    1.0pt;padding:3.0pt 0cm 0cm 0cm">

                    <p class="MsoNormal"><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                        lang="EN-US"><br>

                        <b>Subject:</b> Re: [llvm-dev] [GlobalISel] A

                        Proposal for global instruction selection</span></p>

                  </div>

                </div>

              </div>

            </div>

          </div>

          <div link="blue" vlink="purple" lang="EN-GB">

            <div>

              <div style="border:none;border-left:solid blue

                1.5pt;padding:0cm 0cm 0cm 4.0pt">

                <p class="MsoNormal"> </p>

                <p class="MsoNormal">Hi Daniel,</p>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">Thanks for the pointers, I wasn’t

                    aware of the second thread you’ve mentioned.</p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">I may be wrong but I think

                    LLVM-IR optimizations really treat bitcasts as

                    no-op casts, in the sense that no instructions are

                    required.</p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">Is there anyone that could chime

                    in on that?</p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">However, it seems SelectionDAG

                    sticks to the load/store semantics:</p>

                </div>

                <div>

                  <p class="MsoNormal"><span

                      style="font-size:10.0pt;font-family:"Lucida

                      Grande","serif";background:#fbfcfd">"BITCAST

                      - This operator converts between integer, vector

                      and FP values, as if the value was

                      <b>stored to memory with one type and loaded from

                        the same address with the other type</b> (or

                      equivalently for vector format conversions, etc)."</span></p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">I am fine with treating bit casts

                    as equivalent store/load pairs in GISel, I just want

                    to be sure we do not have a semantic gap between the

                    LLVM-IR and the backend if we do.</p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                </div>

                <div>

                  <p class="MsoNormal">Thanks,</p>

                </div>

                <div>

                  <p class="MsoNormal">-Quentin</p>

                </div>

                <div>

                  <p class="MsoNormal"> </p>

                  <div>

                    <blockquote

                      style="margin-top:5.0pt;margin-bottom:5.0pt">

                      <div>

                        <p class="MsoNormal">On Jan 11, 2016, at 7:43

                          AM, Daniel Sanders <<a

                            moz-do-not-send="true"

                            href="mailto:Daniel.Sanders@imgtec.com"

                            target="_blank">Daniel.Sanders@imgtec.com</a>>

                          wrote:</p>

                      </div>

                      <p class="MsoNormal"> </p>

                      <div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Hi,</span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">It

                              was a comment by Tim that first made me

                              aware of it (see<span> </span><a

                                moz-do-not-send="true"

                                href="http://lists.llvm.org/pipermail/llvm-dev/2013-August/064714.html"

                                target="_blank"><span

                                  style="color:purple">http://lists.llvm.org/pipermail/llvm-dev/2013-August/064714.html</span></a><span> </span>but

                              I think he commented on one of my patches

                              before that).</span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                              asked about it on llvm-dev a couple weeks

                              later (<a moz-do-not-send="true"

                                href="http://lists.llvm.org/pipermail/llvm-dev/2013-August/064919.html"

                                target="_blank"><span

                                  style="color:purple">http://lists.llvm.org/pipermail/llvm-dev/2013-August/064919.html</span></a>)

                              highlighting the contradiction and was

                              told that 'no-op cast' referred to the

                              lack of math rather than a requirement

                              that zero instructions are used. It's

                              therefore my understanding that shuffling

                              the bits to preserve the load/store based

                              definition isn't considered to be changing

                              the bits.</span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                              think the main thing the current

                              definition is unclear on is whether it

                              refers to the bits in a physical machine

                              register or the bits in the LLVM-IR

                              virtual register. Most of the time these

                              two views are the same but this doesn't

                              quite work for big-endian MSA/NEON. For

                              example:</span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"

                            style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">%0

                              = bitcast <4 x i32> <i32 1, i32

                              2, i32 3, i32 4> to <2 x i64></span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"

                            style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">%0

                              = <2 x i64> <i64 (1 << 32)

                              | 2, i64 (3 << 32) | 4></span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">are

                              equivalent to each other in LLVM-IR terms

                              but the constants are physically laid out

                              in MSA registers as:</span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"

                            style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">0x00000004000000030000000200000001

                              # <4 x i32> <i32 1, i32 2, i32 3,

                              i32 4></span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"

                            style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">0x00000003000000040000000100000002

                              # <2 x i64> <i64 (1 << 32)

                              | 2, i64 (3 << 32) | 4></span></p>

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">and

                              we must therefore shuffle the bits to

                              preserve LLVM-IR's point of view.</span></p>
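                          <p class="MsoNormal">The two register images above can be reproduced with a small sketch, assuming lane 0 occupies the least-significant bits of the 128-bit register (which is what the hex values imply). <code>pack_lanes</code> is a hypothetical helper for illustration, not an LLVM API.</p>
<pre>
```python
def pack_lanes(lanes, lane_bits):
    """Build a register image with lane 0 in the least-significant bits."""
    mask = (1 << lane_bits) - 1
    reg = 0
    for index, value in enumerate(lanes):
        reg |= (value & mask) << (index * lane_bits)
    return reg


# The two constants that are equal from LLVM-IR's point of view.
v4i32 = pack_lanes([1, 2, 3, 4], 32)
v2i64 = pack_lanes([(1 << 32) | 2, (3 << 32) | 4], 64)

print(f"0x{v4i32:032x}")
print(f"0x{v2i64:032x}")
print("same register image:", v4i32 == v2i64)
```
</pre>
                          <p class="MsoNormal">The images differ by a swap of the two 32-bit halves inside each 64-bit lane, which is the shuffle the bitcast has to perform.</p>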

                        </div>

                        <div>

                          <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                        </div>

                        <div style="border:none;border-left:solid blue

                          1.5pt;padding:0cm 0cm 0cm 4.0pt">

                          <div>

                            <div style="border:none;border-top:solid

                              #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">

                              <div>

                                <p class="MsoNormal"><b><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                      lang="EN-US">From:</span></b><span><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                      lang="EN-US"> </span></span><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                    lang="EN-US">Quentin Colombet [<a

                                      moz-do-not-send="true"

                                      href="mailto:qcolombet@apple.com"

                                      target="_blank">mailto:qcolombet@apple.com</a>]<span> </span><br>

                                    <b>Sent:</b><span> </span>07 January

                                    2016 19:58<br>

                                    <b>To:</b><span> </span>Daniel

                                    Sanders<br>

                                    <b>Cc:</b><span> </span>llvm-dev<br>

                                    <b>Subject:</b><span> </span>Re:

                                    [llvm-dev] [GlobalISel] A Proposal

                                    for global instruction selection</span></p>

                              </div>

                            </div>

                          </div>

                          <div>

                            <p class="MsoNormal"> </p>

                          </div>

                          <div>

                            <p class="MsoNormal">Hi Daniel,</p>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal"> </p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">I had a quick look at

                                the language reference for bitcast and I

                                have a different reading than what you

                                were pointing out.</p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">Indeed, my takeaway

                                is:</p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal"><span

                                  style="font-size:10.5pt;font-family:"Lucida

                                  Sans

                                  Unicode","sans-serif";background:white">"It

                                  is<span> </span><b>always a </b></span><em><b><span

                                      style="font-size:10.5pt;font-family:"Lucida

                                      Sans

                                      Unicode","sans-serif"">no-op

                                      cast</span></b></em><span

                                  style="font-size:10.5pt;font-family:"Lucida

                                  Sans

                                  Unicode","sans-serif";background:white"> because

                                  no bits change with this conversion."</span></p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal"> </p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">In other words,

                                deleting all bitcast instructions should

                                be fine.</p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal"> </p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">My understanding of

                                the quote you’ve highlighted is that it

                                tells C programmers that this is like a

                                memcpy, not a cast :).</p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal"> </p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">Cheers,</p>

                            </div>

                          </div>

                          <div>

                            <div>

                              <p class="MsoNormal">-Quentin</p>

                            </div>

                            <div>

                              <blockquote

                                style="margin-top:5.0pt;margin-bottom:5.0pt">

                                <div>

                                  <div>

                                    <p class="MsoNormal">On Nov 20,

                                      2015, at 6:53 AM, Daniel Sanders

                                      <<a moz-do-not-send="true"

                                        href="mailto:Daniel.Sanders@imgtec.com"

                                        target="_blank"><span

                                          style="color:purple">Daniel.Sanders@imgtec.com</span></a>>

                                      wrote:</p>

                                  </div>

                                </div>

                                <div>

                                  <p class="MsoNormal"> </p>

                                </div>

                                <div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Hi,</span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I

                                          haven't had a chance to read all

                                          of this yet, but one minor

                                          thing occurred to me during

                                          your presentation that I want

                                          to mention. At one point you

                                          mentioned deleting all the

                                          bitcast instructions since

                                          they're equivalent to nops but

                                          this isn't always true.</span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">The<span> </span><a

                                            moz-do-not-send="true"

                                            href="http://llvm.org/docs/LangRef.html"

                                            target="_blank"><span

                                              style="color:purple">http://llvm.org/docs/LangRef.html</span></a><span> </span>definition

                                          of the bitcast instruction

                                          includes this sentence:</span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"

                                        style="text-indent:36.0pt"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">The

                                          conversion is done as if the

                                          value had been stored to

                                          memory and read back as type

                                          ty2.</span></p>

                                    </div>

                                  </div>

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif"">For

                                          big-endian MSA, this is

                                          equivalent to a shuffling of

                                          the bits in the register

                                          because endianness only

                                          changes the byte order within

                                          each element. The order of the

                                          elements is unaffected by

                                          endianness. IIRC, big-endian

                                          NEON is the same way.</span></p>

                                    </div>

                                  </div>
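                                  <div>
                                    <p class="MsoNormal">As a rough illustration of the “stored to memory and read back” rule quoted above, the Python sketch below models a bitcast from &lt;4 x i32&gt; to &lt;2 x i64&gt; as a store followed by a reload with a different element size. The helper name bitcast_via_memory and the sample values are hypothetical. It models only the memory view, not how a particular vector unit lays out lanes in a register, so it shows why the result depends on endianness rather than what any given target does.</p>
<pre>
```python
def bitcast_via_memory(values, src_bytes, dst_bytes, byteorder):
    """Model a bitcast as: store every element, then reload with a new element size."""
    # Elements are written in order of increasing address; byteorder only
    # affects the byte order inside each element.
    memory = b"".join(v.to_bytes(src_bytes, byteorder) for v in values)
    return [
        int.from_bytes(memory[i:i + dst_bytes], byteorder)
        for i in range(0, len(memory), dst_bytes)
    ]


# Four 32-bit lanes, reinterpreted as two 64-bit lanes.
v4i32 = [0x00000001, 0x00000002, 0x00000003, 0x00000004]

little = bitcast_via_memory(v4i32, 4, 8, "little")
big = bitcast_via_memory(v4i32, 4, 8, "big")

print([hex(x) for x in little])
print([hex(x) for x in big])
```
</pre>
                                    <p class="MsoNormal">In the little-endian case the first 32-bit lane ends up in the low half of the first 64-bit lane; in the big-endian case it ends up in the high half. If the register keeps its lanes in the same order for both, the big-endian bitcast cannot be a no-op.</p>
                                  </div>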

                                  <div>

                                    <div>

                                      <p class="MsoNormal"><span

style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span></p>

                                    </div>

                                  </div>

                                  <div

                                    style="border:none;border-left:solid

                                    blue 1.5pt;padding:0cm 0cm 0cm

                                    4.0pt">

                                    <div>

                                      <div

                                        style="border:none;border-top:solid

                                        #b5c4df 1.0pt;padding:3.0pt 0cm

                                        0cm 0cm">

                                        <div>

                                          <div>

                                            <p class="MsoNormal"><b><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                                  lang="EN-US">From:</span></b><span><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                                  lang="EN-US"> </span></span><span

style="font-size:10.0pt;font-family:"Tahoma","sans-serif""

                                                lang="EN-US">llvm-dev [<a

                                                  moz-do-not-send="true"

href="mailto:llvm-dev-bounces@lists.llvm.org" target="_blank"><span

                                                    style="color:purple"><a class="moz-txt-link-freetext" href="mailto:llvm-dev-bounces@lists.llvm.org">mailto:llvm-dev-bounces@lists.llvm.org</a></span></a>]<span> </span><b>On

                                                  Behalf Of<span> </span></b>Quentin

                                                Colombet via llvm-dev<br>

                                                <b>Sent:</b><span> </span>18

                                                November 2015 19:27<br>

                                                <b>To:</b><span> </span>llvm-dev<br>

                                                <b>Subject:</b><span> </span>[llvm-dev]

                                                [GlobalISel] A Proposal

                                                for global instruction

                                                selection</span></p>

                                          </div>

                                        </div>

                                      </div>

                                    </div>

                                    <div>

                                      <div>

                                        <p class="MsoNormal"> </p>

                                      </div>

                                    </div>

                                    <div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal">Hi,<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>With this email, I

                                              would like to kick-off the

                                              development for the next

                                              instruction selector that

                                              I described during the

                                              last LLVM Dev’ Meeting.<br>

                                              For the motivations, see

                                              Jakob’s proposal (<a

                                                moz-do-not-send="true"

href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064727.html"

                                                target="_blank"><span

                                                  style="color:purple"><a class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064727.html">http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064727.html</a></span></a>)

                                              and for the proposal, see

                                              the slides (Keynote: <a

                                                moz-do-not-send="true"

href="http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.key?view=co"

                                                target="_blank"><span

                                                  style="color:purple"><a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.key?view=co">http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.key?view=co</a></span></a> or

                                              PDF: <a

                                                moz-do-not-send="true"

href="http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.pdf?revision=252430&view=co"

                                                target="_blank"><span

                                                  style="color:purple"><a class="moz-txt-link-freetext" href="http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.pdf?revision=252430&view=co">http://llvm.org/viewvc/llvm-project/www/trunk/devmtg/2015-10/slides/Colombet-GlobalInstructionSelection.pdf?revision=252430&view=co</a></span></a>)

                                              or the talk (<a

                                                moz-do-not-send="true"

href="https://www.youtube.com/watch?v=F6GGbYtae3g&list=PL_R5A0lGi1AA4Lv2bBFSwhgDaHvvpVU21&index=2"

                                                target="_blank"><span

                                                  style="color:purple"><a class="moz-txt-link-freetext" href="https://www.youtube.com/watch?v=F6GGbYtae3g&list=PL_R5A0lGi1AA4Lv2bBFSwhgDaHvvpVU21&index=2">https://www.youtube.com/watch?v=F6GGbYtae3g&list=PL_R5A0lGi1AA4Lv2bBFSwhgDaHvvpVU21&index=2</a></span></a>).</p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal"><br>

                                              TL;DR This is happening

                                              now, feedback invited!<br>

                                              <br>

                                              *** Context ***<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>During the last

                                              LLVM Dev’ Meeting, I

                                              presented a proposal for

                                              the next instruction

                                              selector, GlobalISel. The

                                              proposal is basically

                                              summarized in “High Level

                                              Prototype Design” and

                                              “Roadmap”. (If you want

                                              further details, feel free

                                              to reach out to me.)<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>The first step of

                                              the development plan is to

                                              prototype the new

                                              framework on open source.

                                              The idea is to <b>start

                                                prototyping now(!)</b> and

                                              have the discussion

                                              ongoing in parallel. The

                                              reason for this approach is

                                              to have code that can be

                                              used to inform those

                                              discussions, e.g., by

                                              collecting data and trying

                                              different design

                                              approaches. Regarding the

                                              discussion, I have listed

                                              a few points where your

                                              feedback would be

                                              particularly appreciated

                                              (see Feedback Invite).</p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal"><span

                                                style="color:#00afcd"><br>

                                              </span>Also, as I have

                                              mentioned in my talk, some

                                              issues are controversial

                                              but I expect them to be

                                              resolved during prototype

                                              development. Specifically,

                                              these concern aspects of

                                              legalization (should parts

                                              of it be done at the LLVM

                                              IR level or all at the MI

                                              level?) and code re-use

                                              for the instruction combiner.

                                              Please feel free to bring

                                              up your specific concern

                                              as I move along with the

                                              development plan.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>I expect the design

                                              to evolve with our

                                              experimental findings and

                                              your feedback

                                              and contributions.<br>

                                              Nonetheless, we expect to

                                              nail down some design

                                              decisions once and for all

                                              as the prototype

                                              progresses. I have

                                              highlighted them with

                                              the following pattern <b>[final]</b>.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                                <br>

                                                <br>

                                              </span>*** Feedback Invite

                                              ***<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>If you follow and

                                              support this work you need

                                              to be aware of three

                                              things and I am eager to

                                              hear your feedback and

                                              thoughts about them: the

                                              overall goals of Global

                                              ISel, the goals of the

                                              prototype, and the impact

                                              of the prototype work on

                                              backend design. <br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>In the section

                                              “Goals”, I defined

                                              (repeated for people that

                                              saw the talk) the goals

                                              for the Global ISel

                                              design.<br>

                                              - Do you see anything

                                              missing?<br>

                                              - Do you see something

                                              that should not be there? <br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>The prototype will

                                              answer critical design

                                              questions (see “Design

                                              Questions the Prototype

                                              Addresses at the End of

                                              M1” for examples) before

                                              the actual design of Global

                                              ISel is finalized, but it

                                              cannot cover everything.<br>

                                              Specifically we will <b>*not*</b> look

                                              into improving TableGen or

                                              reusing InstCombine (see

                                              “Proposed Approach” for the

                                              rationale). Please let me

                                              know if you see any issue

                                              with that.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>There is also basic

                                              ground work needed to

                                              prepare for Global ISel

                                              and I need to extend the

                                              core MachineInstr-level

                                              APIs as explained during

                                              the talk. For this, I

                                              prepared sketches of

                                              patches to illustrate them

                                              and describe the details

                                              in the “Implications”

                                              section below. Please have

                                              a look at the patches to

                                              have a better idea of the

                                              expected impact.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>If there is

                                              anything else you want to

                                              discuss related to Global

                                              ISel, feel free to reach out to

                                              me. In particular, several

                                              people expressed their

                                              interest during the LLVM

                                              Dev Meeting in

                                              contributing to the

                                              project. Let me know what

                                              your area of interest is,

                                              so that we can coordinate

                                              our efforts.<br>

                                              Anyhow, please add

                                              [GlobalISel] in the

                                              subject line to help

                                              categorize the emails.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                                <br>

                                                <br>

                                              </span>*** Goals ***<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>The high level

                                              goals of the new

                                              instruction selector are:<br>

                                              - Global instruction

                                              selector.<br>

                                              - Fast instruction

                                              selector.<br>

                                              - Shared code path for

                                              fast and good instruction

                                              selection.<br>

                                              - IR that represents ISA

                                              concepts better.<br>

                                              - More flexible

                                              instruction selector.<br>

                                              - Easier to

                                              maintain/understand

                                              framework, in particular

                                              legalization.<br>

                                              - Self contained machine

                                              representation, no back

                                              links to LLVM IR.<br>

                                              - No change to LLVM IR.<br>

                                              <span

                                                style="color:#5856d6"><br>

                                              </span>Note:  The goals

                                              are common to all targets.

                                              In particular, we do not

                                              intend to work on

                                              target-specific features for the

                                              prototype.<br>

                                              The bottom line is please

                                              make sure those goals are

                                              compatible with what you

                                              want to achieve for your

                                              target, even if your

                                              requirement does not get

                                              listed here.<br>

                                              <br>

                                              <span

                                                style="color:#12c00e"><br>

                                                <br>

                                              </span>*** Proposed

                                              Approach ***<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>In this section, I

                                              describe the approach I

                                              plan to pursue in the

                                              prototype and the roadmap

                                              to get there. The final

                                              design will flow out of

                                              it.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>For this prototype,

                                              we purposely exclude any

                                              work to improve or use

                                              TableGen or InstCombine <b>[final].</b> We

                                              will keep in mind however,

                                              that some of the C++ code

                                              we write will be

                                              table-generated at some

                                              point.<br>

                                              The rationale is that we do

                                              not want to lay down a new

                                              TableGen/InstCombine

                                              infrastructure before

                                              being able to work on the

                                              ISel framework itself.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>The prototype

                                              vehicle will be <b>AArch64</b>.

                                              None of the changes for

                                              GlobalISel will negatively

                                              impact the existing ISel.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                                <br>

                                              </span>** High Level

                                              Prototype Design **<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>As shown in the

                                              talk, the expected

                                              pipeline for the prototype

                                              is:<br>

                                              <b>LLVM IR </b>->

                                              IRTranslator -> <b>Generic (G)

                                                MachineInstr</b> ->

                                              Legalizer ->

                                              RegBankSelect -> Select

                                              -> <b>MachineInstr</b><br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>Where:<br>

                                              - Terms in <b>bold</b> are

                                              intermediate

                                              representations.<br>

                                              -  Generic MachineInstrs

                                              are machine instructions

                                              with a generic opcode,

                                              e.g., ADD, COPY.</p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal">-

                                              IRTranslator: Translate

                                              LLVM IR to (G)

                                              MachineInstr.<br>

                                              - Legalizer: Legalize

                                              illegal (G) MachineInstr

                                              to legal (G) MachineInstr.<br>

                                              - RegBankSelect: Assign a

                                              register bank to each

                                              virtual register that so far

                                              only has a size.<br>

                                              - Select: Translate the

                                              remaining (G) MachineInstr

                                              to MachineInstr.<br>

                                              <br>

                                              <span

                                                style="color:#00afcd"><br>

                                                <br>

                                              </span>** Implications **<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>As part of the

                                              bring-up of the prototype,

                                              we need to extend some of

                                              the core

                                              MachineInstr-level APIs:<br>

                                                - Need to remember

                                              FastMath flags for each

                                              MachineInstr.<br>

                                                - Need to know the type

                                              of each MachineInstr. We

                                              don’t want ADD8, ADD16,

                                              etc.<br>

                                                - Extend the

                                              MachineRegisterInfo to

                                              support size as well as

                                              register classes for

                                              virtual registers.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>I have sketched the

                                              changes in the attached

                                              patches to help picture

                                              how the changes would

                                              impact the existing APIs.</p>

                                          </div>

                                        </div>

                                      </div>
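                                      <div>
                                        <p class="MsoNormal">The pipeline and the API implications above can be pictured with a toy model. The Python sketch below is only an illustration under stated assumptions: none of the class or function names are LLVM APIs, the “G_” prefix and the “TGT_ADD32”/“TGT_ADD64” opcodes are made up, and the toy target is assumed to support only 32- and 64-bit integer adds in a single register bank. It shows an instruction carrying its type (so no ADD8/ADD16 opcodes are needed) and a virtual register that starts with a size and only later gets a register bank.</p>
<pre>
```python
from dataclasses import dataclass, field

LEGAL_SIZES = (32, 64)  # assumption: the toy target only adds 32- and 64-bit values


@dataclass
class VReg:
    name: str
    size: int       # in bits; known as soon as the IRTranslator runs
    bank: str = ""  # filled in later by RegBankSelect


@dataclass
class Instr:
    opcode: str     # generic (e.g. "G_ADD") until Select runs
    size: int       # the type lives on the instruction, not in the opcode
    dst: VReg
    srcs: list = field(default_factory=list)


def ir_translator(ir_ops):
    """LLVM IR to generic MachineInstr: one generic instruction per IR operation."""
    vregs = {}

    def vreg(name, size):
        return vregs.setdefault(name, VReg(name, size))

    return [
        Instr("G_" + op.upper(), size, vreg(dst, size), [vreg(s, size) for s in srcs])
        for op, size, dst, srcs in ir_ops
    ]


def legalizer(instrs):
    """Widen instructions of unsupported size to the smallest legal size."""
    for ins in instrs:
        if ins.size not in LEGAL_SIZES:
            ins.size = min(s for s in LEGAL_SIZES if s >= ins.size)
            for reg in [ins.dst] + ins.srcs:
                reg.size = ins.size
    return instrs


def reg_bank_select(instrs):
    """Give every sized virtual register a register bank."""
    for ins in instrs:
        for reg in [ins.dst] + ins.srcs:
            reg.bank = "GPR"  # assumption: a single integer bank
    return instrs


def select(instrs):
    """Replace the remaining generic opcodes with (made-up) target opcodes."""
    table = {("G_ADD", 32): "TGT_ADD32", ("G_ADD", 64): "TGT_ADD64"}
    for ins in instrs:
        ins.opcode = table[(ins.opcode, ins.size)]
    return instrs


def run_pipeline(ir_ops):
    return select(reg_bank_select(legalizer(ir_translator(ir_ops))))


result = run_pipeline([
    ("add", 8, "%c", ["%a", "%b"]),
    ("add", 64, "%z", ["%x", "%y"]),
])
print([(i.opcode, i.size, i.dst.bank) for i in result])
```
</pre>
                                        <p class="MsoNormal">Here the 8-bit add is widened by the legalizer before a bank is assigned, so the final selection step only ever sees legal sizes.</p>
                                      </div>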

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal"> </p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal">Note: I

                                              do not intend to commit

                                              those changes as they are.

                                              They will go through the usual

                                              review process in due

                                              time.</p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal"><br>

                                              The patches contain “//

                                              ***”-like comments that

                                              give a rough explanation

                                              of why those changes are

                                              needed w.r.t. the goals.<br>

                                              The order of the patches

                                              could be modified since

                                              the dependencies between

                                              those are not sequential.

                                              Anyhow, here are the

                                              patches:<br>

                                              1. Introduce (some of) the

                                              generic opcodes.<br>

                                              2. Make MachineFunction

                                              more independent of LLVM

                                              IR to eventually be able

                                              to delete the LLVM IR

                                              instance from the memory.<br>

                                              3. Extend MachineInstr to

                                              represent additional

                                              information attached to

                                              generic opcodes.<br>

                                              4. Teach

                                              MachineRegisterInfo about

                                              size for virtual

                                              registers.<br>

                                              5. Introduce a helper

                                              class to build

                                              MachineInstr related

                                              objects.<br>

                                              6. Add new target hooks to

                                              lower the ABI directly to

                                              MachineInstr.<br>

                                              7. Introduce the

                                              IRTranslator pass.<br>

                                              <br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>** Roadmap for the

                                              Prototype **<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>We plan to split

                                              the prototype into three

                                              main milestones:<br>

                                              1. Translation: LLVM IR to

                                              (G) MachineInstr

                                              translation.<br>

                                              2. Basic selector: Legal

                                              LLVM IR to target-specific

                                              MachineInstr.<br>

                                              3. Simple legalization:

                                              Support scalar type

                                              legalization and some

                                              vector instructions.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>Notes:<br>

                                              - For #1, we will not

                                              support any fancy

                                              instructions like landing

                                              pad or switch.<br>

                                              - Each milestone should

                                              take about 3-4 months.</p>

                                          </div>

                                        </div>

                                      </div>

                                      <div>

                                        <div>

                                          <div>

                                            <p class="MsoNormal">- At

                                              the end of #2, we would

                                              have a FastISel-like

                                              selector.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>Each milestone will

                                              be detailed right before

                                              starting it. The rationale

                                              is that we want to

                                              accommodate what we

                                              discovered with the

                                              prototype for the next

                                              milestone. In other words,

                                              in this email, <b>I only

                                                describe the first

                                                milestone</b> in detail

                                              and I will give more

                                              details on the next

                                              milestone shortly before

                                              we start it and so on. For

                                              your information, here is

                                              the remainder of the

                                              intended roadmap for the <b>full</b> project:<br>

                                              4. Productization: Clean

                                              up implementation,

                                              stabilize the APIs.<br>

                                              5. Complex legalization:

                                              Extend legalization

                                              support to everything

                                              missing.<br>

                                              6. Completeness: Fill the

                                              blanks, e.g., landing pad.<br>

                                              7. Clean-up and

                                              performance: Add the

                                              necessary bits to be at

                                              parity with or beat
 
                                              SelectionDAG-generated

                                              code.<br>

                                              8. Transition: Document

                                              how to switch, provide

                                              tools to help.<br>

                                              <span

                                                style="color:#00afcd"><br>

                                                <br>

                                              </span>** Milestone 1 **<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>The first phase is

                                              focused on the

                                              IRTranslator pass.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>The IRTranslator is

                                              responsible for

                                              translating the LLVM IR

                                              into Generic MachineInstr.

                                              The IRTranslator pass uses

                                              some target hooks

                                              to perform the ABI

                                              lowering. We can either

                                              define a new API for them,

                                              e.g., ABILoweringInfo, or

                                              extend the existing

                                              TargetLowering.<br>

                                              Moreover, the prototype

                                              will focus on simple

                                              instructions, i.e., we will

                                              not support switch or

                                              landing pad for this

                                              iteration.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>At the end of M1,

                                              the prototype will not be

                                              able to produce code,

                                              since we would only have

                                              the beginning of the

                                              Global ISel pipeline.

                                              Instead, we will test the

                                              IRTranslator on the

                                              generic output that is

                                              produced from the tested

                                              IR.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>* Design Decisions

                                              *<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>- The IRTranslator

                                              is a final class. Its

                                              purpose is to move away

                                              from LLVM IR to the

                                              MachineInstr world <b>[final]</b>.<br>

                                              - Lower the ABI as part of

                                              the translation process <b>[final]</b>.<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>* Design Questions

                                              the Prototype Addresses at

                                              the End of M1 *<br>

                                              <span

                                                style="color:#12c00e"><br>

                                              </span>- Handling of

                                              aggregate types during the

                                              translation.<br>

                                              - Lowering of switches.<br>

                                              - What about Module pass

                                              for Machine pass?<br>

                                              - Introduce new APIs to

                                              have a clearer separation

                                              between:<br>

                                                - Legalization

                                              (setOperationAction, etc.)<br>

                                                - Cost/Combine related

                                              (isXXXFree, etc.)<br>

                                                - Lowering related

                                              (LowerFormal, etc.)<br>

                                              - What is the contract

                                              with the backends? Is it

                                              still “should be able to

                                              select any valid LLVM IR”?<br>

                                              <span

                                                style="color:#00afcd"><br>

                                              </span>Thanks,</p>

                                          </div>

                                        </div>

                                        <div>

                                          <div>

                                            <div>

                                              <div>

                                                <div>

                                                  <div>

                                                    <div>

                                                      <div>

                                                        <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <div>

                                                          <p

                                                          class="MsoNormal">-Quentin</p>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                        </div>

                                                      </div>

                                                    </div>

                                                  </div>

                                                </div>

                                              </div>

                                            </div>

                                          </div>

                                        </div>

                                      </div>

                                    </div>

                                  </div>

                                </div>

                              </blockquote>

                            </div>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </div>

                  <p class="MsoNormal"> </p>

                </div>

              </div>

            </div>

          </div>

          _______________________________________________<br>

          LLVM Developers mailing list<br>

          <a moz-do-not-send="true"

            href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

          <a moz-do-not-send="true"

            href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"

            rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

        </blockquote>

      </div>

      <br>


    </blockquote>

    <br>

  </body>

</html>