<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    <div class="moz-forward-container">


      <div class="moz-cite-prefix">Currently it will always spill /

        restore the whole vreg but only spilling the parts that are

        actually live would be a nice addition in the future.<br>

        <br>

        Looking at r192119: if “mtlo” writes to $LO and sets $HI to an

        unpredictable value, then it should just have an additional

        (dead) def operand for $hi, shouldn’t it?<br>

        <br>

        Greetings<br>

            Matthias<br>

        <br>

        On 10/8/13, 11:03 AM, Akira Hatanaka wrote:<br>

      </div>

      <blockquote

cite="mid:CAB297QxDzp4kV6okq_KEMzLThU0nu0tR_u=KgDBXDwNCyN_Luw@mail.gmail.com"

        type="cite">

        <div dir="ltr">

          <div>Hi,</div>

          <div><br>

          </div>

          <div>I have a question about the way sub-registers are spilled

            and restored that is related to the changes I made in

            r192119.</div>

          <div><br>

          </div>

          <div>Suppose I have the following piece of code with four

            instructions. %vreg0 and %vreg1 consist of two sub-registers

            indexed by sub_lo and sub_hi.</div>

          <div><br>

          </div>

          instr0 %vreg0<def>

          <div>instr1 %vreg1:sub_lo<def<span

              style="font-family:arial,sans-serif;font-size:13px">,read-undef</span>><br>

          </div>

          <div>instr2 %vreg0<use></div>

          <div>

            <div>instr3 %vreg1:sub_hi<def></div>

          </div>

          <div><br>

          </div>

          <div>If the register allocator decides to insert spill and restore

            instructions for %vreg0, will it spill the whole register

            that includes sub-registers lo and hi?</div>

          <div><br>

          </div>

          <div>instr0 %vreg0<def></div>

          <div>spill0 %vreg0<br>

            <div>instr1 %vreg1:sub_lo<def<span

                style="font-family:arial,sans-serif;font-size:13px">,read-undef</span>><br>

            </div>

            spill1 %vreg1:sub_lo<br>

            restore0 %vreg0<br>

            <div>instr2 %vreg0<use></div>

            restore1 %vreg1:sub_lo<br>

            <div>instr3 %vreg1:sub_hi<def></div>

          </div>

          <div><br>

          </div>

          <div>Or will it spill just the lo sub-register?</div>

          <div><br>

          </div>

          <div>

            <div>instr0 %vreg0<def></div>

            <div>spill0 %vreg0:sub_lo<br>

              <div>instr1 %vreg1:sub_lo<def<span

                  style="font-family:arial,sans-serif;font-size:13px">,read-undef</span>><br>

              </div>

              spill1 %vreg1:sub_lo<br>

              restore0 %vreg0:sub_lo<br>

              <div>instr2 %vreg0<use></div>

              restore1 %vreg1:sub_lo<br>

              <div>instr3 %vreg1:sub_hi<def></div>

            </div>

          </div>

          <div><br>

          </div>

          <div>If it spills the whole register (both sub-registers lo

            and hi), the changes I made should be fine. Otherwise, I

            will have to find another way to prevent the problems I

            mentioned in r192119's commit log.</div>

          <div><br>

          </div>

        </div>

        <div class="gmail_extra"><br>

          <br>

          <div class="gmail_quote">On Mon, Oct 7, 2013 at 1:11 PM,

            Matthias Braun <span dir="ltr"><<a

                moz-do-not-send="true" href="mailto:matze@braunis.de"

                target="_blank">matze@braunis.de</a>></span> wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">I've

              been working on patches to improve subregister liveness

              tracking on llvm and I wanted to inform the llvm community

              about the overall design/motivation for them. I will send

              the patches to llvm-commits later today.<br>

              <br>

              Greetings<br>

                  Matthias Braun<br>

              <br>

              <br>

              Subregisters in llvm<br>

              ====================<br>

              <br>

              Some targets can access registers in different ways

              resulting in wider or<br>

              narrower accesses. For example on ARM NEON one of the

              single precision<br>

              floating point registers is called 'S0'. You may also

              access 'D0' on arm which<br>

              is the combination of 'S0' and 'S1' and can store a double

              precision number or<br>

              2 single precision floats. 'Q0' is the combination of

              'S0', 'S1', 'S2' and<br>

              'S3' (or 'D0' and 'D1') and so on.<br>
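              <br>
              As a rough illustration of the aliasing described above, the
              following Python sketch models each 'S' register as one unit
              and derives which units a 'D' or 'Q' register covers. The
              helper names s_units and alias are hypothetical and not part
              of llvm, and the numbering rule (Dn covers S(2n) and S(2n+1),
              Qn covers S(4n) to S(4n+3)) is an assumption extrapolated
              from the S0/D0/Q0 example.<br>
<pre>
```python
# Hypothetical model of the aliasing described above: Sn is one unit,
# Dn covers S(2n) and S(2n+1), Qn covers S(4n) .. S(4n+3).
WIDTH = {"S": 1, "D": 2, "Q": 4}


def s_units(reg):
    """Return the set of 'S' register numbers covered by reg (e.g. 'D0')."""
    width = WIDTH[reg[0]]
    n = int(reg[1:])
    return set(range(width * n, width * (n + 1)))


def alias(a, b):
    """Two registers alias if they share at least one 'S' unit."""
    return bool(s_units(a) & s_units(b))


print(sorted(s_units("Q0")))  # [0, 1, 2, 3]
print(alias("D1", "Q0"))      # True
print(alias("S4", "Q0"))      # False
```
</pre>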

              <br>

              Before register allocation llvm machine code accesses

              values through virtual<br>

              registers; these get assigned to physical registers later.

              Each virtual<br>

              register has an assigned register class which is a set of

              physical registers.<br>

              So for example on ARM you have a register class containing

              all the 'SXX'<br>

              registers and another one containing all the 'DXX'

              registers, ...<br>

              <br>

              But sometimes you want to mix narrow and wide accesses to

              values. Like loading<br>

              the 'D0' register but later reading the 'S0' and 'S1'

              components separately.<br>

              This is modeled with subregister operands which specify

              that only parts of a<br>

              wider value are accessed. For example the register class

              of the 'DXX'<br>

              registers supports subregisters called 'ssub_0' and

              'ssub_1' which would<br>

              result in 'S4' and 'S5' getting used if 'D2' is assigned

              to the virtual<br>

              register later.<br>
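              <br>
              The D2 example above can be sketched as a small lookup. This
              is only an illustration: resolve_subreg is a hypothetical
              helper, not an llvm API, and it assumes that Dn is made up of
              S(2n) and S(2n+1), which matches the D0 and D2 cases
              mentioned in this mail.<br>
<pre>
```python
# Hypothetical helper, not an llvm API: resolve a subregister index of a
# 'D' register to the 'S' register it names, assuming Dn = S(2n) + S(2n+1).
SUBREG_OFFSET = {"ssub_0": 0, "ssub_1": 1}


def resolve_subreg(dreg, subidx):
    """Return the 'S' register selected by subidx inside dreg."""
    if not dreg.startswith("D"):
        raise ValueError("expected a 'D' register, got " + dreg)
    n = int(dreg[1:])
    return "S" + str(2 * n + SUBREG_OFFSET[subidx])


print(resolve_subreg("D2", "ssub_0"))  # S4
print(resolve_subreg("D2", "ssub_1"))  # S5
```
</pre>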

              <br>

              Typical operations are decomposing wider values or

              composing wide values with<br>

              multiple smaller defs:<br>

              <br>

              Decomposing:<br>

              %vreg1<def> = produce a 'D' value<br>

                          = use 'S' value %vreg1:ssub_0<br>

                          = use 'S' value %vreg1:ssub_1<br>

              <br>

              Composing:<br>

              %vreg1:ssub_0<def,read-undef> = produce an 'S' value<br>

              %vreg1:ssub_1<def>            = produce an 'S' value<br>

                         = use a 'D' value %vreg1<br>

              <br>

              Problems / Motivation<br>

              =====================<br>

              <br>

              Currently the llvm register allocator tracks liveness for

              whole virtual<br>

              registers. This can lead to suboptimal code:<br>

              <br>

              %vreg0:ssub_0<def,read-undef> = produce an 'S' value<br>

              %vreg0:ssub_1<def> = produce an 'S' value<br>

                     = use a 'D' value %vreg0<br>

              %vreg1 = produce an 'S' value<br>

                     = use an 'S' value %vreg1<br>

                     = use an 'S' value %vreg0:ssub_0<br>

              <br>

              The current code will realize that vreg0 and vreg1

              interfere and assign them<br>

              to different registers like D0+S2 aka S0+S1+S2; while in

              reality after the<br>

              full use of %vreg0 only %vreg0:ssub_0 must remain in a

              register while the<br>

              subregister used for %vreg0:ssub_1 can be reassigned to

              %vreg1. An ideal<br>

              assignment would be D0+S1 aka S0+S1.<br>
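              <br>
              To make the difference concrete, here is a simplified
              Python sketch of the interference check for the six
              instructions above, numbered 0 to 5. It assumes each live
              range is a single half-open interval from the def to the
              last use, which is much coarser than what the register
              allocator really does; the function name overlaps is made up
              for this example.<br>
<pre>
```python
# Simplified model: a live range is one half-open interval [def, last_use)
# over instruction positions 0..5 of the example above.
def overlaps(a, b):
    """True if the half-open intervals a and b share at least one position."""
    return max(a[0], b[0]) < min(a[1], b[1])


# Lane-level ranges of %vreg0 and the range of %vreg1.
vreg0_lanes = {"ssub_0": (0, 5), "ssub_1": (1, 2)}
vreg1 = (3, 4)

# Whole-register tracking: %vreg0 is live from its first def to its last use.
vreg0_whole = (min(r[0] for r in vreg0_lanes.values()),
               max(r[1] for r in vreg0_lanes.values()))

whole_interference = overlaps(vreg0_whole, vreg1)
lane_interference = {lane: overlaps(r, vreg1)
                     for lane, r in vreg0_lanes.items()}

print(whole_interference)  # True
print(lane_interference)   # {'ssub_0': True, 'ssub_1': False}
```
</pre>
              In this model only ssub_0 overlaps with %vreg1, so the
              register holding ssub_1 is free to be reused, which is the
              D0+S1 assignment mentioned above.<br>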

              <br>

              An even more pressing problem is artificial dependencies

              in the schedule<br>

              graph. This is a side effect of llvm's live range

              information being represented<br>

              in a static single assignment like fashion: Every

              definition of a vreg starts<br>

              a new interval with a new value number. This means that

              partial register<br>

              writes must be modeled as an implicit use of the unwritten

              parts of a register<br>

              and force the creation of a new value number. This in turn

              leads to artificial<br>

              dependencies in the schedule graph for code like the

              following where all defs<br>

              should be independent:<br>

              <br>

              %vreg0:ssub_0<def,read-undef> = produce an 'S' value<br>

              %vreg0:ssub_1<def>            = produce an 'S' value<br>

              %vreg0:ssub_2<def>            = produce an 'S' value<br>

              %vreg0:ssub_3<def>            = produce an 'S' value<br>
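              <br>
              The effect on the four defs above can be sketched as
              follows. This is a toy model, not the scheduler's actual
              dependency analysis: each def is reduced to a lane name and
              a read-undef flag, and the two functions (both hypothetical)
              list the dependency edges that whole-register tracking and
              per-lane tracking would produce.<br>
<pre>
```python
# Each def is (lane, read_undef). Hypothetical model of the four defs above.
defs = [("ssub_0", True), ("ssub_1", False),
        ("ssub_2", False), ("ssub_3", False)]


def deps_whole_vreg(defs):
    """Whole-register tracking: a partial def without read-undef implicitly
    reads the previous value of the vreg, so it depends on the previous def."""
    edges = []
    for i, (_, read_undef) in enumerate(defs):
        if i > 0 and not read_undef:
            edges.append((i - 1, i))
    return edges


def deps_per_lane(defs):
    """Lane tracking: a def only depends on an earlier def of the same lane."""
    edges, last_def = [], {}
    for i, (lane, _) in enumerate(defs):
        if lane in last_def:
            edges.append((last_def[lane], i))
        last_def[lane] = i
    return edges


print(deps_whole_vreg(defs))  # [(0, 1), (1, 2), (2, 3)]
print(deps_per_lane(defs))    # []
```
</pre>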

              <br>

              <br>

              Subregister liveness tracking<br>

              =============================<br>

              <br>

              I developed a set of patches which enable liveness

              tracking on the subregister<br>

              level, to overcome the problems mentioned above. After

              these changes you can<br>

              have separate live ranges for subregisters of a virtual

              register. With these<br>

              patches the following code:<br>

              <br>

                16B  %vreg0:ssub_0<def,read-undef> = ...<br>

                32B  %vreg0:ssub_1<def>            = ...<br>

                48B               = %vreg0<br>

                64B               = %vreg0:ssub_0<br>

                80B  %vreg0 = ...<br>

                96B         = %vreg0:ssub_1<br>

              <br>

              will be represented as the following live range(s):<br>

              <br>

                Common LiveRange: [16r,32r)[32r,64r),[80r,96r)<br>

                SubRange with Mask 0x0004 (=ssub_0): [16r,64r)[80r,80d)<br>

                SubRange with Mask 0x0008 (=ssub_1): [32r,48r)[80r,96r)<br>

              <br>

              Patches/Changes:<br>

              * Move live range management code in the LiveInterval

              class to a new<br>

                class LiveRange, move the previous LiveRange class

              (which was just a single<br>

                interval inside a live range) to LiveRange::Segment.<br>

                LiveInterval is made a subclass of LiveRange, other code

              paths like<br>

                register units liveness use LiveRange instead of

              LiveInterval now.<br>

              * Introduce a linked list of SubRange objects to the

              LiveInterval class.<br>

                A SubRange is a subclass of LiveRange and contains a

              LaneMask indicating<br>

                which subregisters are represented.<br>

              * Various algorithms have been adapted to

              calculate/preserve subregister<br>

                liveness.<br>

              * The register allocator has been adapted to track

              interference at the<br>

                subregister level (LaneMasks are mapped to register

              units)<br>
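              <br>
              The class layout from the list above can be sketched in
              Python, populated with the segments of the earlier live
              range example. This mirrors only the names Segment,
              LiveRange, SubRange and LiveInterval; the methods live_at
              and live_lanes_at, the helpers slot and segs, and the
              encoding of slot indexes as (index, slot) pairs are
              assumptions made for this illustration and do not reflect
              the actual C++ interfaces in the patches.<br>
<pre>
```python
from dataclasses import dataclass, field

# Slot order within one instruction index:
# block, early-clobber, register, dead.
SLOT_ORDER = {"B": 0, "e": 1, "r": 2, "d": 3}


def slot(text):
    """Turn a string such as '16r' into a comparable (index, slot) pair."""
    return (int(text[:-1]), SLOT_ORDER[text[-1]])


@dataclass
class Segment:
    start: tuple
    end: tuple  # exclusive

    def contains(self, point):
        return self.start <= point < self.end


@dataclass
class LiveRange:
    segments: list = field(default_factory=list)

    def live_at(self, point):
        return any(s.contains(point) for s in self.segments)


@dataclass
class SubRange(LiveRange):
    lane_mask: int = 0


@dataclass
class LiveInterval(LiveRange):
    sub_ranges: list = field(default_factory=list)

    def live_lanes_at(self, point):
        """Bitwise OR of the lane masks of all subranges live at point."""
        mask = 0
        for sr in self.sub_ranges:
            if sr.live_at(point):
                mask |= sr.lane_mask
        return mask


def segs(*pairs):
    """Build a list of Segment objects from ('16r', '32r') style pairs."""
    return [Segment(slot(a), slot(b)) for a, b in pairs]


vreg0 = LiveInterval(
    segments=segs(("16r", "32r"), ("32r", "64r"), ("80r", "96r")),
    sub_ranges=[
        # ssub_0
        SubRange(segs(("16r", "64r"), ("80r", "80d")), lane_mask=0x0004),
        # ssub_1
        SubRange(segs(("32r", "48r"), ("80r", "96r")), lane_mask=0x0008),
    ],
)

print(hex(vreg0.live_lanes_at(slot("48B"))))  # 0xc
print(hex(vreg0.live_lanes_at(slot("48r"))))  # 0x4
print(hex(vreg0.live_lanes_at(slot("96B"))))  # 0x8
```
</pre>
              Just before instruction 48 both lanes are live; after the
              full use at 48 only ssub_0 remains live, and after the
              redefinition at 80 only ssub_1 is still needed.<br>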

              <br>

              Note that SubRegister liveness tracking has to be

              explicitly enabled by the<br>

              target architecture, as it does not provide enough

              benefits for the costs on<br>

              some targets (e.g. having subregister liveness for the

              lower/upper 8bit regs<br>

              on x86 provided nearly no benefits in the llvm-testsuite,

              so you can't justify<br>

              more computations/memory usage for that).<br>

              _______________________________________________<br>

              LLVM Developers mailing list<br>

              <a moz-do-not-send="true"

                href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a>

                      <a moz-do-not-send="true"

                href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>

              <a moz-do-not-send="true"

                href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev"

                target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>

            </blockquote>

          </div>

          <br>

        </div>

        <br>


      </blockquote>

      <br>

      <br>

    </div>

    <br>

  </body>

</html>