<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Oct 9, 2020, at 12:07 AM, Dominik Montada <<a href="mailto:dominik.montada@hightec-rt.com" class="">dominik.montada@hightec-rt.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
  
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
  
  <div class=""><p class="">Hi Quentin,<br class="">
    </p>
    <div class="moz-cite-prefix">Am 08.10.20 um 21:17 schrieb Quentin
      Colombet:<br class="">
    </div>
    <blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
      Hi Dominik,<br class="">
      <div class=""><br class="">
        <blockquote type="cite" class="">
          <div class="">On Oct 8, 2020, at 5:03 AM, Dominik Montada <<a href="mailto:dominik.montada@hightec-rt.com" class="" moz-do-not-send="true">dominik.montada@hightec-rt.com</a>>
            wrote:</div>
          <br class="Apple-interchange-newline">
          <div class="">
            <meta http-equiv="Content-Type" content="text/html;
              charset=UTF-8" class="">
            <div class=""><p class="">Hi Quentin,</p><p class="">thanks for picking up the conversation!<br class="">
              </p>
              <div class=""><p class="">> I think we should step back and check
                  what we want before investing any time in some
                  rewrite.</p><p class="">That is a very fair point and I might have
                  been getting ahead of myself in my last email.<br class="">
                  What I would like to see from RegBankSelect is to
                  produce the mapping with the overall lowest cost.
                  Keeping track of all different combinations of
                  mappings will certainly be non-trivial however, so I
                  wonder if there is a smart way to do this without
                  spending too much compilation time. <br class="">
                </p><p class="">Ideally for instructions with no operands
                  (like G_CONSTANT) it could also check whether a
                  cross-bank copy is actually worth it or if it would be
                  more beneficial to simply rematerialize the
                  instruction on the required bank. For such
                  instructions this information should already be
                  available as part of the cost-modelling in
                  RegBankSelect: we could simply compare the cost of a
                  mapping on the required bank vs. the cost of a
                  cross-bank copy.<br class="">
                </p><p class="">Would you see this as a valid direction for
                  RegBankSelect?</p>
              </div>
            </div>
          </div>
        </blockquote>
        Yes, I think this is a valid direction. I actually think we
        shouldn’t restrict ourselves to instructions with no operands.
        We could always duplicate next to the original instruction,
        i.e., we wouldn’t have any issue materializing the arguments.</div>
    </blockquote>
    Sure, I guess that would also work as long we don't have to
    introduce copies for the operands. Theoretically this could even try
    to duplicate the operands if overall it's still cheaper than a
    cross-bank copy but that is probably some pandoras box just waiting
    to be opened :)
    </div></div></blockquote><div><br class=""></div><div>Hehe!</div><div>Ideally we would compute the costs globally and do the code transformation only once. But I am guessing this is wishful thinking.</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
      <div class="">The one thing that needs work is the cost model.</div>
    </blockquote><p class="">What would you say does the cost model need in its current state?</p><div class=""><br class=""></div></div></div></blockquote><div><br class=""></div>I don’t remember how everything works so take it with a grain of salt.</div><div>I’d say we would need to take into account how much it would cost to duplicate the definitions and compute the repairing cost with those duplications.</div><div>The thing is it may become compile time intensive pretty quickly if we want to consider all the possibilities.</div><div><br class=""></div><div>As a starter we could focus on the best mapping for the current instruction and try to match the desired regbanks from each operand and only evaluate that cost against just plain repairing.</div><div><br class=""></div><div>In any case, I think it would require a non-trivial amount of work.</div><div><blockquote type="cite" class=""><div class=""><div class=""><p class="">By the way, does this have anything to do with what Matt was
      talking about during the round table? Unfortunately I don't
      remember exactly what his issues with RegBankSelect were but I'd
      be interested to know whether what we talked about would also
      benefit him.</p><div class=""><br class=""></div></div></div></blockquote><div><br class=""></div>Partially. IIRC Matt’s problem was also how we apply the mapping and how, right now, we have to abuse the observer mechanism to do what we want.<br class=""><br class=""><blockquote type="cite" class=""><div class=""><div class=""><p class="">I do remember that AMDGPU is apparently doing something very
      different compared to AArch64 for example and I'd be interested to
      know the reasoning behind the different approaches.<br class="">
    </p><p class="">Cheers,</p><p class="">Dominik<br class="">
    </p>
    <blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
      <div class=""><br class="">
      </div>
      <div class="">Cheers,</div>
      <div class="">-Quentin</div>
      <div class=""><br class="">
        <blockquote type="cite" class="">
          <div class="">
            <div class="">
              <div class=""><p class="">Best regards,</p><p class="">Dominik<br class="">
                </p>
              </div>
              <div class=""><br class="">
              </div>
              <div class="moz-cite-prefix">Am 07.10.20 um 19:47 schrieb
                Quentin Colombet:<br class="">
              </div>
              <blockquote type="cite" cite="mid:6932924D-B430-4259-9176-6A4BE77D64B0@apple.com" class="">
                <meta http-equiv="Content-Type" content="text/html;
                  charset=UTF-8" class="">
                Hi Dominik,
                <div class=""><br class="">
                </div>
                <div class="">Thanks for sending this!<br class="">
                  <div class=""><br class="">
                    <blockquote type="cite" class="">
                      <div class="">On Oct 7, 2020, at 5:21 AM, Dominik
                        Montada via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
                        wrote:</div>
                      <br class="Apple-interchange-newline">
                      <div class="">
                        <div class="">Hi all,<br class="">
                          <br class="">
                          this is the second email for the round table
                          follow-up, this time regarding the issues
                          around the greedy RegBankSelect and
                          alternative mappings.<br class="">
                          <br class="">
                          The issue I brought up was that because
                          RegBankSelect goes top-down, it never looks at
                          all available mappings for the operands when
                          considering which of the mappings to apply to
                          the current instruction. In our architecture
                          we have one register bank dedicated to
                          pointers and another one for anything else. We
                          often see code where we have a G_PTR_ADD with
                          a constant. Since the constant is not a
                          pointer, we put it on the other register bank.
                          We could put it on the address regbank and do
                          provide alternative mappings for that, but
                          since the greedy algorithm doesn't actually
                          check the usage of the constant, it is always
                          put on the other bank.<br class="">
                        </div>
                      </div>
                    </blockquote>
                    <div class=""><br class="">
                    </div>
                    <div class="">The intent behind the greedy algorithm
                      was for the mapping to look at X instructions
                      ahead, depending on the optimization level, when
                      assigning one instruction.</div>
                    <div class="">Right now the window is simply 1
                      instruction and the code is not structured to
                      allow to have more than one.</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">As we gain more insights on what we
                      want RegBankSelect to do, it makes sense to
                      redesign it. </div>
                    <br class="">
                    <blockquote type="cite" class="">
                      <div class="">
                        <div class=""><br class="">
                          When RegBankSelect then sees the G_PTR_ADD and
                          sees that one of its inputs is on the other
                          register bank already, it then inserts a
                          costly cross-bank copy instead of checking if
                          that operand has any alternative mappings
                          which would make the overall mapping for the
                          current instruction cheaper.<br class="">
                        </div>
                      </div>
                    </blockquote>
                    <div class=""><br class="">
                    </div>
                    <div class="">The idea is when a mapping is done,
                      that means it was the best one at the time of the
                      decision (greedy), thus we don’t challenge that.</div>
                    <div class="">“Reverting” an already set mapping is
                      not that simple because yes for this particular
                      use we are inserting costly cross copies, but
                      there are no guarantees that replacing the mapping
                      of the definition will not insert even more costly
                      copies for the other uses.</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">E.g., consider:</div>
                    <div class="">```</div>
                    <div class=""><font class="" face="Menlo">A = def
                        <— RBS starts here</font></div>
                    <div class=""><font class="" face="Menlo">= useFP A</font></div>
                    <div class=""><font class="" face="Menlo">= useInt A</font></div>
                    <div class=""><font class="" face="Menlo">= useInt A</font></div>
                    <div class="">```</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">Let’s assume that greedy works the way
                      it is intended. I.e., it assigns A to the int
                      register bank because there are 2 such uses vs.
                      only 1 fp bank use:</div>
                    <div class="">
                      <div class="">```</div>
                      <div class=""><font class="" face="Menlo">A<int>
                          = def</font></div>
                      <div class=""><font class="" face="Menlo">= useFP
                          A<int> <— Next, RBS looks at this one</font></div>
                      <div class=""><font class="" face="Menlo">= useInt
                          A<int></font></div>
                      <div class=""><font class="" face="Menlo">= useInt
                          A<int></font></div>
                      <div class="">```</div>
                    </div>
                    <div class=""><br class="">
                    </div>
                    Now the regbank for useFP is not right and has to be
                    repaired. Right now, we will insert the costly cross
                    copy for that use:</div>
                  <div class="">
                    <div class="">
                      <div class="">```</div>
                      <div class=""><font class="" face="Menlo">A<int>
                          = def</font></div>
                      <div class=""><font class="" face="Menlo">AFP<fp>
                          = cross_copy A<int></font></div>
                      <div class=""><font class="" face="Menlo">= useFP
                          AFP<fp></font></div>
                      <div class=""><font class="" face="Menlo">= useInt
                          A<int></font></div>
                      <div class=""><font class="" face="Menlo">= useInt
                          A<int></font></div>
                      <div class="">```</div>
                      <div class=""><br class="">
                      </div>
                      <div class="">Now, if we were to change the
                        definition of A to avoid this copy we would
                        create two costly copy for the useInt. Actually,
                        another question is what would we do when we
                        look at the first useInt? Is this use allowed to
                        change the definition of the instruction again?
                        Do we duplicate the definition? Etc..</div>
                      <div class="">
                        <div class="">```</div>
                        <div class=""><font class="" face="Menlo">A<fp>
                            = def <— reassign</font></div>
                        <div class=""><span style="font-family: Menlo;" class="">= useFP AFP<fp></span></div>
                        <div class=""><font class="" face="Menlo">AInt1<int>
                            = cross_copy A<fp></font></div>
                        <div class=""><font class="" face="Menlo">=
                            useInt AInt1<int></font></div>
                        <div class=""><font class="" face="Menlo">AInt2<int>
                            = cross_copy A<fp></font></div>
                        <div class=""><font class="" face="Menlo">=
                            useInt AInt2<int></font></div>
                        <div class="">```</div>
                      </div>
                      <div class=""><br class="">
                      </div>
                      <div class="">Bottom line, if we allow to modify
                        the assignments of the definition, the decision
                        making is not local anymore and in particular
                        may require to add repairing code all over the
                        place. As a result the cost model becomes much
                        more complicated.</div>
                      <div class=""><br class="">
                      </div>
                    </div>
                    <blockquote type="cite" class="">
                      <div class="">
                        <div class=""><br class="">
                          Matt suggested that RegBankSelect should
                          probably go bottom-up instead and I agree with
                          him. I don't think there is a particular
                          reason why RegBankSelect necessarily has to go
                          top-down.<br class="">
                        </div>
                      </div>
                    </blockquote>
                    <div class=""><br class="">
                    </div>
                    <div class="">The rationale for going top-down is
                      that when you reach an instruction, all the
                      operands are assigned so you know what would be
                      the cost of repairing.</div>
                    <div class="">You could said that the problem is the
                      same for definitions when going bottom-up.
                      However, this is likely to be more problematic
                      because usually you have fewer definitions than
                      arguments on each individual instruction,
                      therefore there is more guess work going on (e.g.,
                      top-down, you assume a cost for 1 definition,
                      going bottom-up you have to assume a cost for 2
                      arguments or more precisely you would have to
                      track a window of X instructions for 2 arguments
                      instead of 1 definition.)</div>
                    <br class="">
                    <blockquote type="cite" class="">
                      <div class="">
                        <div class=""><br class="">
                          I'm not too familiar with the implementation
                          of RegBankSelect. Would it be a big effort to
                          make it work bottom-up instead? I'm guessing
                          one of the biggest areas would be the check
                          whether a cross-bank copy is needed as well as
                          calculating the overall cost for alternative
                          mappings as now all usages of the current
                          instruction would have to be checked instead
                          of the much more limited operands. How big of
                          an impact would this have?<br class="">
                        </div>
                      </div>
                    </blockquote>
                    <div class=""><br class="">
                    </div>
                    <div class="">It’s been a while since I looked at
                      the implementation but I would expect this to be
                      significant.</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">I think we should step back and check
                      what we want before investing any time in some
                      rewrite. For instance, I don’t see what bottom-up
                      fundamentally gives us. It seems like a workaround
                      to me.</div>
                    <div class="">That said, it would work either way!</div>
                    <div class=""><br class="">
                    </div>
                    <div class="">Cheers,</div>
                    <div class="">-Quentin</div>
                    <br class="">
                    <blockquote type="cite" class="">
                      <div class="">
                        <div class=""><br class="">
                          Cheers,<br class="">
                          <br class="">
                          Dominik<br class="">
                          <br class="">
_______________________________________________<br class="">
                          LLVM Developers mailing list<br class="">
                          <a href="mailto:llvm-dev@lists.llvm.org" class="" moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br class="">
                          <a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class="">
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br class="">
                </div>
              </blockquote>
              <pre class="moz-signature" cols="72">-- 
----------------------------------------------------------------------
Dominik Montada                   Email: <a class="moz-txt-link-abbreviated" href="mailto:dominik.montada@hightec-rt.com" moz-do-not-send="true">dominik.montada@hightec-rt.com</a>
HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
Europaallee 19                    Fax:   +49-681-92613-26
D-66113 Saarbrücken               WWW: <a class="moz-txt-link-freetext" href="http://www.hightec-rt.com/" moz-do-not-send="true">http://www.hightec-rt.com</a>

Managing Director: Vera Strothmann
Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222

This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient please notify the sender immediately
and destroy this e-mail. Any unauthorised copying, disclosure or
distribution of the material in this e-mail is strictly forbidden.
--- </pre>
            </div>
          </div>
        </blockquote>
      </div>
      <br class="">
    </blockquote>
    <pre class="moz-signature" cols="72">-- 
----------------------------------------------------------------------
Dominik Montada                   Email: <a class="moz-txt-link-abbreviated" href="mailto:dominik.montada@hightec-rt.com">dominik.montada@hightec-rt.com</a>
HighTec EDV-Systeme GmbH          Phone: +49 681 92613 19
Europaallee 19                    Fax:   +49-681-92613-26
D-66113 Saarbrücken               WWW: <a class="moz-txt-link-freetext" href="http://www.hightec-rt.com/">http://www.hightec-rt.com</a>

Managing Director: Vera Strothmann
Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222

This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient please notify the sender immediately
and destroy this e-mail. Any unauthorised copying, disclosure or
distribution of the material in this e-mail is strictly forbidden.
--- </pre>
  </div>

</div></blockquote></div><br class=""></body></html>