<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Oct 9, 2020, at 12:07 AM, Dominik Montada <<a href="mailto:dominik.montada@hightec-rt.com" class="">dominik.montada@hightec-rt.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
<div class=""><p class="">Hi Quentin,<br class="">
</p>
<div class="moz-cite-prefix">On 08.10.20 at 21:17, Quentin
Colombet wrote:<br class="">
</div>
<blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
Hi Dominik,<br class="">
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">On Oct 8, 2020, at 5:03 AM, Dominik Montada <<a href="mailto:dominik.montada@hightec-rt.com" class="" moz-do-not-send="true">dominik.montada@hightec-rt.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8" class="">
<div class=""><p class="">Hi Quentin,</p><p class="">thanks for picking up the conversation!<br class="">
</p>
<div class=""><p class="">> I think we should step back and check
what we want before investing any time in some
rewrite.</p><p class="">That is a very fair point and I might have
been getting ahead of myself in my last email.<br class="">
What I would like to see from RegBankSelect is to
produce the mapping with the overall lowest cost.
Keeping track of all different combinations of
mappings will certainly be non-trivial however, so I
wonder if there is a smart way to do this without
spending too much compilation time. <br class="">
</p><p class="">Ideally for instructions with no operands
(like G_CONSTANT) it could also check whether a
cross-bank copy is actually worth it or if it would be
more beneficial to simply rematerialize the
instruction on the required bank. For such
instructions this information should already be
available as part of the cost-modelling in
RegBankSelect: we could simply compare the cost of a
mapping on the required bank vs. the cost of a
cross-bank copy.<br class="">
</p><p class="">Would you see this as a valid direction for
RegBankSelect?</p>
</div>
</div>
</div>
</blockquote>
Yes, I think this is a valid direction. I actually think we
shouldn’t restrict ourselves to instructions with no operands.
We could always duplicate next to the original instruction,
i.e., we wouldn’t have any issue materializing the arguments.</div>
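<div class="">A minimal sketch of that comparison, written in Python for brevity rather than as LLVM code. The function name and the cost numbers are hypothetical; the only assumption is that the target can price both a cross-bank copy and a duplicate of the defining instruction on the required bank:</div>
<pre class="">
```python
def choose_repair(copy_cost, duplicate_cost):
    """Pick the cheaper way to make a value available on the required bank.

    Both arguments are hypothetical, target-provided costs:
    copy_cost      - cost of a cross-bank copy of the existing value
    duplicate_cost - cost of re-emitting the defining instruction on the
                     required bank (assumes its operands need no repairing)
    """
    # Prefer the copy on ties, since it does not duplicate any instruction.
    if min(copy_cost, duplicate_cost) == copy_cost:
        return "cross_copy"
    return "duplicate"


# Made-up costs: a G_CONSTANT is assumed to be cheap to re-emit.
print(choose_repair(copy_cost=4, duplicate_cost=1))  # duplicate
print(choose_repair(copy_cost=1, duplicate_cost=1))  # cross_copy
```
</pre>
<div class="">For an instruction with operands, duplicate_cost would also have to include any repairing of those operands, which is where the cost model needs work.</div>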
</blockquote>
Sure, I guess that would also work as long as we don't have to
introduce copies for the operands. Theoretically this could even try
to duplicate the operands if that is still cheaper overall than a
cross-bank copy, but that is probably a Pandora's box just waiting
to be opened :)
</div></div></blockquote><div><br class=""></div><div>Hehe!</div><div>Ideally we would compute the costs globally and do the code transformation only once. But I am guessing this is wishful thinking.</div><div><br class=""></div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
<div class="">The one thing that needs work is the cost model.</div>
</blockquote><p class="">What would you say the cost model needs in its current state?</p><div class=""><br class=""></div></div></div></blockquote><div><br class=""></div>I don’t remember how everything works, so take this with a grain of salt.</div><div>I’d say we would need to take into account how much it would cost to duplicate the definitions and compute the repairing cost with those duplications.</div><div>The thing is it may become compile-time intensive pretty quickly if we want to consider all the possibilities.</div><div><br class=""></div><div>As a start, we could focus on the best mapping for the current instruction and try to match the desired regbanks from each operand and only evaluate that cost against just plain repairing.</div><div><br class=""></div><div>In any case, I think it would require a non-trivial amount of work.</div><div><blockquote type="cite" class=""><div class=""><div class=""><p class="">By the way, does this have anything to do with what Matt was
talking about during the round table? Unfortunately I don't
remember exactly what his issues with RegBankSelect were but I'd
be interested to know whether what we talked about would also
benefit him.</p><div class=""><br class=""></div></div></div></blockquote><div><br class=""></div>Partially. IIRC Matt’s problem was also how we apply the mapping and how, right now, we have to abuse the observer mechanism to do what we want.<br class=""><br class=""><blockquote type="cite" class=""><div class=""><div class=""><p class="">I do remember that AMDGPU is apparently doing something very
different compared to AArch64 for example and I'd be interested to
know the reasoning behind the different approaches.<br class="">
</p><p class="">Cheers,</p><p class="">Dominik<br class="">
</p>
<blockquote type="cite" cite="mid:2242BC49-DB0F-4F67-B7F4-A8615478D601@apple.com" class="">
<div class=""><br class="">
</div>
<div class="">Cheers,</div>
<div class="">-Quentin</div>
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">
<div class=""><p class="">Best regards,</p><p class="">Dominik<br class="">
</p>
</div>
<div class=""><br class="">
</div>
<div class="moz-cite-prefix">On 07.10.20 at 19:47,
Quentin Colombet wrote:<br class="">
</div>
<blockquote type="cite" cite="mid:6932924D-B430-4259-9176-6A4BE77D64B0@apple.com" class="">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8" class="">
Hi Dominik,
<div class=""><br class="">
</div>
<div class="">Thanks for sending this!<br class="">
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">On Oct 7, 2020, at 5:21 AM, Dominik
Montada via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">Hi all,<br class="">
<br class="">
this is the second email for the round table
follow-up, this time regarding the issues
around the greedy RegBankSelect and
alternative mappings.<br class="">
<br class="">
The issue I brought up was that because
RegBankSelect goes top-down, it never looks at
all available mappings for the operands when
considering which of the mappings to apply to
the current instruction. In our architecture
we have one register bank dedicated to
pointers and another one for anything else. We
often see code where we have a G_PTR_ADD with
a constant. Since the constant is not a
pointer, we put it on the other register bank.
We could put it on the address regbank and do
provide alternative mappings for that, but
since the greedy algorithm doesn't actually
check the usage of the constant, it is always
put on the other bank.<br class="">
</div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">The intent behind the greedy algorithm
was for the mapping to look at X instructions
ahead, depending on the optimization level, when
assigning one instruction.</div>
<div class="">Right now the window is simply 1
instruction and the code is not structured to
allow more than one.</div>
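<div class="">To illustrate what a larger window could buy for the G_PTR_ADD case, here is a small Python sketch (not LLVM code). All names are hypothetical, and it assumes that `window` counts only the following uses that get inspected, so 0 stands for looking at the defining instruction alone:</div>
<pre class="">
```python
from collections import Counter


def pick_bank(default_bank, upcoming_use_banks, window):
    """Choose a bank for a definition by looking at up to `window` uses.

    default_bank       - bank picked when no use is inspected
    upcoming_use_banks - banks wanted by the following uses, in program order
    window             - how many of those uses the pass may inspect
    """
    seen = upcoming_use_banks[:window]
    if not seen:
        return default_bank
    votes = Counter(seen)
    # Sort by (-votes, name) so that ties are broken deterministically.
    return sorted(votes, key=lambda bank: (-votes[bank], bank))[0]


# A constant feeding a single G_PTR_ADD that wants it on the address bank.
uses = ["addr"]
print(pick_bank("data", uses, window=0))  # data
print(pick_bank("data", uses, window=1))  # addr
```
</pre>
<div class="">A simple majority vote is only one possible policy; a real implementation would presumably weigh the actual copy costs instead of counting uses.</div>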
<div class=""><br class="">
</div>
<div class="">As we gain more insight into what we
want RegBankSelect to do, it makes sense to
redesign it.</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div class=""><br class="">
When RegBankSelect then sees the G_PTR_ADD and
sees that one of its inputs is on the other
register bank already, it then inserts a
costly cross-bank copy instead of checking if
that operand has any alternative mappings
which would make the overall mapping for the
current instruction cheaper.<br class="">
</div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">The idea is that once a mapping is chosen,
it was the best one at the time of the
decision (greedy), so we don’t challenge it.</div>
<div class="">“Reverting” an already set mapping is
not that simple: yes, for this particular
use we are inserting costly cross copies, but
there is no guarantee that replacing the mapping
of the definition will not insert even more costly
copies for the other uses.</div>
<div class=""><br class="">
</div>
<div class="">E.g., consider:</div>
<div class="">```</div>
<div class=""><font class="" face="Menlo">A = def
&lt;— RBS starts here</font></div>
<div class=""><font class="" face="Menlo">= useFP A</font></div>
<div class=""><font class="" face="Menlo">= useInt A</font></div>
<div class=""><font class="" face="Menlo">= useInt A</font></div>
<div class="">```</div>
<div class=""><br class="">
</div>
<div class="">Let’s assume that greedy works the way
it is intended, i.e., it assigns A to the int
register bank because there are 2 such uses vs.
only 1 fp bank use:</div>
<div class="">
<div class="">```</div>
<div class=""><font class="" face="Menlo">A&lt;int&gt;
= def</font></div>
<div class=""><font class="" face="Menlo">= useFP
A&lt;int&gt; &lt;— Next, RBS looks at this one</font></div>
<div class=""><font class="" face="Menlo">= useInt
A&lt;int&gt;</font></div>
<div class=""><font class="" face="Menlo">= useInt
A&lt;int&gt;</font></div>
<div class="">```</div>
</div>
<div class=""><br class="">
</div>
Now the regbank for useFP is not right and has to be
repaired. Right now, we will insert the costly cross
copy for that use:</div>
<div class="">
<div class="">
<div class="">```</div>
<div class=""><font class="" face="Menlo">A&lt;int&gt;
= def</font></div>
<div class=""><font class="" face="Menlo">AFP&lt;fp&gt;
= cross_copy A&lt;int&gt;</font></div>
<div class=""><font class="" face="Menlo">= useFP
AFP&lt;fp&gt;</font></div>
<div class=""><font class="" face="Menlo">= useInt
A&lt;int&gt;</font></div>
<div class=""><font class="" face="Menlo">= useInt
A&lt;int&gt;</font></div>
<div class="">```</div>
<div class=""><br class="">
</div>
<div class="">Now, if we were to change the
definition of A to avoid this copy, we would
create two costly copies for the useInt instructions. Actually,
another question is what we would do when we
look at the first useInt. Is this use allowed to
change the definition of the instruction again?
Do we duplicate the definition? Etc.</div>
<div class="">
<div class="">```</div>
<div class=""><font class="" face="Menlo">A&lt;fp&gt;
= def &lt;— reassign</font></div>
<div class=""><span style="font-family: Menlo;" class="">= useFP A&lt;fp&gt;</span></div>
<div class=""><font class="" face="Menlo">AInt1&lt;int&gt;
= cross_copy A&lt;fp&gt;</font></div>
<div class=""><font class="" face="Menlo">=
useInt AInt1&lt;int&gt;</font></div>
<div class=""><font class="" face="Menlo">AInt2&lt;int&gt;
= cross_copy A&lt;fp&gt;</font></div>
<div class=""><font class="" face="Menlo">=
useInt AInt2&lt;int&gt;</font></div>
<div class="">```</div>
</div>
<div class=""><br class="">
</div>
<div class="">Bottom line, if we allow modifying
the assignment of a definition, the decision
making is not local anymore and in particular
may require adding repairing code all over the
place. As a result the cost model becomes much
more complicated.</div>
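<div class="">The trade-off in the example can be put into numbers with a small Python sketch (not LLVM code). The function name and the uniform per-copy cost are assumptions made for illustration only:</div>
<pre class="">
```python
def repair_cost(def_bank, use_banks, copy_cost=1):
    """Total cost of cross-bank copies if the definition lives on def_bank.

    One copy is charged per use that wants a different bank, matching the
    examples where every mismatching use gets its own cross_copy.
    copy_cost is a hypothetical, uniform per-copy cost.
    """
    return copy_cost * sum(1 for bank in use_banks if bank != def_bank)


uses_of_a = ["fp", "int", "int"]  # useFP, useInt, useInt
costs = {bank: repair_cost(bank, uses_of_a) for bank in ("int", "fp")}
print(costs)                      # {'int': 1, 'fp': 2}
print(min(costs, key=costs.get))  # int
```
</pre>
<div class="">Note that this needs all uses of A at once, which is exactly the non-local information a greedy top-down pass does not have when it assigns the definition.</div>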
<div class=""><br class="">
</div>
</div>
<blockquote type="cite" class="">
<div class="">
<div class=""><br class="">
Matt suggested that RegBankSelect should
probably go bottom-up instead and I agree with
him. I don't think there is a particular
reason why RegBankSelect necessarily has to go
top-down.<br class="">
</div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">The rationale for going top-down is
that when you reach an instruction, all the
operands are assigned so you know what would be
the cost of repairing.</div>
<div class="">You could say that the problem is the
same for definitions when going bottom-up.
However, this is likely to be more problematic
because usually you have fewer definitions than
arguments on each individual instruction,
therefore there is more guesswork going on (e.g.,
top-down, you assume a cost for 1 definition;
going bottom-up you have to assume a cost for 2
arguments, or more precisely you would have to
track a window of X instructions for 2 arguments
instead of 1 definition).</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div class=""><br class="">
I'm not too familiar with the implementation
of RegBankSelect. Would it be a big effort to
make it work bottom-up instead? I'm guessing
one of the biggest areas would be the check
whether a cross-bank copy is needed as well as
calculating the overall cost for alternative
mappings as now all usages of the current
instruction would have to be checked instead
of the much more limited operands. How big of
an impact would this have?<br class="">
</div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">It’s been a while since I looked at
the implementation but I would expect this to be
significant.</div>
<div class=""><br class="">
</div>
<div class="">I think we should step back and check
what we want before investing any time in some
rewrite. For instance, I don’t see what bottom-up
fundamentally gives us. It seems like a workaround
to me.</div>
<div class="">That said, it would work either way!</div>
<div class=""><br class="">
</div>
<div class="">Cheers,</div>
<div class="">-Quentin</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div class=""><br class="">
Cheers,<br class="">
<br class="">
Dominik<br class="">
<br class="">
_______________________________________________<br class="">
LLVM Developers mailing list<br class="">
<a href="mailto:llvm-dev@lists.llvm.org" class="" moz-do-not-send="true">llvm-dev@lists.llvm.org</a><br class="">
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
----------------------------------------------------------------------
Dominik Montada Email: <a class="moz-txt-link-abbreviated" href="mailto:dominik.montada@hightec-rt.com" moz-do-not-send="true">dominik.montada@hightec-rt.com</a>
HighTec EDV-Systeme GmbH Phone: +49 681 92613 19
Europaallee 19 Fax: +49-681-92613-26
D-66113 Saarbrücken WWW: <a class="moz-txt-link-freetext" href="http://www.hightec-rt.com/" moz-do-not-send="true">http://www.hightec-rt.com</a>
Managing Director: Vera Strothmann
Register Court: Saarbrücken, HRB 10445, VAT ID: DE 138344222
This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient please notify the sender immediately
and destroy this e-mail. Any unauthorised copying, disclosure or
distribution of the material in this e-mail is strictly forbidden.
--- </pre>
</div>
</div>
</blockquote>
</div>
<br class="">
</blockquote>
</div>
</div></blockquote></div><br class=""></body></html>