<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><base href="x-msg://1411/"><style><!--
/* Font Definitions */
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:SimSun;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:"\@SimSun";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
span.apple-converted-space
        {mso-style-name:apple-converted-space;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1106340423;
        mso-list-type:hybrid;
        mso-list-template-ids:475198538 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level2
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level3
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        text-indent:-9.0pt;}
@list l0:level4
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level5
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level6
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        text-indent:-9.0pt;}
@list l0:level7
        {mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level8
        {mso-level-number-format:alpha-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;}
@list l0:level9
        {mso-level-number-format:roman-lower;
        mso-level-tab-stop:none;
        mso-level-number-position:right;
        text-indent:-9.0pt;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hi Andy,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Actually, if we just add hooks that preserves the existing behavior, <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>It is not difficult. For example, <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>For case one, we can define one function like<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>  virtual const SCEV* getTargetPreferredWinnerReg(const SCEV*& ScaledReg,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>           SmallVector<const SCEV *, 4>& BaseRegs, GlobalValue*& BaseGV) const;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>In NarrowSearchSpaceByPickingWinnerRegs, we can preserves the winner<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>reg from target and winner reg from the original algorithm if this function <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>returns NULL, it is just like before <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>For case two, we can define a general cost from TTI function, like<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>  virtual int getLSRFormulaCost(const unsigned NumRegs,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                            const unsigned AddRecCost, const unsigned NumIVMuls,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                            const unsigned NumBaseAdds, const unsigned ImmCost,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                            const unsigned SetupCost) const;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Then we do something like <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>  int thisCost = TTI->getLSRFormulaCost(NumRegs, AddRecCost, NumIVMuls,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                                           NumBaseAdds, ImmCost, SetupCost);<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>  if (thisCost >= 0) {<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>    int otherCost = TTI->getLSRFormulaCost(Other.NumRegs, Other.AddRecCost,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                                            Other.NumIVMuls, Other.NumBaseAdds,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>                                            Other.ImmCost, Other.SetupCost);<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>    if (otherCost >= 0)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>      return thisCost < otherCost;<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>  }<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>In bool Cost::operator<(const Cost &Other) const <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>We could have more decision from target backend.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>However, from the problem I am dealing with, which has a lot of branches in multiple level<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Loop nests. LSR is still not able to perform the best because <o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>1.<span style='font:7.0pt "Times New Roman"'>       </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>LSR is not control flow sensitive. It treats all USE equally, which doesn’t care which <o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>USE is on critical path and which USE is on a branch. Without these kind of information,<o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>We cannot predict AddRec precisely because we only can assume all USEs can be post<o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Increment or all not.<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>2.<span style='font:7.0pt "Times New Roman"'>       </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>The most occurred winner regs pruning may not be the best approach. Because target<o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>may prefer certain regs than others, even some registers do occur more. Specially, <o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>register with small computation is more likely to occur in formulas. However, register<o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>with small computation may not always be a best choice if the content in register are<o:p></o:p></span></p><p class=MsoListParagraph><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>loop invariant.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Therefore,  We may need a systemic agreement or plan to address the existing LSR problems. I<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>would like to ask if any party has any improvement plan about LSR? So we can come together<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>to have an unified solution to handle all known problem in one round?<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanks,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Yin <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Andrew Trick [mailto:atrick@apple.com] <br><b>Sent:</b> Thursday, March 14, 2013 9:42 AM<br><b>To:</b> Yin Ma<br><b>Cc:</b> llvmdev@cs.uiuc.edu<br><b>Subject:</b> Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Mar 13, 2013, at 4:37 PM, Yin Ma <<a href="mailto:yinma@codeaurora.org">yinma@codeaurora.org</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><br><br><o:p></o:p></p><div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Hi All,<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>In the target I am working, we comes cross a situation that the loop strength reduction<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>could deliver a better result but currently not, because<o:p></o:p></span></p></div><div style='margin-left:.5in'><p class=MsoNormal style='text-indent:-.25in'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>1.</span><span style='font-size:7.0pt'>      <span class=apple-converted-space> </span></span><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>the algorithm narrows search space by winner registers without considering<o:p></o:p></span></p></div><div style='margin-left:.5in'><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs)<o:p></o:p></span></p></div><div style='margin-left:.5in'><p class=MsoNormal style='text-indent:-.25in'><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>2.</span><span style='font-size:7.0pt'>      <span class=apple-converted-space> </span></span><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Cost comparison solely favors the number register without considering other<o:p></o:p></span></p></div><div style='margin-left:.5in'><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Impacts.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>For the case one,<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>NarrowSearchSpaceByPickingWinnerRegs filters by most occurred registers.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>ld(basereg, immediate) is a target preferred addressing mode. However, it may<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>be deleted because basereg is very likely not to be the most occurred register<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>because the less opportunity in a combination.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>For the case two, by observing the cost comparison equation<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>bool Cost::operator<(const Cost &Other) const {<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (NumRegs != Other.NumRegs)                            return NumRegs < Other.NumRegs;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (AddRecCost != Other.AddRecCost)                  return AddRecCost < Other.AddRecCost;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (NumIVMuls != Other.NumIVMuls)                   return NumIVMuls < Other.NumIVMuls;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (NumBaseAdds != Other.NumBaseAdds)       return NumBaseAdds < Other.NumBaseAdds;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (ImmCost != Other.ImmCost)                               return ImmCost < Other.ImmCost;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  if (SetupCost != Other.SetupCost)                         return SetupCost < Other.SetupCost;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>  return false;<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>}<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>If we have a case to compare<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm cost, plus 4 setup cost.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm cost, plus 2 setup cost.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>The current mode will select 4 regs case even there are 14 more base adds. And base<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Adds matters in our targets.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'> <o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>So I think the current LSR should be pushing more decision into target dependent backend.<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>Like calling new functions in TargetTransformInfo. At least, in narrow search space and cost<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>comparison phase, or more in cost rating phase. LSR can be tightly cooped with the target<o:p></o:p></span></p></div><div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif"'>attributes in order to get the most beneficial result.<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Yes. LSR decisions are tightly coupled with the target architecture and potentially the subtarget microarcthitecture. As you figured out, the way to handle it is to communicate more information to LSR through TTI. It's easy to do this to solve individual benchmarks on your target. It's hard to know if you have a general solution that works across targets. But if you can add hooks in a way that preserves existing behavior on other targets it shouldn't be a problem. We want to design general hooks, but leave it up to someone doing the benchmarking to tune them for a particular target.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>-Andy<o:p></o:p></p></div></div></body></html>