<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{mso-style-priority:99;
mso-style-link:"Plain Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle17
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle18
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.PlainTextChar
{mso-style-name:"Plain Text Char";
mso-style-priority:99;
mso-style-link:"Plain Text";
font-family:"Calibri",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Not to hijack this discussion, but this is related to something I’ve been looking at (and have discussed with Hal and James previously), which is select vs. phi canonicalization.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Currently SimplifyCFG does most of the converting of phis to selects in IR. There is also early and late if-conversion in some backends, but let’s ignore that for now.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>CodeGenPrepare does the reverse (i.e. select -> phi/branch) for some targets and some selects.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>SimplifyCFG uses a target instruction cost model of the speculated instructions to decide whether to do phi -> select conversion. It currently uses a cutoff of 2*Basic if I recall correctly.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I’d like to start a discussion on what the “right” canonical representation of these operations is over the phases of optimization. It seems to me that keeping branches around longer would be better since it would allow global optimizations to only have to worry about phis. It would also create blocks where instructions could be sunk. For example:<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoPlainText> int selects(int a, int b, int c, int d) {<o:p></o:p></p><p class=MsoPlainText> int x1, x2, x3;<o:p></o:p></p><p class=MsoPlainText> <o:p></o:p></p><p class=MsoPlainText> x1 = a / b;<o:p></o:p></p><p class=MsoPlainText> x2 = b / c;<o:p></o:p></p><p class=MsoPlainText> x3 = c / d;<o:p></o:p></p><p class=MsoPlainText> if (a < b) {<o:p></o:p></p><p class=MsoPlainText> x1 = 0;<o:p></o:p></p><p class=MsoPlainText> x2 = 0;<o:p></o:p></p><p class=MsoPlainText> x3 = 0;<o:p></o:p></p><p class=MsoPlainText> }<o:p></o:p></p><p class=MsoPlainText> <o:p></o:p></p><p class=MsoPlainText> return x1 + x2 + x3;<o:p></o:p></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>If we don’t convert the above to selects, then the CodeSinking pass is able to sink the divides into an else block that it creates. (As an aside this brings up the question of canonical location of instructions w.r.t. sinking/forwarding, which I haven’t seen discussed either).<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I believe the downside to keeping phis/branches in the IR (someone please correct me if I’m wrong here), is that we break up the scope of later non-global passes. I don’t have a feel for how big of a problem this is or how hard it would be to fix by e.g. extending these passes to work on single-entry single-exit regions or even superblocks.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Any thoughts on this?<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><div style='mso-element:para-border-div;border:dashed #2F6FAB 1.0pt;padding:12.0pt 12.0pt 12.0pt 12.0pt;background:#F9F9F9'><p class=MsoNormal style='line-height:15.6pt;background:#F9F9F9;border:none;padding:0in'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>--<o:p></o:p></span></p><p class=MsoNormal style='line-height:15.6pt;background:#F9F9F9;border:none;padding:0in'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Geoff Berry<o:p></o:p></span></p><p class=MsoNormal style='line-height:15.6pt;background:#F9F9F9;border:none;padding:0in'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Employee of Qualcomm Innovation Center, Inc.<o:p></o:p></span></p><p class=MsoNormal style='line-height:15.6pt;background:#F9F9F9;border:none;padding:0in'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project<o:p></o:p></span></p></div><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Daniel Berlin [mailto:dberlin@dberlin.org] <br><b>Sent:</b> Wednesday, March 25, 2015 10:25 AM<br><b>To:</b> reviews+D8120+public+d2c9388bc8837c4c@reviews.llvm.org<br><b>Cc:</b> tilmann.scheller@googlemail.com; Nick Lewycky; David Majnemer; Hal Finkel; Philip Reames; James Molloy; mssimpso@codeaurora.org; gberry@codeaurora.org; Commit Messages and Patches for LLVM<br><b>Subject:</b> Re: [PATCH] [GVN] Eliminate redundant loads whose addresses are dependent on the result of a select instruction.<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>Out of curiosity, at what phase do you see these redundancies?<o:p></o:p></p><div><p class=MsoNormal>GCC used to canonicalize on the CFG version and do select formation late specifically to address some of these issues.<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>On Wed, Mar 25, 2015 at 7:20 AM, Chad Rosier <<a href="mailto:mcrosier@codeaurora.org" target="_blank">mcrosier@codeaurora.org</a>> wrote:<o:p></o:p></p><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'><p class=MsoNormal>I've abandoned this patch (originally abandoned on March 16th). Overall, I saw no significant performance improvements across SPEC2K/SPEC2K6 and after further testing there was one significant regression (i.e., 6% on spec2000/vpr), which was the workload I was targeting. While the patch does remove the redundant load, that redundancy was replaced by another set of redundant instructions, fcmp/csel. IIRC, csel instructions are bad for A57 devices in general.<br><br>I agree that a more general solution should be considered and if the new PRE pass can enable that solution, I'm in no rush to push this patch forward.<br><br>Regardless, I do appreciate everyones feedback!<o:p></o:p></p><div><div><p class=MsoNormal style='margin-bottom:12.0pt'><br><br><a href="http://reviews.llvm.org/D8120" target="_blank">http://reviews.llvm.org/D8120</a><br><br>EMAIL PREFERENCES<br> <a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br><br><o:p></o:p></p></div></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></body></html>