<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Menlo;
        panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman",serif;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;
        font-family:"Calibri",sans-serif;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>After trying various ways of undoing the canonicalization in the SROA pass, I think an easier and better approach is simply to be much more conservative when doing the original canonicalization in the first place. My proposal is:<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Courier New";color:#1F497D'>if (std::all_of(LI.user_begin(), LI.user_end(), [&LI](User *U) {<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Courier New";color:#1F497D'>          auto *SI = dyn_cast<StoreInst>(U);<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Courier New";color:#1F497D'>          return SI && SI->getPointerOperand() != &LI &&<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Courier New";color:#1F497D'>                 SI->getPointerOperand()->hasOneUse();<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Courier New";color:#1F497D'>        })) {<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>where the addition is a check that the StoreInst’s pointer operand has only one use. What I’ve found is that when it has multiple uses, this is usually due to the demotion of phis into loads/stores. There are multiple uses because several blocks are using the storage location essentially as a phi. I originally tried to have SROA ignore the cases where the loads are only stored. 
But that really doesn’t work unless you follow all the other places that also load from those same locations, accounting for all the inserted bitcasts. It got way too messy trying to catch all the cases, and it still wasn’t able to undo what needed to be undone. <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I’ve found I can recover nearly all of the regression I see by being more conservative at the canonicalization step.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Would this still cover the cases you originally wrote this canonicalization for, Chandler?<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Daniel<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Chandler Carruth [mailto:chandlerc@google.com] <br><b>Sent:</b> Thursday, April 23, 2015 1:42 PM<br><b>To:</b> Daniel Stewart; Pete Cooper<br><b>Cc:</b> LLVM Developers Mailing List<br><b>Subject:</b> Re: [LLVMdev] RFC: Missing canonicalization in LLVM<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p>FYI, on vacation and then at a conference but will actually look at this on Monday next week.<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>On Thu, Apr 23, 
2015, 18:31 Daniel Stewart <<a href="mailto:stewartd@codeaurora.org">stewartd@codeaurora.org</a>> wrote:<o:p></o:p></p><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I think I can solve this by both adding BitCast checks to the loads, in addition to the Stores, and also checking the Stores to ensure they are fed by a Load and that the Load only feeds it. I’ll test this solution some more and try to make a patch. </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Daniel</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> </span><a href="mailto:llvmdev-bounces@cs.uiuc.edu" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>llvmdev-bounces@cs.uiuc.edu</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> [mailto:</span><a href="mailto:llvmdev-bounces@cs.uiuc.edu" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>llvmdev-bounces@cs.uiuc.edu</span></a><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif'>] <b>On Behalf Of </b>Daniel Stewart<br><b>Sent:</b> Thursday, April 23, 2015 9:17 AM</span><o:p></o:p></p></div></div></div></div><div><div><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><br><b>To:</b> 'Pete Cooper'<br><b>Cc:</b> 'LLVM Developers Mailing List'<br><b>Subject:</b> Re: [LLVMdev] RFC: Missing canonicalization in LLVM</span><o:p></o:p></p></div></div></div></div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Thanks for the reply Pete. Unfortunately, I don’t think it is going to be as simple as ignoring those loads which only store. In findCommonType(), only one alloca is passed in at a time. So, while you could find those cases where that alloca was loaded from and stored elsewhere, you can’t find those places that store to that alloca from somewhere else (at least not easily that I can see). </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>So in my particular example, I could catch the case of only load -> store. However, there are other stores that use the alloca address, but it is not readily apparent if they come directly & only from a load. 
</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Still trying to figure out the best way forward.</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Daniel</span><o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Pete Cooper [</span><a href="mailto:peter_cooper@apple.com" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>mailto:peter_cooper@apple.com</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>] <br><b>Sent:</b> Tuesday, April 21, 2015 2:00 PM<br><b>To:</b> Daniel Stewart<br><b>Cc:</b> LLVM Developers Mailing List; Chandler Carruth<br><b>Subject:</b> Re: [LLVMdev] RFC: Missing canonicalization in LLVM</span><o:p></o:p></p></div></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Hi Daniel<o:p></o:p></p><div><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Thanks for the excellent breakdown of what’s going on here.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Earlier in the thread on this I made this comment:<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><blockquote style='margin-left:30.0pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>“The first thing that springs to mind is that I don’t trust the backend to get this right.  I don’t think it will understand when an i32 load/store would have been preferable to a float one or vice versa.  I have no evidence of this, but given how strongly typed tablegen is, I don’t think it can make a good choice here.<o:p></o:p></p></div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div></div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>So I think we probably need to teach the backend how to undo whatever canonical form we choose if it has a reason to.”<o:p></o:p></p></div></div></blockquote><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Without seeing the machine instructions, it’s hard to be 100% certain, but the case you’ve found may be simple enough that the backend can actually fix this. 
However, such a fixup would be quite target-specific (such a target would need different register classes for integers and doubles in this case), and we’d need such a pass for all targets, which isn’t ideal.<o:p></o:p></p><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>So I wouldn’t rule out a backend solution, but I have a preference for your suggestion to improve SROA.<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>In this particular case, it makes sense for SROA to do effectively the same analysis InstCombine did here and work out when a load is just raw data vs. when its data is used as a specific type.  The relevant piece of InstCombine is this:<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#008400'>// Try to canonicalize loads which are only ever stored to operate over</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>  </span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#008400'>// integers instead of any other type. 
We only do this when the loaded type</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>  </span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#008400'>// is sized and has a size exactly the same as its store size and the store</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>  </span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#008400'>// size is a legal integer type.</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>  <span style='color:#BB2CA2'>if</span> (!Ty-><span style='color:#31595D'>isIntegerTy</span>() && Ty-><span style='color:#31595D'>isSized</span>() &&</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>      DL.</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#31595D'>isLegalInteger</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>(DL.</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#31595D'>getTypeStoreSizeInBits</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>(Ty)) &&</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>      DL.</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#31595D'>getTypeStoreSizeInBits</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>(Ty) == DL.</span><span style='font-size:8.5pt;font-family:"Menlo",serif;color:#31595D'>getTypeSizeInBits</span><span 
style='font-size:8.5pt;font-family:"Menlo",serif;color:black'>(Ty)) {</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>    <span style='color:#BB2CA2'>if</span> (<span style='color:#703DAA'>std</span>::<span style='color:#3D1D81'>all_of</span>(LI.<span style='color:#31595D'>user_begin</span>(), LI.<span style='color:#31595D'>user_end</span>(), [&LI](<span style='color:#4F8187'>User</span> *U) {</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>          <span style='color:#BB2CA2'>auto</span> *SI = <span style='color:#31595D'>dyn_cast</span><<span style='color:#4F8187'>StoreInst</span>>(U);</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>          <span style='color:#BB2CA2'>return</span> SI && SI-><span style='color:#31595D'>getPointerOperand</span>() != &LI;</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>        })) {</span><o:p></o:p></p></div></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:8.5pt;font-family:"Menlo",serif'>...</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>After ignoring loads/stores that satisfy something like the above code, you can always fall back to the current code of choosing an integer type, so in the common case there won’t be any behavior difference.<o:p></o:p></p></div><div><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Cheers<o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Pete<o:p></o:p></p><div><div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>On Apr 21, 2015, at 9:18 AM, Daniel Stewart <<a href="mailto:stewartd@codeaurora.org" target="_blank">stewartd@codeaurora.org</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>So this change did indeed have an effect! </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>:)</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I’m seeing regressions in a number of benchmarks, mainly due to a host of extra bitcasts that get introduced. 
Here’s the problem I’m seeing in a nutshell:</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>1)      There is a Phi with input type double</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>2)      Polly demotes the phi into a load/store of type double</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>3)      InstCombine canonicalizes the load/store to use i64 instead of double</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>4)      SROA removes the load/store & inserts a phi back in, using i64 as the type. 
It inserts a bitcast to get back to double.</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>5)      The bitcasts stick around and eventually get translated into FMOVs (for AArch64 at least).</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>The function findCommonType() in SROA.cpp is used to obtain the type that should be used for the new alloca that SROA wants to create. Its decision process is essentially – if all loads/stores of the alloca use the same type, use that type; otherwise use the corresponding integer type. This causes bitcasts to be inserted in a number of places, nearly all of which stick around. </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>I’ve copied a reduced version of an instance of the problem below. I’m looking for comments on what others think is the right solution here. Make SROA more intelligent about picking the type? 
</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>The code is below with all unnecessary code removed for easy consumption. </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'>Daniel</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>Before </span></u><u><span style='font-size:11.0pt;font-family:"Courier New";color:#212121'>Polly – Prepare code for polly</span></u><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> we have code that looks like:</span></u><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.cond473:                                    ; preds = %while.cond473.outer78, %while.body475</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= phi double [ </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%105</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>, %while.body475 ], [ %p_j_x452.0.ph82, %while.cond473.outer78 ]</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.body475:                                    ; preds = %while.cond473</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %sub480 = fsub fast double %64, </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0</span></b><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%105</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load double* %x485, align 8, !tbaa !25</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> 
</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>After </span></u><u><span style='font-size:11.0pt;font-family:"Courier New";color:#212121'>Polly – Prepare code for polly</span></u><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> we have:</span></u><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.cond473:                                    ; preds = %while.cond473.outer78, %while.body475</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0.reload</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load double* %p_j_x452.0.reg2mem</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.body475:                                    ; preds = %while.cond473</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %sub480 = fsub fast double %64, </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0.reload</span></b><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%110</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load double* %x485, align 8, !tbaa !25</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  store double </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%110</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>, double* %p_j_x452.0.reg2mem</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>After </span></u><u><span style='font-size:11.0pt;font-family:"Courier New";color:#212121'>Combine redundant instructions</span></u><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> :</span></u><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.cond473:                                    ; preds = %while.cond473.outer78, %while.body475</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0.reload</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load double* %p_j_x452.0.reg2mem, align 8</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.body475:                                    ; preds = %while.cond473</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %sub480 = fsub fast double %74, </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%p_j_x452.0.reload</span></b><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %x485 = getelementptr inbounds %struct.CompAtom* %15, i64 %idxprom482, i32 0, i32 0</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> 
 %194 = bitcast double* %x485 to i64*</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%195</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load i64* %194, align 8, !tbaa !25</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %200 = bitcast double* %p_j_x452.0.reg2mem to i64*</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  store i64 </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%195</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>, i64* %200, align 8</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>After </span></u><u><span style='font-size:11.0pt;font-family:"Courier New";color:#212121'>SROA</span></u><u><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> :</span></u><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal 
style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.cond473:                                    ; preds = %while.cond473.outer78, %while.body475</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %p_j_x452.0.reg2mem.sroa.0.0.p_j_x452.0.reload362 = phi i64 [ %p_j_x452.0.ph73.reg2mem.sroa.0.0.load368, %while.cond473.outer78 ], [ </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%178</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>, %while.body475 ]</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%173</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= bitcast i64 %p_j_x452.0.reg2mem.sroa.0.0.p_j_x452.0.reload362 to double</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>while.body475:                                    ; preds = %while.cond473</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %sub480 = fsub fast double %78, </span><b><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#0070C0'>%173</span></b><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %x485 = getelementptr inbounds %struct.CompAtom* %15, i64 %idxprom482, i32 0, i32 0</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  %177 = bitcast double* %x485 to i64*</span><o:p></o:p></p></div><div style='border:none;border-bottom:solid windowtext 1.0pt;padding:0in 0in 1.0pt 0in'><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>  </span><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'>%178</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:red'> </span><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'>= load i64* %177, align 8, !tbaa !25</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#212121'> </span><o:p></o:p></p></div></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D'> </span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'> </span><a href="mailto:llvmdev-bounces@cs.uiuc.edu" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:purple'>llvmdev-bounces@cs.uiuc.edu</span></a><span 
style='font-size:11.0pt;font-family:"Calibri",sans-serif'> [</span><a href="mailto:llvmdev-bounces@cs.uiuc.edu" target="_blank"><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;color:purple'>mailto:llvmdev-bounces@cs.uiuc.edu</span></a><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>] <b>On Behalf Of </b>Chandler Carruth<br><b>Sent:</b> Wednesday, January 21, 2015 8:32 PM<br><b>To:</b> Pete Cooper<br><b>Cc:</b> LLVM Developers Mailing List<br><b>Subject:</b> Re: [LLVMdev] RFC: Missing canonicalization in LLVM</span><o:p></o:p></p></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>On Wed, Jan 21, 2015 at 3:06 PM, Pete Cooper <<a href="mailto:peter_cooper@apple.com" target="_blank"><span style='color:purple'>peter_cooper@apple.com</span></a>> wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt'><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'>Sounds good to me.  Integers it is then.<o:p></o:p></p></div></blockquote></div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><br>FYI, thanks, I'm just going to commit this then. It seems we're all in essential agreement. We can revert it and take a more cautious approach if something terrible happens. 
=]<o:p></o:p></p></div></div></div></div></blockquote></div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'> <o:p></o:p></p></div></div></div></div></blockquote></div></div></body></html>