<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Jul 18, 2012, at 1:40 PM, Chandler Carruth <<a href="mailto:chandlerc@google.com">chandlerc@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">On Wed, Jul 18, 2012 at 1:20 PM, Jakob Stoklund Olesen <span dir="ltr"><<a href="mailto:stoklund@2pi.dk" target="_blank" class="cremed">stoklund@2pi.dk</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div><div class="h5"><div>On Jul 18, 2012, at 1:12 PM, Andrew Trick <<a href="mailto:atrick@apple.com" target="_blank" class="cremed">atrick@apple.com</a>> wrote:</div>
<br><blockquote type="cite"><div style="word-wrap:break-word"><div><div>On Jul 17, 2012, at 5:23 PM, Jakob Stoklund Olesen <<a href="mailto:stoklund@2pi.dk" target="_blank" class="cremed">stoklund@2pi.dk</a>> wrote:</div>
<blockquote type="cite"><span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+/* Allow mul4 to be inlined into wrap_mul4. This actually enables further</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+ * optimizations. */</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+__attribute__((__noinline__))</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+void wrap_mul4(double *Out, const double A[4][4], const double B[4][4])</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+{</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+ mul4(Out, A, B);</span><br style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
<span style="font-family:Helvetica;font-size:medium;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:-webkit-auto;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;display:inline!important;float:none">+}</span></blockquote>
</div><br><div>This is not obvious to me. Can you explain?</div></div></blockquote><div><br></div></div></div><div>First mul4() is optimized. Then it is inlined into wrap_mul4 and optimized again.</div><div><br></div><div>
The second pass somehow tickles SROA in a way that causes it to turn the whole double[16] array into an i1024.</div><div><br></div><div>That doesn't happen without the extra wrapper function. See also <a href="http://llvm.org/pr13392" target="_blank" class="cremed">http://llvm.org/pr13392</a></div>
</div></div></blockquote><div><br></div><div>Ok, I got confused by the comment and your explanation at first, but reading the bug: with this wrapper, an extra optimization occurs that actually turns out to hurt ARM codegen. That's what the PR is about, making this "extra" optimization not actual hurt ARM codegen, right?</div>
<div><br></div><div>So what is the comment in the code about? Is this extra optimization actually helping on other platforms, making the wrapper a positive effect on performance? Or is the comment just inaccurate?</div><div>
<br></div><div>If adding this wrapper actually improves performance (with a fixed PR13392 or on a platform where that doesn't happen), then *that* is the inliner bug I'd like to know about. =]</div></div></div>
</blockquote></div><br><div>FWIW, I ran into exactly the same problem a few weeks back, totally different code. SROA has some interesting pass order problems. Subsequent optimizations may do store-load forwarding exposing more SROA next time 'round.</div><div><br></div><div>-Andy</div></body></html>