<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:Helvetica;
        panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.apple-converted-space
        {mso-style-name:apple-converted-space;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Very interesting, thanks for the analysis and further reduction! Over-aligning the struct seems like a simple workaround that would have little impact elsewhere.<o:p></o:p></p>
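<p class="MsoNormal">For readers following along, here is a minimal sketch of what over-aligning could look like. The field layout of both structs and the helper names sum_bytes and forward are assumptions made for illustration, not the actual V8 definitions; the sketch only shows that alignas(8) raises the alignment without changing the size. Whether it changes the generated code depends on the compiler version and flags.<o:p></o:p></p>

```cpp
// Hypothetical stand-in for the 8-byte Operand in the repro. The field
// layout is an assumption, chosen only to give size 8 and alignment 1.
struct Operand {
  unsigned char bytes[8];
};

// Same payload, over-aligned to 8 bytes (the workaround discussed above).
struct alignas(8) AlignedOperand {
  unsigned char bytes[8];
};

static_assert(sizeof(Operand) == 8, "payload is 8 bytes");
static_assert(alignof(Operand) == 1, "a byte array has alignment 1");
static_assert(sizeof(AlignedOperand) == 8, "alignas(8) adds no padding here");
static_assert(alignof(AlignedOperand) == 8, "over-aligned to 8 bytes");

// Hypothetical callee that takes the struct by value.
unsigned sum_bytes(AlignedOperand op) {
  unsigned total = 0;
  for (int i = 0; i != 8; ++i) total += op.bytes[i];
  return total;
}

// Receives the struct by value and forwards it by value, mirroring the
// shape of Assembler::emit_arith as described in the thread.
unsigned forward(AlignedOperand op) {
  return sum_bytes(op);
}
```

<p class="MsoNormal">Comparing the -O3 output for Operand and AlignedOperand on an i386 target would be one way to check whether the extra reload disappears.<o:p></o:p></p>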
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> David Zarzycki &lt;dave@znu.io&gt; <br>
<b>Sent:</b> Monday, October 28, 2019 11:23 PM<br>
<b>To:</b> Seth Brenith &lt;Seth.Brenith@microsoft.com&gt;<br>
<b>Cc:</b> llvm-dev@lists.llvm.org<br>
<b>Subject:</b> [EXTERNAL] Re: [llvm-dev] unnecessary reload of 8-byte struct on i386<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This just looks like a temporary stack variable wasn’t properly eliminated because the compiler modeled `Operand` internally as an “i64” which i386 doesn’t natively support. A further reduction, with notes about changes that sidestep the
 bug:<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><a href="https://godbolt.org/z/zcCguv">https://godbolt.org/z/zcCguv</a><o:p></o:p></p>
<div>
<p class="MsoNormal"><br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">On Oct 26, 2019, at 12:28 AM, Seth Brenith via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<o:p></o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">Hello folks,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">I’ve recently been looking at the generated code for a few functions in Chromium while investigating crashes, and I came across a curious pattern. A smallish repro case is available at<span class="apple-converted-space"> </span><a href="https://godbolt.org/z/Dsu1WI"><span style="color:#954F72">https://godbolt.org/z/Dsu1WI</span></a>.
 In that case, the function Assembler::emit_arith receives a struct (Operand) by value and passes it by value to another function. That struct is 8 bytes long, so the -O3 generated code uses movsd to copy it up the stack. However, we end up with some loads
 that aren’t needed, as in the following chunk:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="color:blue">movsd</span><span class="apple-converted-space"> </span><span style="color:#4864AA">xmm0</span>,<span class="apple-converted-space"> </span><span style="color:teal">qword</span><span class="apple-converted-space"> </span><span style="color:teal">ptr</span><span class="apple-converted-space"> </span>[<span style="color:#4864AA">ecx</span>]<span class="apple-converted-space"> </span><span style="color:green">#
 xmm0 = mem[0],zero</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="color:blue">mov</span><span class="apple-converted-space"> </span><span style="color:teal">dword</span><span class="apple-converted-space"> </span><span style="color:teal">ptr</span><span class="apple-converted-space"> </span>[<span style="color:#4864AA">esp</span><span class="apple-converted-space"> </span>+<span class="apple-converted-space"> </span><span style="color:#09885A">24</span>],<span class="apple-converted-space"> </span><span style="color:#4864AA">edx</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="color:blue">movsd</span><span class="apple-converted-space"> </span><span style="color:teal">qword</span><span class="apple-converted-space"> </span><span style="color:teal">ptr</span><span class="apple-converted-space"> </span>[<span style="color:#4864AA">esp</span><span class="apple-converted-space"> </span>+<span class="apple-converted-space"> </span><span style="color:#09885A">40</span>],<span class="apple-converted-space"> </span><span style="color:#4864AA">xmm0</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="color:blue">movsd</span><span class="apple-converted-space"> </span><span style="color:#4864AA">xmm0</span>,<span class="apple-converted-space"> </span><span style="color:teal">qword</span><span class="apple-converted-space"> </span><span style="color:teal">ptr</span><span class="apple-converted-space"> </span>[<span style="color:#4864AA">esp</span><span class="apple-converted-space"> </span>+<span class="apple-converted-space"> </span><span style="color:#09885A">40</span>]<span class="apple-converted-space"> </span><span style="color:green">#
 xmm0 = mem[0],zero</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="color:blue">movsd</span><span class="apple-converted-space"> </span><span style="color:teal">qword</span><span class="apple-converted-space"> </span><span style="color:teal">ptr</span><span class="apple-converted-space"> </span>[<span style="color:#4864AA">esp</span><span class="apple-converted-space"> </span>+<span class="apple-converted-space"> </span><span style="color:#09885A">8</span>],<span class="apple-converted-space"> </span><span style="color:#4864AA">xmm0</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">As far as I can tell, the fourth instruction (the reload of xmm0 from esp+40) has no effect, because xmm0 already holds the value that was just stored there. On its own, that seems like a small missed opportunity for optimization. However, this sequence of instructions also appears to trigger a hardware bug on a small fraction of devices, which
 sometimes end up storing zero at esp+8. A more in-depth discussion of that issue can be found here:<span class="apple-converted-space"> </span><a href="https://bugs.chromium.org/p/v8/issues/detail?id=9774"><span style="color:#954F72">https://bugs.chromium.org/p/v8/issues/detail?id=9774</span></a>.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">I’m hoping that getting rid of the second load in the sequence above would appease these misbehaving machines (though of course I don’t know that it would), as well as making the code a little smaller for everybody else. Does that sound
 like a reasonable idea? Would LLVM be interested in a patch related to eliminating reloads like this? Does anybody have advice about where I should start looking, or any reasons it would be very hard to achieve the result I’m hoping for?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Thanks,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Seth<o:p></o:p></p>
</div>
<p class="MsoNormal"><span style="font-size:9.0pt;font-family:'Helvetica',sans-serif">_______________________________________________<br>
LLVM Developers mailing list<br>
</span><a href="mailto:llvm-dev@lists.llvm.org"><span style="font-size:9.0pt;font-family:'Helvetica',sans-serif;color:#954F72">llvm-dev@lists.llvm.org</span></a><span style="font-size:9.0pt;font-family:'Helvetica',sans-serif"><br>
</span><a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev"><span style="font-size:9.0pt;font-family:'Helvetica',sans-serif;color:#954F72">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</span></a><o:p></o:p></p>
</div>
</blockquote>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
</body>
</html>