<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 11/15/2016 4:22 PM, Pete Couperus
via llvm-dev wrote:<br>
</div>
<blockquote
cite="mid:51BA4D8BA55CE24E8F0B52E920FF5E7215E33750@us01wembx1.internal.synopsys.com"
type="cite">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
<div class="WordSection1">
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Context: We have a backend where v32i1 is a
Legal type, but a v32i1 value does not occupy 32 bits in memory
and is loaded and stored with a different instruction sequence.<o:p></o:p></p>
<p class="MsoNormal">We ran into an issue because
combineLoadToOperationType changed v32i1 loads into i32 loads,
so a sequence like:<o:p></o:p></p>
<p class="MsoNormal">define void @bits(<32 x i1>* %A,
<32 x i1>* %B) {<o:p></o:p></p>
<p class="MsoNormal"> %a = load <32 x i1>, <32 x
i1>* %A<o:p></o:p></p>
<p class="MsoNormal"> store <32 x i1> %a, <32 x
i1>* %B<o:p></o:p></p>
<p class="MsoNormal"> ret void<o:p></o:p></p>
<p class="MsoNormal">}<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Is transformed to:<o:p></o:p></p>
<p class="MsoNormal">define void @bits(<32 x i1>* %A,
<32 x i1>* %B) {<o:p></o:p></p>
<p class="MsoNormal"> %1 = bitcast <32 x i1>* %A to i32*<o:p></o:p></p>
<p class="MsoNormal"> %a1 = load i32, i32* %1, align 4<o:p></o:p></p>
<p class="MsoNormal"> %2 = bitcast <32 x i1>* %B to i32*<o:p></o:p></p>
<p class="MsoNormal"> store i32 %a1, i32* %2, align 4<o:p></o:p></p>
<p class="MsoNormal"> ret void<o:p></o:p></p>
<p class="MsoNormal">}<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This looks to be intentional. <o:p></o:p></p>
<p class="MsoNormal">Is there a way to specify in the
data layout that v32i1 storage is not 32 bits?</p>
</div>
</blockquote>
<br>
No, not at the moment. You could propose something, but you'd
probably have a hard time convincing anyone it's necessary; nobody
has cared about this for a very long time.<br>
<br>
<blockquote
cite="mid:51BA4D8BA55CE24E8F0B52E920FF5E7215E33750@us01wembx1.internal.synopsys.com"
type="cite">
<div class="WordSection1">
<p class="MsoNormal">Absent that, is there any other reliable
way to retain the original vector loads/store without just
disabling this part of InstCombine?</p>
</div>
</blockquote>
<br>
No, and you'll run into other problems (e.g. alias analysis) if the
data layout lies about the size of a load or store.<br>
<br>
<blockquote
cite="mid:51BA4D8BA55CE24E8F0B52E920FF5E7215E33750@us01wembx1.internal.synopsys.com"
type="cite">
<div class="WordSection1">
<p class="MsoNormal">Or is it the backend’s responsibility to
try and work with this?</p>
</div>
</blockquote>
<br>
Where are these loads coming from? x86 without AVX512 doesn't have
any convenient way to generate code for a <32 x i1> store, but it
doesn't matter because frontends don't generate <N x i1> loads
and stores.<br>
<br>
If you have a frontend which is generating loads and stores like
this, you could probably change it to use some other sequence (like
a platform-specific intrinsic, or some sequence involving
sext/trunc).<br>
<br>
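As a minimal sketch of the sext/trunc option, assuming the frontend is
free to keep the mask in memory as one byte per lane (the <32 x i8>
memory type is an assumption for illustration; use whatever layout your
target actually wants):<br>
<pre>define void @bits(<32 x i8>* %A, <32 x i8>* %B) {
  ; Load the mask in its widened in-memory form, one byte per lane.
  %wide = load <32 x i8>, <32 x i8>* %A
  ; Narrow to the i1 vector; trunc keeps the low bit of each lane.
  %a = trunc <32 x i8> %wide to <32 x i1>
  ; ... any uses of %a as a <32 x i1> value go here ...
  ; Widen again before storing; sext maps true to -1 and false to 0.
  %ext = sext <32 x i1> %a to <32 x i8>
  store <32 x i8> %ext, <32 x i8>* %B
  ret void
}</pre>
Here <32 x i1> never appears as the type of a load or store, so the
size the data layout reports for each memory access matches what is
actually accessed.<br>
<br>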
-Eli<br>
<pre class="moz-signature" cols="72">--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>
</body>
</html>