<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Hi Matt,</p>
<p>thanks for taking a look, please see
<a class="moz-txt-link-freetext" href="https://bugs.llvm.org//show_bug.cgi?id=32406">https://bugs.llvm.org//show_bug.cgi?id=32406</a>.</p>
<p>/Jonas</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 2017-03-24 15:10, Matthew Simpson
wrote:<br>
</div>
<blockquote
cite="mid:CAN5HOih+Q2H-6Bmv15dDn62skPbAXEDOkEr=sU=GiL76UK75Sg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>Hi Jonas,</div>
<div><br>
</div>
<div>The vectorizers do attempt to type-shrink elements if
possible to pack more data into vectors. It looks like that's
what's happening here. This transformation is cost-modeled,
but there are assumptions made about what InstCombine will be
able to clean up. Would you mind filing a bug with a test
case that we can take a look at?</div>
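<div><br>
</div>
<div>As a rough illustration of why this kind of type-shrinking can be
legal (a sketch of the arithmetic only, not of the vectorizer's actual
analysis; the helper names are hypothetical): the difference of two
zero-extended i16 values lies in [-65535, 65535], so it survives a
trunc to i32 followed by a sext, but not a trunc to i16.</div>
<pre>
```python
# Sketch of the arithmetic only; helper names are hypothetical and this is
# not how the SLP vectorizer itself decides on a narrower type.

def trunc(value, bits):
    """Keep the low `bits` bits of value, like an LLVM trunc."""
    return value % (2 ** bits)

def sext(value, bits):
    """Read the low `bits` bits of value as a signed integer, like a sext."""
    sign_bit = 2 ** (bits - 1)
    return (trunc(value, bits) ^ sign_bit) - sign_bit

def wide_sub(a, b):
    """zext both i16 inputs to i64, then subtract (the scalar code)."""
    return sext(trunc(a, 16) - trunc(b, 16), 64)

def shrunk_sub(a, b, bits):
    """Same subtraction, narrowed to `bits` bits and sign-extended back."""
    return sext(trunc(wide_sub(a, b), bits), bits)

def narrowing_is_exact(bits):
    """Check the narrowed result against the i64 result on boundary inputs."""
    samples = [0, 1, 255, 32767, 32768, 65534, 65535]
    return all(shrunk_sub(a, b, bits) == wide_sub(a, b)
               for a in samples for b in samples)

print(narrowing_is_exact(32))  # narrowing to i32 keeps every sampled result
print(narrowing_is_exact(16))  # narrowing to i16 loses the extreme results
```
</pre>
<div>The narrowed result is exact at i32, so the open question is cost
rather than correctness: the narrowing only pays off if the
per-element extends it introduces are cheap on the target.</div>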
<div><br>
</div>
<div>-- Matt</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Mar 24, 2017 at 8:25 AM, Jonas
Paulsson via llvm-dev <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>Hi,<br>
<br>
I have come across a major regression that appears after
SLP vectorization (+18% on SystemZ, just from enabling
SLP). It all comes down to one particular, very hot loop.<br>
<br>
Scalar code:<br>
%conv252 = zext i16 %110 to i64<br>
%conv254 = zext i16 %111 to i64<br>
%sub255 = sub nsw i64 %conv252, %conv254<br>
... repeated<br>
<br>
SLP output:<br>
%101 = zext &lt;16 x i16&gt; %100 to &lt;16 x i64&gt;<br>
%104 = zext &lt;16 x i16&gt; %103 to &lt;16 x i64&gt;<br>
%105 = sub nsw &lt;16 x i64&gt; %101, %104<br>
%106 = trunc &lt;16 x i64&gt; %105 to &lt;16 x i32&gt;<br>
<i>for each element e 0:15</i><br>
%107 = extractelement &lt;16 x i32&gt; %106, i32 e<br>
%108 = sext i32 %107 to i64<br>
<br>
In this case the vectorized code should only need to be<br>
<br>
%101 = zext &lt;16 x i16&gt; %100 to &lt;16 x i64&gt;<br>
%104 = zext &lt;16 x i16&gt; %103 to &lt;16 x i64&gt;<br>
%105 = sub nsw &lt;16 x i64&gt; %101, %104<br>
<i>for each element e 0:15</i><br>
%107 = extractelement &lt;16 x i64&gt; %105, i32 e<br>
<br>
but this case is not handled, so an extract *and an extend*
are emitted for each of the 16 elements.<br>
<br>
I see that there is a special function in the SLP vectorizer
that performs this truncation and extract+extend whenever
possible. Is that the right place to fix this?<br>
<br>
Or would it be better to rely on InstCombiner?<br>
<br>
Does SLP perform this truncation on the assumption that
extending an extracted element is free? On SystemZ,
that is not the case.<br>
<br>
/Jonas<br>
<br>
</p>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>