<div dir="ltr"><div>Hi Jonas,</div><div><br></div><div>The vectorizers do attempt to type-shrink elements where possible, to pack more data into vectors. It looks like that's what's happening here. This transformation is cost-modeled, but there are assumptions made about what InstCombine will be able to clean up. Would you mind filing a bug with a test case that we can take a look at?</div><div><br></div><div>-- Matt</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 24, 2017 at 8:25 AM, Jonas Paulsson via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>Hi,<br>
<br>
I have come across a major regression resulting from SLP
vectorization (+18% on SystemZ, just from enabling SLP). It all
relates to one particular very hot loop.<br>
<br>
Scalar code:<br>
%conv252 = zext i16 %110 to i64<br>
%conv254 = zext i16 %111 to i64<br>
%sub255 = sub nsw i64 %conv252, %conv254<br>
... repeated<br>
<br>
SLP output:<br>
%101 = zext <16 x i16> %100 to <16 x i64><br>
%104 = zext <16 x i16> %103 to <16 x i64><br>
%105 = sub nsw <16 x i64> %101, %104<br>
%106 = trunc <16 x i64> %105 to <16 x i32><br>
<i>for each element e 0:15</i><br>
%107 = extractelement <16 x i32> %106, i32 e<br>
%108 = sext i32 %107 to i64<br>
<br>
The vectorized code should in this case only have to be<br>
<br>
%101 = zext <16 x i16> %100 to <16 x i64><br>
%104 = zext <16 x i16> %103 to <16 x i64><br>
%105 = sub nsw <16 x i64> %101, %104<br>
<i>for each element e 0:15</i><br>
%107 = extractelement <16 x i64> %105, i32 e<br>
<br>
but this is not handled, so extracts *and extends* are done
for all 16 elements.<br>
<br>
I see that there is a special function in the SLP vectorizer that does
this truncation and extract+extend whenever possible. Is this the
place to fix this?<br>
<br>
Or would it be better to rely on InstCombiner?<br>
<br>
Is this truncation done by SLP with the assumption that it is free
to extend an extracted element? On SystemZ, this is not true.<span class="HOEnZb"><font color="#888888"><br>
<br>
/Jonas<br>
<br>
</font></span></p>
</div>
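<div>As a rough illustration of why the trunc followed by extract+sext above is value-preserving (and so legal, even if not profitable), here is a minimal Python sketch. It models one lane only. The helper names (<code>trunc</code>, <code>sext</code>, <code>sub_i64</code>, <code>wide_path</code>, <code>shrunk_path</code>) are hypothetical stand-ins for the corresponding IR operations, not anything from the SLP vectorizer. It assumes both inputs are i16 values that were zero-extended, so the difference always fits in 32 signed bits.</div>
<pre>
```python
def trunc(value, bits):
    """Keep the low `bits` bits, modelling an IR trunc."""
    return value % (2 ** bits)


def sext(value, from_bits):
    """Read a `from_bits`-wide bit pattern as a signed integer, modelling sext."""
    value = value % (2 ** from_bits)
    if value >= 2 ** (from_bits - 1):
        return value - 2 ** from_bits
    return value


def sub_i64(a, b):
    """64-bit wrapping subtraction; returns the raw bit pattern."""
    return (a - b) % (2 ** 64)


def wide_path(a, b):
    # zext i16 -> i64 is the identity on Python ints in range(2**16),
    # so the inputs are used directly; the result is read as a signed i64.
    return sext(sub_i64(a, b), 64)


def shrunk_path(a, b):
    # Same subtraction, then trunc to i32 and sext back to i64 per element.
    narrow = trunc(sub_i64(a, b), 32)
    return sext(narrow, 32)


# A few i16 bit patterns, including the extremes.
samples = [0, 1, 255, 32767, 32768, 65535]
mismatches = [
    (a, b)
    for a in samples
    for b in samples
    if wide_path(a, b) != shrunk_path(a, b)
]
print(mismatches)
```
</pre>
<div>Both paths agree on these samples, so the shrunk form computes the same values. The difference is only in cost: the shrunk form adds one sext per extracted element, which is the overhead described above.</div>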
<br></blockquote></div><br></div>