[llvm-dev] SLP regression on SystemZ
Jonas Paulsson via llvm-dev
llvm-dev at lists.llvm.org
Fri Mar 24 05:25:26 PDT 2017
Hi,
I have come across a major regression caused by SLP vectorization
(+18% on SystemZ, just from enabling SLP). It all comes down to one
particular, very hot loop.
Scalar code:
%conv252 = zext i16 %110 to i64
%conv254 = zext i16 %111 to i64
%sub255 = sub nsw i64 %conv252, %conv254
... repeated
SLP output:
%101 = zext <16 x i16> %100 to <16 x i64>
%104 = zext <16 x i16> %103 to <16 x i64>
%105 = sub nsw <16 x i64> %101, %104
%106 = trunc <16 x i64> %105 to <16 x i32>
; for each element e in 0..15
%107 = extractelement <16 x i32> %106, i32 e
%108 = sext i32 %107 to i64
The vectorized code in this case should only have to be:
%101 = zext <16 x i16> %100 to <16 x i64>
%104 = zext <16 x i16> %103 to <16 x i64>
%105 = sub nsw <16 x i64> %101, %104
; for each element e in 0..15
%107 = extractelement <16 x i64> %105, i32 e
but this is not what happens: instead, for all 16 elements, an extract
*and an extend* are emitted.
I see that there is a special function in the SLP vectorizer that does
this truncation and extract+extend whenever possible. Is that the place
to fix this?
Or would it be better to rely on InstCombiner?
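For reference, here is a minimal single-lane reproducer of the pattern
(hand-reduced, so the function and value names are made up); running it
through e.g. opt -instcombine -S shows whether the trunc/extract/sext
chain already gets folded back into a plain extract from the <16 x i64>
value:

define i64 @extract_extend(<16 x i16> %a, <16 x i16> %b) {
entry:
  ; widen both inputs and subtract, as SLP does
  %za = zext <16 x i16> %a to <16 x i64>
  %zb = zext <16 x i16> %b to <16 x i64>
  %d  = sub nsw <16 x i64> %za, %zb
  ; the redundant narrowing + per-element re-extension
  %t  = trunc <16 x i64> %d to <16 x i32>
  %e  = extractelement <16 x i32> %t, i32 0
  %s  = sext i32 %e to i64
  ret i64 %s
}

(The difference of two zext'ed i16 values always fits in a signed i32,
so the trunc/sext pair is a no-op value-wise; the question is just
whether any pass cleans it up before instruction selection.)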
Is this truncation done by SLP with the assumption that it is free to
extend an extracted element? On SystemZ, this is not true.
/Jonas