<div dir="ltr"><div>I have no idea how well this code was tested, but some researchers already wrote an LLVM pass that does exactly what is requested:</div><div><a href="https://github.com/revec/llvm-revec">https://github.com/revec/llvm-revec</a></div><div>See the linked paper that they published for their experimental results.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 1, 2020 at 11:51 AM Florian Hahn via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

<br>

> On Sep 1, 2020, at 16:23, Alexandre Bique <<a href="mailto:bique.alexandre@gmail.com" target="_blank">bique.alexandre@gmail.com</a>> wrote:<br>

> <br>

> On Tue, Sep 1, 2020 at 5:10 PM Florian Hahn <<a href="mailto:florian_hahn@apple.com" target="_blank">florian_hahn@apple.com</a>> wrote:<br>

>> The loop vectorizer does not really handle loops that already operate on vectors, so that is why the loop using v4f32 does not get widened.<br>

>> <br>

>> Arguably the user explicitly asked for 4xfloat vectors in the v4f32 version, so that is what gets generated.<br>

> <br>

> In my case I have tons of legacy code written for SSE2 and if the<br>

> compiler can make a better and correct version of it, why not?<br>

<br>

Right, that’s also a reasonable argument. <br>

<br>

Cases like those should get unrolled sufficiently already, and all that needs to be done is combining multiple instructions into their wider versions. This can probably be done relatively easily in either the SLP vectorizer [1] or VectorCombine [2].<br>

<br>

[1] <a href="https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp" rel="noreferrer" target="_blank">https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp</a><br>

[2] <a href="https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Vectorize/VectorCombine.cp" rel="noreferrer" target="_blank">https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Vectorize/VectorCombine.cp</a><br>
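
<br>

As a rough illustration of the widening idea above, here is a minimal sketch in plain C++. It is not LLVM code: std::array is used as a hypothetical stand-in for 4-lane and 8-lane float vectors, and all names (V4, V8, vadd, concat, add_narrow, add_wide) are made up for this example. It only shows that two adjacent 4-wide adds from an unrolled loop compute the same lanes as a single 8-wide add.<br>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for 128-bit and 256-bit float vectors.
using V4 = std::array<float, 4>;
using V8 = std::array<float, 8>;

// Lane-wise add of N floats; models one vector add instruction.
template <std::size_t N>
std::array<float, N> vadd(const std::array<float, N>& a,
                          const std::array<float, N>& b) {
    std::array<float, N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
    return r;
}

// Concatenate two 4-lane vectors into one 8-lane vector.
V8 concat(const V4& lo, const V4& hi) {
    V8 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = lo[i];
        r[i + 4] = hi[i];
    }
    return r;
}

// Unrolled legacy form: two independent 4-wide adds.
V8 add_narrow(const V4& a0, const V4& a1, const V4& b0, const V4& b1) {
    return concat(vadd(a0, b0), vadd(a1, b1));
}

// Widened form: the same work expressed as one 8-wide add.
V8 add_wide(const V4& a0, const V4& a1, const V4& b0, const V4& b1) {
    return vadd(concat(a0, a1), concat(b0, b1));
}

// Checks that both forms produce identical lanes for one sample input.
bool forms_agree() {
    const V4 a0{1.0f, 2.0f, 3.0f, 4.0f}, a1{5.0f, 6.0f, 7.0f, 8.0f};
    const V4 b0{10.0f, 20.0f, 30.0f, 40.0f}, b1{50.0f, 60.0f, 70.0f, 80.0f};
    return add_narrow(a0, a1, b0, b1) == add_wide(a0, a1, b0, b1);
}
```

Because the two forms agree lane by lane, a pass that sees the narrow pattern on adjacent data could in principle emit the wide one, provided the target has the wider registers and the operands are adjacent.<br>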

<br>

>> (Those kinds of issues are better to discuss on <a href="https://bugs.llvm.org/" rel="noreferrer" target="_blank">https://bugs.llvm.org/</a> IMO, because it is easier to keep track of the progress on the issue).<br>

> <br>

> That is noted, but I can't think of it as a bug unless I understand the issue.<br>

<br>

I would not say it is a bug, but rather a missing transform. The naming of <a href="https://bugs.llvm.org/" rel="noreferrer" target="_blank">https://bugs.llvm.org/</a> might imply that it is only for bugs, but in practice it is used to collect missed optimizations, suggestions for new features and so on as well.<br>

<br>

Cheers,<br>

Florian<br>


</blockquote></div>