LoopVectorizer: decision when no vector instructions generated has changed

Tue May 30 07:47:07 PDT 2017

Hi Jonas,

Sorry again for the late reply. The change was intentional, and I can remove the comment when I reapply the patch (after fixing the bug you and others uncovered). Thanks!

I was seeing issues similar to those described in the original patch that introduced this check (r264904). The vectorizer was effectively unrolling loops by MaxVF, which contradicted the interleaver heuristic. Since scalarized memory instructions aren't used to set MaxVF, it was unrolling by a factor of 16. I think it's better to let the interleaver decide what to do with these loops, since nothing is actually being vectorized. This behavior probably crept in over time as we began scalarizing more instructions.

-- Matt

-----Original Message-----
From: Jonas Paulsson [mailto:paulsson at linux.vnet.ibm.com] 
Sent: Friday, May 26, 2017 8:27 AM
To: Matthew Simpson <mssimpso at codeaurora.org>
Cc: Jonas Paulsson via llvm-commits <llvm-commits at lists.llvm.org>
Subject: LoopVectorizer: decision when no vector instructions generated has changed

Hi,

I find that the LoopVectorizer's behaviour has now changed regarding "Not considering vector loop of width X because it will not generate any vector instructions."

During my previous experiments with the LoopVectorizer, I found that it would actually vectorize a loop with no resulting vectorized instructions in the case where there is a memory access. I thought at first this was wrong and tried changing it, only to find that this caused hundreds of loops to no longer be vectorized. Since those loops looked perhaps better vectorized, and this was such a big change, I concluded that it was probably on purpose, and later suggested a comment during review of that patch:

// Note: Even if all instructions are scalarized, return true if any memory
// accesses appear in the loop to get benefits from address folding etc.

I thought that most of those changed loops seemed somewhat better when vectorized (fully scalarized), giving a kind of unrolling effect. However, I wouldn't really argue that this is needed.

So, first, I just want to point out that this affects hundreds of loops, in case this was changed unintentionally. Second, if the earlier behaviour was not on purpose and this really shouldn't be done, I guess the comment I added should just be removed.

/Jonas