[llvm-dev] enabling interleaved access loop vectorization

Shahid, Asghar-ahmad via llvm-dev llvm-dev at lists.llvm.org
Sat Aug 6 02:12:55 PDT 2016


Two things emerged from this whole discussion:

1.       The current cost model does not account for InstCombine folding chains of “extracts” and “inserts”, so improving it will move us in a positive direction.

2.       The other is the need to understand the issues seen with interleaved access enabled. Besides helping us understand the performance behavior of the vectorizer, this will also help improve the vectorizer infrastructure by properly incorporating Ashutosh’s patch.

The vectorizer being a complex component of the compiler, making small incremental improvements and picking the low-hanging fruit is a good approach, considering the testing and performance analysis involved.
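[Editor's note: a minimal sketch of the stride-2 (interleaved) access pattern this thread is about; the function name and values are illustrative, not from the discussion. With interleaved access lowering enabled, the loop vectorizer can cover a[2*i] and a[2*i+1] with wide loads plus shuffles, rather than scalar loads or gathers; the cost of the resulting shuffle/extract/insert sequences is exactly what the cost-model point above concerns.]

```c
#include <stddef.h>

/* Two logically separate streams interleaved in one array: the even
 * elements a[2*i] and the odd elements a[2*i+1]. Vectorizing this loop
 * requires de-interleaving, i.e. wide loads followed by shuffles. */
void sum_pairs(const int *a, int *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[2 * i] + a[2 * i + 1];
}
```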

Regards,
Shahid

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Michael Kuperstein via llvm-dev
Sent: Saturday, August 06, 2016 5:26 AM
To: Renato Golin
Cc: Matthew Simpson; llvm-dev
Subject: Re: [llvm-dev] enabling interleaved access loop vectorization



On Fri, Aug 5, 2016 at 4:37 PM, Renato Golin <renato.golin at linaro.org> wrote:
On 6 August 2016 at 00:18, Michael Kuperstein <mkuper at google.com> wrote:
> I agree that we can get *more* improvement with better cost modeling, but
> I'd expect to be able to get *some* improvement the way things are right
> now.

Elena said she saw "some" improvements. :)

I didn't mean "some improvements, some regressions", I meant "some of the improvement we'd expect from the full solution". :-)


> That's why I'm curious about where we saw regressions - I'm wondering
> whether there's really a significant cost modeling issue I'm missing, or
> it's something that's easy to fix so that we can make forward progress,
> while Ashutosh is working on the longer-term solution.

Sounds like a task to try a few patterns and fiddle with the cost model.

Arnold did a lot of those during the first months of the vectorizer,
so it might be just a matter of finding the right heuristics, at least
for the low-hanging fruit.

Of course, that'd also involve benchmarking everything else, to make
sure the new heuristics don't introduce regressions on
non-interleaved vectorisation.

I don't disagree with you.

All I'm saying is that before fiddling with the heuristics, it'd be good to understand what exactly breaks if we simply flip the flag. If the answer happens to be "nothing" - well, problem solved. Unfortunately, according to Elena, that's not the answer.
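[Editor's note: "flipping the flag" can be tried directly from the command line; a sketch, assuming the internal option is still named -enable-interleaved-mem-accesses as in LLVM trunk of this era, and with an illustrative file name.]

```shell
# Force interleaved access lowering on in the loop vectorizer via
# clang's -mllvm passthrough, and emit IR to inspect the result.
clang -O2 -mllvm -enable-interleaved-mem-accesses=true -S -emit-llvm pairs.c -o pairs.ll
```

Comparing the emitted IR with and without the flag shows whether the vectorizer chose wide loads plus shufflevectors for the strided accesses.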
I'm going to play with it with our internal benchmarks, but it's my understanding that Elena/Ayal already have some idea of what the problems are.

