[llvm-dev] Extending SLP Vectorizer to deal with aggregates?
Robison, Arch via llvm-dev
llvm-dev at lists.llvm.org
Wed Oct 14 13:12:10 PDT 2015
Will the cost model used by SLPVectorizer shoot down unaligned loads/stores on targets that don't support them (or support them slowly)?
I started working out the details of my design, and for a first round I think it's safest to address only cases where the leaf of the chain is an array load that can be rewritten as an unaligned vector load. That is, similar to the CanReuseExtract idea in SLPVectorizer, but further constrained so that the extracted elements come from an array load.
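To make the restricted case concrete, here is a rough source-level analogy rather than actual SLPVectorizer code. The names scale_scalar, scale_vector and v4f are made up for illustration, and the vector type relies on the GCC/Clang vector_size extension. The memcpy calls stand in for vector loads/stores that assume no alignment beyond that of float.

```cpp
#include <cstring>

// Hypothetical illustration only; v4f uses the GCC/Clang vector_size extension.
typedef float v4f __attribute__((vector_size(16)));
static_assert(sizeof(v4f) == 4 * sizeof(float), "v4f must hold four floats");

// Scalar form: the leaf of every chain is an element of the same
// four-float array load.
inline void scale_scalar(const float *src, float k, float *dst) {
    for (int i = 0; i < 4; ++i)
        dst[i] = src[i] * k;   // four independent load / multiply / store chains
}

// Vector form: the four element loads become one vector load.
// memcpy makes no alignment assumption, so src and dst may be misaligned
// with respect to the 16-byte vector type.
inline void scale_vector(const float *src, float k, float *dst) {
    v4f v;
    std::memcpy(&v, src, sizeof v);   // unaligned vector load
    v4f kk = {k, k, k, k};            // splat the scalar
    v = v * kk;                       // one vector multiply
    std::memcpy(dst, &v, sizeof v);   // unaligned vector store
}
```

Calling both functions on a pointer that is only 4-byte aligned (for example, one float past the start of a 16-byte aligned buffer) should give the same four results, which is the equivalence the rewrite would depend on.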
From: Renato Golin [mailto:renato.golin at linaro.org]
Sent: Wednesday, October 14, 2015 11:31 AM
To: Robison, Arch
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Extending SLP Vectorizer to deal with aggregates?
On 14 October 2015 at 16:40, Robison, Arch via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 1. Identify stores of arrays. E.g. “store [4 x float]”.
> 2. Walk chains backwards from the array stores, similar to the way SLPVectorizer already walks chains.
> 3. If vectorization is applicable, replace array construction/load/store with vector construction/load/store. Vector loads/stores will be unaligned.
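As a sketch of what the three steps above amount to at the source level, the pair of functions below shows a before/after for a store-rooted chain. This is not the IR from the thread and not vectorizer code. The names muladd_scalar, muladd_vector and v4f are hypothetical, std::array<float, 4> stands in for an IR value of type [4 x float], and the vector type assumes the GCC/Clang vector_size extension.

```cpp
#include <array>
#include <cstring>

// Hypothetical illustration only; v4f uses the GCC/Clang vector_size extension.
typedef float v4f __attribute__((vector_size(16)));
using Arr4 = std::array<float, 4>;   // stand-in for [4 x float]

// Before: the result array is constructed one element at a time and then
// stored as a whole. Walking backwards from that array store finds four
// isomorphic multiply-add chains, each ending in an element of a, b and c.
inline Arr4 muladd_scalar(const Arr4 &a, const Arr4 &b, const Arr4 &c) {
    Arr4 r;
    r[0] = a[0] * b[0] + c[0];
    r[1] = a[1] * b[1] + c[1];
    r[2] = a[2] * b[2] + c[2];
    r[3] = a[3] * b[3] + c[3];
    return r;
}

// After: array construction/load/store replaced by vector construction/
// load/store. memcpy models loads and stores with no extra alignment
// guarantee, since Arr4 is only aligned like float.
inline Arr4 muladd_vector(const Arr4 &a, const Arr4 &b, const Arr4 &c) {
    v4f va, vb, vc;
    std::memcpy(&va, a.data(), sizeof va);   // three unaligned vector loads
    std::memcpy(&vb, b.data(), sizeof vb);
    std::memcpy(&vc, c.data(), sizeof vc);
    v4f vr = va * vb + vc;                   // one vector multiply-add
    Arr4 r;
    std::memcpy(r.data(), &vr, sizeof vr);   // one unaligned vector store
    return r;
}
```

Whether the vector form is actually profitable is a separate question: it depends on how the target handles unaligned accesses, which this sketch does not model.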
I like this idea, too. But unaligned access may need to be constrained for the targets that support it. Not a big deal.
Another potential problem is the cost model. Right now, it's tuned so that scalar vs. vector loads and shuffles come out "sensible", but array patterns are slightly different. With luck, you can add GEP costs to balance individual allocas.
That IR you sent could be reduced to a few vector loads, one mul-add, one vector store. :)
IIRC, AVX has the indirect shuffle, so this could even reduce the number of loads.
Seems like a good candidate for vectorizing!