[llvm-dev] PSLP: Padded SLP Automatic Vectorization

Matt P. Dziubinski via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 6 04:49:20 PDT 2020


Hi,

Thank you for your reply, this is very informative.
Good to know about the VW-SLP & PSLP trade-offs; I appreciate your 
explanation!

Best,
Matt

On 10/2/2020 22:57, Vasileios Porpodas wrote:
> Hi,
> 
> Some of the PSLP functionality was added to the SLP vectorizer with the
> support for vectorizing alternate sequences several years ago. So some
> of the most common patterns, like bundles of alternating add-subs, are
> already handled by the current SLP pass. However, it won't do the more
> sophisticated padding across graphs that PSLP does.
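> 
> For instance, the current pass can vectorize an alternating sub/add
> sequence like this (a minimal sketch in C; the function and array
> names are made up):
> 
>   void subadd4(float *restrict a, float *restrict b,
>                float *restrict c) {
>     c[0] = a[0] - b[0];
>     c[1] = a[1] + b[1];
>     c[2] = a[2] - b[2];
>     c[3] = a[3] + b[3];
>   }
> 
> by emitting one vector sub, one vector add, and a shuffle that picks
> the sub lanes {0,2} and the add lanes {1,3}; on x86 this can lower
> to a single addsubps.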
> 
> There is indeed some overlap between VW-SLP and PSLP since they are both
> trying to form isomorphic graphs out of non-isomorphic code. But each
> one works better in different cases, so they complement each other.
> 
> For example, let's say you are in the middle of vectorizing 4-wide and
> you encounter opcodes like this: [A, A, B, B].
> Then you could vectorize this by either:
> 1. Padding the lanes and emitting shuffles (PSLP-style), or
> 2. Falling back to 2-way vectorization, as in VW-SLP.
> Which one is better depends on the rest of the code: if it is 4-wide
> isomorphic, the PSLP option is better; if it is 2-wide isomorphic,
> the VW-SLP option is better.
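> 
> Concretely, suppose the four lanes compute something like this (a
> minimal sketch in C; the function and array names are made up):
> 
>   void f(float *restrict x, float *restrict y, float *restrict r) {
>     r[0] = x[0] + y[0];  /* A */
>     r[1] = x[1] + y[1];  /* A */
>     r[2] = x[2] * y[2];  /* B */
>     r[3] = x[3] * y[3];  /* B */
>   }
> 
> With option 1 you would emit a 4-wide add and a 4-wide mul, then
> blend lanes {0,1} of the add with lanes {2,3} of the mul; with
> option 2 you would emit a 2-wide add for lanes {0,1} and a 2-wide
> mul for lanes {2,3}.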
> 
> As mentioned in the paper, the current SLP pass can change the
> vector-width but it cannot do so in all cases. The main focus of the
> VW-SLP paper is to highlight some of these cases and show a design for
> the SLP pass where width switching is done in a more complete way.
> 
> Performance fluctuates a lot as the compiler changes, so it is hard to
> tell what to expect from applying either of these techniques to the
> current snapshot.
> 
> Regards,
> Vasileios
> 
> "Matt P. Dziubinski" <matdzb at gmail.com> writes:
> 
>> On 9/29/2020 14:37, David Chisnall via llvm-dev wrote:
>>> On 28/09/2020 15:45, Matt P. Dziubinski via llvm-dev wrote:
>>>> Hey, I noticed this talk from EuroLLVM 2015
>>>> (https://llvm.org/devmtg/2015-04/slides/pslp_slides_EUROLLVM2015.pdf)
>>>> on the PSLP vectorization algorithm (CGO 2015 paper:
>>>> http://vporpo.me/papers/pslp_cgo2015.pdf).
>>>>
>>>> Is anyone working on implementing it?
>>>>
>>>> If so, are there Phab reviews I can subscribe to?
>>>
>>> The CGO paper was based on a very old LLVM and the last I heard, moving
>>> the transform to a newer LLVM and rerunning the benchmarks made the
>>> speedups go away.  It's not clear what the cause of this was and the
>>> team responsible didn't have the time to do any root cause analysis.
>>
>> Thank you for the reply; interesting!
>>
>> Incidentally, would you happen to know whether VW-SLP fares any better?
>>
>> I'm referring to "VW-SLP: Auto-Vectorization with Adaptive Vector Width"
>> from PACT 2018 (http://vporpo.me/papers/vwslp_pact2018.pdf; also
>> presented as "Extending the SLP vectorizer to support variable vector
>> widths" at the 2018 LLVM Developers’ Meeting,
>> http://llvm.org/devmtg/2018-10/).
>>
>> I'm wondering whether it subsumes PSLP or whether there are areas where
>> PSLP still works (or worked) better. The (brief) comparison in the
>> paper suggests that VW-SLP addresses the problem in a more general
>> manner:
>>
>> "The widely used bottom-up SLP algorithm has been improved in several
>> ways. Porpodas et al. [32] propose PSLP, a technique that pads the
>> scalar code with redundant instructions, to convert non-isomorphic
>> instruction sequences into isomorphic ones, thus extending the
>> applicability of SLP. Just like VW-SLP-S, PSLP can vectorize code when
>> some of the lanes differ, but it is most effective when the
>> non-isomorphic parts are only a short section of the instruction chain.
>> VW-SLP, on the other hand, works even if the chain never converges."
>>
>> Best,
>> Matt


