[LLVMdev] Extend SLPVectorizer to struct operations that are isomorphic to vector operations?

Thu Apr 17 20:08:46 PDT 2014

Thanks for the information about SROA.  It was missing from the Julia pass list, though adding it didn’t help a larger example where the structs were being loaded from and stored to memory.  I’ll have to poke around to figure out what scared it off (or maybe I misplaced the SROA pass).

- Arch

From: Chandler Carruth [mailto:chandlerc at google.com]
Sent: Thursday, April 17, 2014 6:11 PM
To: Nadav Rotem
Cc: Robison, Arch; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Extend SLPVectorizer to struct operations that are isomorphic to vector operations?

On Thu, Apr 17, 2014 at 4:03 PM, Nadav Rotem <nrotem at apple.com> wrote:
On Apr 17, 2014, at 3:41 PM, Robison, Arch <arch.robison at intel.com> wrote:

> While playing with SLPVectorizer, I noticed that it will happily vectorize cases involving extractelement/insertelement, but won't vectorize isomorphic cases involving extractvalue/insertvalue (such as the attached example).  Is that something that could be straightforward to add to SLPVectorizer, or are there some hard issues?  In particular, the transformation would seem to require casts of structures to vectors and back, but the bitcast instruction requires a non-aggregate value.
>
> I'm thinking such vectorization might be useful for codes that use structs for tuples, like (x,y,z) coordinates or complex numbers.

Vectorization of struct values is not supported because it is not something we considered until now. It never showed up in any workload I looked at. It should not be too difficult to implement. We already insert casts when we vectorize loads and stores from memory.

So, the first thing to understand (for Arch who may not have this context) is that almost all insertvalue/extractvalue instructions should be optimized out of the IR long before any vectorizer sees it. The SROA pass completely removes these instructions.

The only time they are likely to show up is when forming (or decomposing) aggregates passed or returned by value at the LLVM IR level due to ABI concerns. It would indeed be nice if the SLPVectorizer could vectorize through these so that we end up with vector code and a tiny scalar peel right at the ABI boundary where we need to arrange the elements into specific registers.
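
For readers without the attached example, here is a rough C sketch (not from the thread) of the kind of code under discussion. The type `pair_t` and both function names are hypothetical. Whether a given frontend lowers the by-value struct arguments and return value to insertvalue/extractvalue depends on the frontend and the target ABI, so treat that part as an assumption. The second function only mimics by hand the shape described above: a small scalar unpack and repack at the call boundary, with the lane-wise work in between.

```c
/* Hypothetical two-element tuple, e.g. a complex number. */
typedef struct { double re, im; } pair_t;

/* Field-by-field add: the same operation is applied to each field,
   so the body is isomorphic to a 2-wide vector add. */
pair_t pair_add(pair_t a, pair_t b) {
    pair_t r;
    r.re = a.re + b.re;
    r.im = a.im + b.im;
    return r;
}

/* Same computation with the struct unpacked into lanes at entry and
   repacked at exit; the loop in the middle is the vectorizable part. */
pair_t pair_add_lanes(pair_t a, pair_t b) {
    double va[2] = { a.re, a.im };   /* scalar unpack at the boundary */
    double vb[2] = { b.re, b.im };
    double vr[2];
    for (int i = 0; i < 2; ++i)
        vr[i] = va[i] + vb[i];       /* lane-wise add */
    pair_t r = { vr[0], vr[1] };     /* scalar repack at the boundary */
    return r;
}
```

Both functions compute the same result; the sketch says nothing about what any particular compiler version actually emits for them.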