[LLVMdev] Vectorization: Next Steps
nadav.rotem at intel.com
Fri Feb 3 03:34:03 PST 2012
I also noticed cases where vector IR is scalarized by the codegen. From what I have seen (which is based on a different vectorizer with a different code model, etc.), there are two main areas for improvement:
1. Complex instructions - Instructions such as shuffles are very sensitive to the ability of the codegen to lower them. If a vectorizer generates shuffle instructions that are not handled properly by the manual lowering code, the instruction is scalarized.
2. Instructions with mixed types - Instructions that operate on mixed types, such as 2xfloat->2xdouble, are usually scalarized by the type legalizer.
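To make the second case concrete, here is a minimal C++ sketch of a loop whose vectorized form would need a 2xfloat->2xdouble conversion. It is an illustration only, not code from this thread: the function name `widen_pairs` is hypothetical, and whether a given vectorizer actually pairs these iterations depends on the pass and the target.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: widen each float to a double.
// If a vectorizer pairs two adjacent iterations, the conversion becomes
// a <2 x float> -> <2 x double> extension, which is the mixed-type case
// that the type legalizer may split back into scalar conversions.
void widen_pairs(const float* in, double* out, std::size_t n) {
    assert((n == 0 || (in != nullptr && out != nullptr)) &&
           "buffers must be valid when n > 0");
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<double>(in[i]);  // float -> double
    }
}
```

Comparing the generated assembly for such a loop with and without vectorization is one way to see whether the conversion survives as a vector instruction on a given target.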
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Friday, February 03, 2012 10:50
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Vectorization: Next Steps
> As some of you may know, I committed my basic-block autovectorization
> pass a few days ago. I encourage anyone interested to try it out (pass
> -vectorize to opt or -mllvm -vectorize to clang) and provide feedback.
> Especially in combination with -unroll-allow-partial, I have observed
> some significant benchmark speedups, but, I have also observed some
> significant slowdowns.
Codegen for vector constructs is not always that great in my experience.
It could be that your vectorizer is doing the right thing, and it's codegen that needs to be improved. For example, when I use the GCC autovectorizer I often see LLVM codegen unnecessarily scalarizing the vector code. Did you try to analyse these slowdowns?
> I would like to share my thoughts, and hopefully
> get feedback, on next steps.
> 1. "Target Data" for vectorization - I think that in order to improve
> the vectorization quality, the vectorizer will need more information
> about the target. This information could be provided in the form of a
> kind of extended target data. This extended target data might contain:
> - What basic types can be vectorized, and how many of them will fit
> into (the largest) vector registers
> - What classes of operations can be vectorized (division,
> conversions / sign extension, etc. are not always supported)
> - What alignment is necessary for loads and stores
> - Is scalar-to-vector free?
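The items in the list above could be captured in a small record that a vectorizer queries. The following C++ sketch is only an assumption about what such "extended target data" might look like; `VectorTargetInfo`, its fields, and `lanes_for` are hypothetical names and are not LLVM APIs.

```cpp
#include <cassert>

// Hypothetical sketch of the proposed extended target data.
// Every name here is illustrative; none of this exists in LLVM.
struct VectorTargetInfo {
    unsigned max_vector_bits;        // width of the largest vector register
    bool can_vectorize_division;     // is vector division supported?
    bool can_vectorize_conversions;  // conversions / sign extension supported?
    unsigned required_alignment;     // bytes of alignment for vector loads and stores
    bool scalar_to_vector_is_free;   // is scalar-to-vector free?
};

// Number of elements of the given bit width that fit into the largest
// vector register described by the target info.
unsigned lanes_for(const VectorTargetInfo& target, unsigned element_bits) {
    assert(element_bits > 0 && "element width must be non-zero");
    return target.max_vector_bits / element_bits;
}
```

For example, a target described with 128-bit registers would report four lanes for 32-bit elements and two lanes for 64-bit elements.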