[llvm-dev] Questions on LLVM vectorization diagnostics
Dangeti Tharun kumar via llvm-dev
llvm-dev at lists.llvm.org
Mon Jul 4 21:46:43 PDT 2016
Dear Adam Nemet,
On Thu, Jun 23, 2016 at 11:15 PM, Adam Nemet <anemet at apple.com> wrote:
> Hi Dangeti,
> On Jun 23, 2016, at 8:20 AM, Dangeti Tharun kumar via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> Dear LLVM Community,
> I am D Tharun Kumar, a master’s student at the Indian Institute of Technology
> Hyderabad, working in a team to improve the current vectorizer in LLVM. As an
> initial study, we are studying various benchmarks to analyze and compare the
> vectorizing capabilities of LLVM, GCC and ICC. We found that the vectorization
> remarks given by LLVM are vague and brief, whereas GCC and ICC give
> detailed diagnostics.
> Yes this is an area that needs further improvement. We have some
> immediate plans to make these more useful. See the recent llvm-dev threads
> linked at the end of this message.
> - I am interested to know why the LLVM diagnostics are brief and not
> intuitive (making them less helpful)?
> I think it’s just lack of work or weakness in the analyses to provide more
> detailed information. It would be good to file bugs for specific cases
> where we fall behind.
> - In our analysis we have never seen LLVM try to vectorize outer loops.
> Is this well known? Is outer loop vectorization implemented in LLVM as in
> GCC? (http://dl.acm.org/citation.cfm?id=1454119) If not, is someone
> working on it?
> I heard various people mention this but I am not sure whether actual work
> is already taking place.
> - On the TSVC benchmark suite, out of a total of 151 loops, LLVM, GCC
> and ICC vectorized 70, 82 and 112 loops respectively. Is the cause for lag
> of LLVM the inability of LLVM’s vectorizer, or are there any (enabling)
> optimization passes running before GCC’s vectorizer that are helping GCC
> perform better?
> I don’t know about GCC, but I’ve seen ICC perform loop transformations
> more aggressively, which can increase the coverage for loop vectorization.
> ICC performs Loop Distribution/Fusion/Interchange, etc. by default at their
> highest optimization level. We have some of these passes (distribution,
> interchange) but they are not on by default yet.
> Arguably, there is also some difference between focus areas for these
> compilers. I think that ICC has a more HPC focus than LLVM or GCC. We
> have Polly, which is geared more toward the HPC use cases.
> - Loop peeling to enhance vectorization is present in GCC and ICC,
> but the LLVM remarks don’t say anything about alignment. Does LLVM have
> this functionality without the vectorizer remarking on it, or does it
> not have the functionality at all?
> We don’t have it.
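For readers unfamiliar with the term, the following is a minimal hand-written sketch of what loop peeling for alignment means. It is not taken from GCC, ICC or LLVM; the function name `add_scalar_peeled` and the 16-byte vector width `VEC_BYTES` are assumptions made only for illustration. A compiler that peels for alignment would generate a similar scalar prologue automatically.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vector width in bytes (16 corresponds to a 128-bit vector). */
#define VEC_BYTES 16

/* Adds k to each of the n elements of a. The first few iterations are
   peeled off and run one at a time until &a[i] is VEC_BYTES-aligned.
   Returns the number of peeled iterations. */
size_t add_scalar_peeled(float *a, size_t n, float k)
{
    size_t i = 0;

    /* Peeled prologue: scalar iterations up to the first aligned element. */
    while (i < n && ((uintptr_t)&a[i] % VEC_BYTES) != 0) {
        a[i] += k;
        i++;
    }
    size_t peeled = i;

    /* Main loop: it starts at an aligned address, so a vectorizer could
       use aligned vector loads and stores for it. */
    for (; i < n; i++)
        a[i] += k;

    return peeled;
}
```

The result is the same as for the plain loop; only the starting address of the main loop changes.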
About alignment, we tried a few examples such as the one below:
    for (int i = 0; i < N; i++)
        A[i+3] = B[i+1] + C[i+2];
The LLVM vectorizer did not respond with any remark, neither a not-vectorized
remark nor a reason for not vectorizing the loop. Our team is interested in
enhancing the vectorizer to support unaligned structures.
Is anyone already working on this?
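To make the case reproducible, the loop can be wrapped in a small function, as in the sketch below. The function name `shifted_add`, the `float` element type and the `restrict` qualifiers are assumptions added here; the fragment above does not specify any of them. The caller has to provide at least n+3 elements for A, n+1 for B and n+2 for C.

```c
#include <stddef.h>

/* The loop from this message, wrapped in a function.
   Assumptions not in the original fragment: float elements, and
   restrict on all three pointers so that possible aliasing is not
   the reason a vectorizer gives up. */
void shifted_add(float *restrict A, const float *restrict B,
                 const float *restrict C, size_t n)
{
    for (size_t i = 0; i < n; i++)
        A[i + 3] = B[i + 1] + C[i + 2];
}
```

Compiling this with clang at -O3 together with -Rpass=loop-vectorize, -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize requests the vectorized, missed and analysis remarks for the loop vectorizer. Which remarks actually appear depends on the compiler version and target.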
> Finally, we appreciate suggestions and directions for improving the
> vectorization framework of LLVM.
> This is a pretty active area. Probably reading up on recent llvm-dev
> discussion in this area would be helpful to you.
> I would also like to know if anyone has worked or is working on improving
> vectorization remarks.
> Yes we are. If you’re interested in working on this area it would be good to
Yes, we are very much interested in coordinating in this area.
> Dangeti Tharun kumar
> M.TECH Computer Science
> IIT Hyderabad
>  http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334
>  http://thread.gmane.org/gmane.comp.compilers.llvm.devel/99126
D Tharun kumar