[LLVMdev] Vectorization: Next Steps
hfinkel at anl.gov
Tue Feb 14 08:15:53 PST 2012
If you run with -vectorize instead of -bb-vectorize it will schedule the cleanup passes for you.
From: "Carl-Philip Hänsch" <cphaensch at googlemail.com>
To: Hal Finkel <hfinkel at anl.gov>
Cc: llvmdev at cs.uiuc.edu
Sent: Tue, Feb 14, 2012 16:10:28 GMT+00:00
Subject: Re: [LLVMdev] Vectorization: Next Steps
I tested the "restrict" keyword and it works well :)
The generated code is a bunch of shufflevector instructions, but after a
second -O3 pass, everything looks fine.
This problem is described in my ML post "passes propose passes" and occurs
here again. LLVM has so many great passes, but they cannot run again once
the code has been somewhat simplified :(
Maybe that's one more reason to tell the pass scheduler to redo some passes
to find all optimizations. The code really simplifies to what I expected.
2012/2/13 Hal Finkel <hfinkel at anl.gov>
> On Mon, 2012-02-13 at 11:11 +0100, Carl-Philip Hänsch wrote:
> > I will test your suggestion, but I designed the test case to load the
> > memory directly into <4 x float> registers. So there is absolutely no
> > permutation and other swizzle or move operations. Maybe the heuristic
> > should not only count the depth but also the surrounding load/store
> > operations.
> I've attached two variants of your file, both of which vectorize as you'd
> expect. The core difference between these and your original file is that
> I added the 'restrict' keyword so that the compiler can assume that the
> arrays don't alias (or, in the first case, I made them globals). You
> also probably need to specify some alignment information, otherwise the
> memory operations will be scalarized in codegen.
> > Are the load/store operations vectorized, too? (I designed the test
> > case to completely fit the SSE registers)
> > 2012/2/10 Hal Finkel
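
For readers without the attachments, the 'restrict' and alignment advice quoted above can be illustrated roughly as follows. This is only a sketch, not the attached test case: the function name add_arrays, the arrays in_a, in_b and out, and the 16-byte alignment (chosen to match one SSE register) are all assumptions made for illustration, and whether a given compiler actually vectorizes the loop depends on its version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical data, not from the original test case. C11 _Alignas
   requests 16-byte alignment, the size of one <4 x float> SSE register. */
_Alignas(16) float in_a[8] = {0, 1, 2, 3, 4, 5, 6, 7};
_Alignas(16) float in_b[8] = {10, 10, 10, 10, 10, 10, 10, 10};
_Alignas(16) float out[8];

/* Hypothetical kernel: element-wise addition. 'restrict' promises the
   compiler that dst, a and b never overlap, so loads and stores may be
   reordered or combined without an aliasing check. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    /* Assumption for this sketch: n is a multiple of 4, so the data
       divides evenly into 4-float vectors. */
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Calling add_arrays(out, in_a, in_b, 8) fills out with the element-wise sums. Without 'restrict' the compiler has to assume that a store to dst might change a or b, which is the aliasing concern raised in the quoted reply.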