[cfe-dev] OpenCL/CUDA Interop with PTX Back-End
Justin Holewinski
justin.holewinski at gmail.com
Tue Oct 4 13:26:19 PDT 2011
On Tue, Oct 4, 2011 at 3:53 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> On Tue, Oct 04, 2011 at 07:28:26PM +0100, Peter Collingbourne wrote:
> > > and the OpenCL frontend seems to respect the address
> > > mapping but does not emit complete array definitions for locally-defined
> > > __local arrays. Does the front-end currently not support __local arrays
> > > embedded in the code? It seems to work if the __local arrays are passed as
> > > pointers to the kernel.
> >
> > Clang should support __local arrays, and this looks like a genuine
> > bug in the IR generator. I will investigate.
>
> This actually seems to be an optimisation. Since only the first
> element of the array is accessed, LLVM will only allocate storage for
> that element. If you compile your example with -O0 (OpenCL compiles
> with optimisations turned on by default), you will see that the 64
> element array is created.
>
I'm not really convinced this is a legal optimization. What if you
purposely allocate arrays with extra padding to prevent bank conflicts in
the kernel?
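
To make the padding concern concrete, here is a minimal host-side sketch in plain C (not kernel code). It assumes 32 banks that are each one 4-byte word wide; that layout is an assumption for illustration, and real hardware varies. The names bank_of and distinct_banks_in_column are hypothetical helpers, not part of any API.

```c
/* Assumption: 32 banks, each one 4-byte word wide. Real hardware varies. */
enum { NUM_BANKS = 32, TILE = 32 };

/* Bank that holds element [row][col] of a float array whose rows
 * are `pitch` words long. */
static unsigned bank_of(unsigned row, unsigned col, unsigned pitch)
{
    return (row * pitch + col) % NUM_BANKS;
}

/* Number of distinct banks touched when TILE work-items each read
 * one element of the same column (rows 0 .. TILE-1). */
static unsigned distinct_banks_in_column(unsigned col, unsigned pitch)
{
    unsigned char seen[NUM_BANKS] = {0};
    unsigned count = 0;
    for (unsigned row = 0; row < TILE; ++row) {
        unsigned b = bank_of(row, col, pitch);
        if (!seen[b]) {
            seen[b] = 1;
            ++count;
        }
    }
    return count;
}
```

Under these assumptions, distinct_banks_in_column(0, 32) yields 1 (every work-item lands in the same bank), while distinct_banks_in_column(0, 33) yields 32 (no conflicts). The extra column in the padded case is never read or written, so an optimizer that only looks at which elements are accessed would see it as dead storage.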
>
> Thanks,
> --
> Peter
>
--
Thanks,
Justin Holewinski