[cfe-dev] OpenCL/CUDA Interop with PTX Back-End

Justin Holewinski justin.holewinski at gmail.com
Tue Oct 4 13:26:19 PDT 2011


On Tue, Oct 4, 2011 at 3:53 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

> On Tue, Oct 04, 2011 at 07:28:26PM +0100, Peter Collingbourne wrote:
> > > and the OpenCL frontend seems to respect the address
> > >    mapping but does not emit complete array definitions for locally-defined
> > >    __local arrays.  Does the front-end currently not support __local arrays
> > >    embedded in the code?  It seems to work if the __local arrays are passed as
> > >    pointers to the kernel.
> >
> > Clang should support __local arrays, and this looks like a genuine
> > bug in the IR generator.  I will investigate.
>
> This actually seems to be an optimisation.  Since only the first
> element of the array is accessed, LLVM will only allocate storage for
> that element.  If you compile your example with -O0 (OpenCL compiles
> with optimisations turned on by default), you will see that the 64
> element array is created.
>

I'm not really convinced this is a legal optimization.  What if you
purposely allocate arrays with extra padding to prevent bank conflicts in
the kernel?
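
To make the padding concern concrete, here is a minimal host-side sketch in
plain C (not an actual kernel).  It assumes 16 banks of 4-byte words and a
16-row tile of floats; both numbers, and the helper name max_bank_conflict,
are illustrative assumptions rather than anything taken from the example
discussed above.  It counts how many work-items would hit the same bank when
each one reads one element of the same column.

```c
/* Assumption: 16 banks of 4-byte words and a 16-row float tile. */
enum { NUM_BANKS = 16, TILE = 16 };

/* Worst-case number of work-items that land on the same bank when TILE
 * work-items each read one element of column `col` from a row-major
 * float tile whose rows are `stride` elements long. */
static int max_bank_conflict(int stride, int col)
{
    int hits[NUM_BANKS] = {0};
    int worst = 0;

    for (int row = 0; row < TILE; ++row) {
        /* Index of the float in the tile, mapped onto a bank. */
        int bank = (row * stride + col) % NUM_BANKS;
        if (++hits[bank] > worst)
            worst = hits[bank];
    }
    return worst;
}
```

Under these assumptions, max_bank_conflict(16, 0) is 16 (every access falls
in one bank), while max_bank_conflict(17, 0) is 1.  The extra column is never
read or written, so if an optimizer were to shrink the array down to the
elements it sees accessed, the row stride could change and the conflicts
would come back.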


>
> Thanks,
> --
> Peter
>



-- 

Thanks,

Justin Holewinski