[LLVMdev] PTX backend fatal error

Justin Holewinski justin.holewinski at gmail.com
Mon Nov 14 11:02:32 PST 2011


On Mon, Nov 14, 2011 at 12:55 PM, Villmow, Micah <Micah.Villmow at amd.com>wrote:

> Justin,
>
> Add this to your TargetLowering constructor, this fixes the mem* issue.
>
>   maxStoresPerMemcpy  = 4096;
>   maxStoresPerMemmove = 4096;
>   maxStoresPerMemset  = 4096;
>
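
As a rough illustration of what these three limits control: as I understand it, each one caps how many store instructions the lowering may emit when it expands a memcpy, memmove, or memset inline, and anything above the cap falls back to a library call. The sketch below is a toy model under that assumption, not LLVM code; `countStores` and `expandsInline` are hypothetical names, and the widest-store-first strategy is a simplification.

```cpp
#include <cstdint>
#include <initializer_list>

// Toy model (not LLVM code): count the stores needed to cover `bytes`,
// always using the widest store that still fits (8, 4, 2, then 1 byte).
std::uint64_t countStores(std::uint64_t bytes) {
  std::uint64_t stores = 0;
  for (std::uint64_t width : {8u, 4u, 2u, 1u}) {
    stores += bytes / width;  // how many stores of this width fit
    bytes %= width;           // remainder is left for narrower stores
  }
  return stores;
}

// Hypothetical helper: true if the operation fits within the per-call
// store limit and would therefore be expanded inline in this model.
bool expandsInline(std::uint64_t bytes, std::uint64_t maxStores) {
  return countStores(bytes) <= maxStores;
}
```

In this model, clearing the 16-byte `localArray` from the kernel needs only two 8-byte stores, so a limit of 4096 leaves it well inside the inline path.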

Thanks for this!  I applied it in r144551.  However, this particular kernel
still fails instruction selection, because the backend cannot yet select
sign-extending loads from i8 to i32.
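
To make the remaining failure concrete: the `sext from i8` load in the error dump is what the kernel's read of a `char` element into an `int` requires. Here is a minimal C++ sketch of the semantics, together with one way the same value can be built from a zero-extending load. The helper names are hypothetical, and this is not PTX backend code.

```cpp
#include <cstdint>

// Sign-extending load: read one signed byte and widen it to 32 bits,
// preserving the sign.
std::int32_t sextLoadI8(const std::int8_t *p) {
  return static_cast<std::int32_t>(*p);
}

// Same result from a zero-extending load plus explicit sign extension.
// The identity (x ^ 0x80) - 0x80 sign-extends an 8-bit value held in a
// wider register.
std::int32_t sextViaZext(const std::uint8_t *p) {
  std::uint32_t wide = *p;  // zero-extending load
  return static_cast<std::int32_t>(wide ^ 0x80u) - 0x80;
}
```

A target without a native sign-extending byte load could in principle expand it along the lines of the second function; whether that is the right approach for this backend is a separate question.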


> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
> Behalf Of Justin Holewinski
> Sent: Monday, November 14, 2011 7:12 AM
> To: Alberto Magni
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] PTX backend fatal error
>
> On Mon, Nov 14, 2011 at 8:57 AM, Alberto Magni <alberto.magni86 at gmail.com>
> wrote:
>
> Hi everybody,
>
> I am testing the PTX backend using the OpenCL NVIDIA SDK benchmarks.
> Compiling the Histogram64.cl program I get several backend errors.
>
> I isolated one of them in the following kernel program:
>
> __kernel void kernel_function(__global int *input) {
>    __local char localArray[16];
>    for(unsigned int index = 0; index < 16; ++index)
>      localArray[index] = 0;
>    input[0] = localArray[get_local_id(0)];
> }
>
> fatal error: error in backend: Cannot select:
>      0x5810cc0: i32,ch = load 0x57fa148, 0x5810ac0, 0x58105c0<LD1[%arrayidx1], sext from i8> [ID=9]
>  0x5810ac0: i32 = add 0x58109c0, 0x5813640 [ORD=113] [ID=8]
>    0x58109c0: i32 = PTXISD::COPY_ADDRESS 0x5813540 [ID=7]
>      0x5813540: i32 = TargetGlobalAddress<[16 x i8] addrspace(4)* @kernel_function.localArray> 0 [ID=4]
>    0x5813640: i32,ch = load 0x57fa148, 0x5810dc0, 0x58105c0<LD4[%retval.i]> [ORD=110] [ID=5]
>      0x5810dc0: i32 = FrameIndex<0> [ORD=110] [ID=1]
>      0x58105c0: i32 = undef [ORD=110] [ID=2]
>  0x58105c0: i32 = undef [ORD=110] [ID=2]
>
> The command I am using is:
>
> clang kernels/fatal_error_test.cl -O0 -include ocldef.h -include builtin_functions_ptx.cl \
>       -D__x86_64__ -ccc-host-triple ptx32 \
>       -Xclang -target-feature -Xclang +ptx23 \
>       -Xclang -target-feature -Xclang +compute20
>
> Any ideas?
>
> Unfortunately, this sample will not work at this time.  First, the backend
> does not support i8 types yet.  Second, at higher optimization levels, LLVM
> turns this loop into a memset intrinsic, which is also not yet implemented.
> :(
>
> Hopefully I'll get some time soon to work on this, and other deficiencies.
> Patches are always welcome, too.
>
>
> Best regards
>
> Alberto
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> --
>
> Thanks,
>
> Justin Holewinski



-- 

Thanks,

Justin Holewinski