[LLVMdev] PTX backend fatal error
Villmow, Micah
Micah.Villmow at amd.com
Mon Nov 14 09:55:10 PST 2011
Justin,
Add this to your TargetLowering constructor; it fixes the mem* issue.
maxStoresPerMemcpy = 4096;
maxStoresPerMemmove = 4096;
maxStoresPerMemset = 4096;
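To illustrate why raising these limits helps, here is a minimal, self-contained sketch. It is a toy model, not LLVM code: `ToyLimits`, `RaisedLimits`, `countStores`, `expandsInline` and the default value of 8 are all hypothetical names and numbers chosen for illustration. The assumption is that the code generator expands a fixed-size memcpy/memmove/memset into individual stores only when the number of stores stays within the limit, and otherwise falls back to a library call, which a target without a runtime library presumably cannot provide.

```cpp
#include <cstdint>

// Toy stand-in for the three limits named above. This is NOT LLVM's
// TargetLowering class; the struct and its default of 8 are hypothetical.
struct ToyLimits {
    unsigned maxStoresPerMemcpy = 8;
    unsigned maxStoresPerMemmove = 8;
    unsigned maxStoresPerMemset = 8;
};

// Mirrors the suggestion in the message: raise all three limits
// in the constructor of the target-specific class.
struct RaisedLimits : ToyLimits {
    RaisedLimits() {
        maxStoresPerMemcpy = 4096;
        maxStoresPerMemmove = 4096;
        maxStoresPerMemset = 4096;
    }
};

// Greedy count of the stores needed to cover sizeBytes, starting with the
// widest store and halving the width for the remainder.
// widestStoreBytes is assumed to be a nonzero power of two.
unsigned countStores(std::uint64_t sizeBytes, unsigned widestStoreBytes) {
    unsigned stores = 0;
    for (unsigned width = widestStoreBytes; width >= 1; width /= 2) {
        stores += static_cast<unsigned>(sizeBytes / width);
        sizeBytes %= width;
    }
    return stores;
}

// True if the operation would be expanded into inline stores under this
// model, false if it would fall back to a library call.
bool expandsInline(std::uint64_t sizeBytes, unsigned widestStoreBytes,
                   unsigned maxStores) {
    return countStores(sizeBytes, widestStoreBytes) <= maxStores;
}
```

Under this model, a 64-byte memset with 4-byte stores needs 16 stores, so it exceeds the toy default of 8 but fits easily within 4096. Note that 4096 is still a finite limit: a 64 KiB memset with 4-byte stores would need 16384 stores and would not be expanded.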
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Justin Holewinski
Sent: Monday, November 14, 2011 7:12 AM
To: Alberto Magni
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] PTX backend fatal error
On Mon, Nov 14, 2011 at 8:57 AM, Alberto Magni <alberto.magni86 at gmail.com> wrote:
Hi everybody,
I am testing the PTX backend using the OpenCL NVIDIA SDK benchmarks.
Compiling the Histogram64.cl program, I get several backend errors.
I isolated one of them in the following kernel program:
__kernel void kernel_function(__global int *input) {
    __local char localArray[16];
    for (unsigned int index = 0; index < 16; ++index)
        localArray[index] = 0;
    input[0] = localArray[get_local_id(0)];
}
fatal error: error in backend: Cannot select:
0x5810cc0: i32,ch = load 0x57fa148, 0x5810ac0, 0x58105c0<LD1[%arrayidx1], sext from i8> [ID=9]
  0x5810ac0: i32 = add 0x58109c0, 0x5813640 [ORD=113] [ID=8]
    0x58109c0: i32 = PTXISD::COPY_ADDRESS 0x5813540 [ID=7]
      0x5813540: i32 = TargetGlobalAddress<[16 x i8] addrspace(4)* @kernel_function.localArray> 0 [ID=4]
    0x5813640: i32,ch = load 0x57fa148, 0x5810dc0, 0x58105c0<LD4[%retval.i]> [ORD=110] [ID=5]
      0x5810dc0: i32 = FrameIndex<0> [ORD=110] [ID=1]
      0x58105c0: i32 = undef [ORD=110] [ID=2]
  0x58105c0: i32 = undef [ORD=110] [ID=2]
The command I am using is:
clang kernels/fatal_error_test.cl -O0 -include ocldef.h \
  -include builtin_functions_ptx.cl \
  -D__x86_64__ \
  -ccc-host-triple ptx32 \
  -Xclang -target-feature -Xclang +ptx23 \
  -Xclang -target-feature -Xclang +compute20
Any ideas?
Unfortunately, this sample will not work at this time. First, the backend does not support i8 types yet. Second, at higher optimization levels, LLVM turns this loop into a memset intrinsic, which is also not yet implemented. :(
Hopefully I'll get some time soon to work on this, and other deficiencies. Patches are always welcome, too.
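The two problems named in the reply can be illustrated on the host side with a small C++ sketch. This is not PTX backend code; the function names are hypothetical, and `signed char` stands in for OpenCL's signed 8-bit `char`. The first two functions show the equivalence an optimizer relies on when it replaces the zeroing loop with a memset, and the third shows the 8-bit to 32-bit sign-extending read that corresponds to the `sext from i8` load in the error dump.

```cpp
#include <cstddef>
#include <cstring>

// Byte-wise zeroing loop, the same shape as the loop in the kernel.
void zeroWithLoop(signed char *buf, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        buf[i] = 0;
}

// The equivalent single call that an optimizer may substitute for the loop.
void zeroWithMemset(signed char *buf, std::size_t n) {
    std::memset(buf, 0, n);
}

// Reading an 8-bit element into a 32-bit int sign-extends the value,
// which is the operation the backend could not select.
int widen(const signed char *buf, std::size_t i) {
    return buf[i];
}
```

Both zeroing functions leave the buffer in the same state, which is why the transformation is legal; the backend then has to handle either the individual i8 stores or the memset.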
Best regards
Alberto
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
--
Thanks,
Justin Holewinski