[LLVMdev] PTX backend fatal error

Villmow, Micah Micah.Villmow at amd.com
Mon Nov 14 09:55:10 PST 2011


Justin,
Add this to your TargetLowering constructor; it fixes the mem* issue.

  maxStoresPerMemcpy  = 4096;
  maxStoresPerMemmove = 4096;
  maxStoresPerMemset  = 4096;
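
As a rough illustration of why raising these limits should help: the code generator expands a mem* operation into individual stores only when the number of stores needed stays within the corresponding limit, and otherwise falls back to a library call, which a GPU target may not be able to provide. The sketch below is a toy model of that decision, not LLVM code. The names `storesNeeded`, `chooseLowering` and `MemOpLowering` are made up for illustration, and the greedy widest-store-first counting is an assumption about how the store count is estimated.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: none of these names are LLVM APIs.
enum class MemOpLowering { InlineStores, LibraryCall };

// Count the stores needed to cover `bytes`, greedily using the widest
// store first. `widestStoreBytes` is assumed to be a power of two.
constexpr std::uint64_t storesNeeded(std::uint64_t bytes,
                                     unsigned widestStoreBytes) {
  std::uint64_t count = 0;
  for (unsigned width = widestStoreBytes; width >= 1; width /= 2) {
    count += bytes / width;  // full stores of this width
    bytes %= width;          // remainder is handled by narrower stores
  }
  return count;
}

// Expand inline only if the store count fits under the limit
// (the role played by maxStoresPerMemset and friends).
constexpr MemOpLowering chooseLowering(std::uint64_t bytes,
                                       unsigned widestStoreBytes,
                                       unsigned maxStores) {
  return storesNeeded(bytes, widestStoreBytes) <= maxStores
             ? MemOpLowering::InlineStores
             : MemOpLowering::LibraryCall;
}

// A 64-byte memset with 4-byte stores needs 16 stores: too many for a
// hypothetical limit of 8, but well within a limit of 4096.
static_assert(chooseLowering(64, 4, 8) == MemOpLowering::LibraryCall, "");
static_assert(chooseLowering(64, 4, 4096) == MemOpLowering::InlineStores, "");
```

With a limit as large as 4096, any mem* operation of a size you are likely to meet in a kernel is expanded inline in this model, so the library-call path is never taken.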

From: llvmdev-bounces at cs.uiuc.edu On Behalf Of Justin Holewinski
Sent: Monday, November 14, 2011 7:12 AM
To: Alberto Magni
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] PTX backend fatal error

On Mon, Nov 14, 2011 at 8:57 AM, Alberto Magni <alberto.magni86 at gmail.com> wrote:
Hi everybody,

I am testing the PTX backend using the OpenCL NVIDIA SDK benchmarks.
Compiling the Histogram64.cl program I get several backend errors.

I isolated one of them in the following kernel program:

__kernel void kernel_function(__global int *input) {
   __local char localArray[16];
   for(unsigned int index = 0; index < 16; ++index)
     localArray[index] = 0;
   input[0] = localArray[get_local_id(0)];
}

fatal error: error in backend: Cannot select:
     0x5810cc0: i32,ch = load 0x57fa148, 0x5810ac0, 0x58105c0<LD1[%arrayidx1], sext from i8> [ID=9]
 0x5810ac0: i32 = add 0x58109c0, 0x5813640 [ORD=113] [ID=8]
   0x58109c0: i32 = PTXISD::COPY_ADDRESS 0x5813540 [ID=7]
     0x5813540: i32 = TargetGlobalAddress<[16 x i8] addrspace(4)* @kernel_function.localArray> 0 [ID=4]
   0x5813640: i32,ch = load 0x57fa148, 0x5810dc0, 0x58105c0<LD4[%retval.i]> [ORD=110] [ID=5]
     0x5810dc0: i32 = FrameIndex<0> [ORD=110] [ID=1]
     0x58105c0: i32 = undef [ORD=110] [ID=2]
 0x58105c0: i32 = undef [ORD=110] [ID=2]
The command I am using is:

clang kernels/fatal_error_test.cl -O0 -include ocldef.h \
      -include builtin_functions_ptx.cl -D__x86_64__ \
      -ccc-host-triple ptx32 \
      -Xclang -target-feature -Xclang +ptx23 \
      -Xclang -target-feature -Xclang +compute20
Any ideas?

Unfortunately, this sample will not work at this time.  First, the backend does not support i8 types yet.  Second, at higher optimization levels, LLVM turns this loop into a memset intrinsic, which is also not yet implemented. :(
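
To make the second point concrete, here is a small host-side C++ sketch (not OpenCL, and not taken from the SDK sample) of the equivalence the optimizer relies on: a byte-wise zeroing loop over a 16-element array has the same effect as a single 16-byte memset. The helper names `zeroWithLoop` and `zeroWithMemset` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Zero the array one byte at a time, as the kernel's loop does.
std::array<char, 16> zeroWithLoop() {
  std::array<char, 16> a;
  a.fill(1);  // start from non-zero contents so the zeroing is observable
  for (std::size_t index = 0; index < a.size(); ++index)
    a[index] = 0;
  return a;
}

// Zero the same array with one memset call, the form the loop can be
// rewritten into at higher optimization levels.
std::array<char, 16> zeroWithMemset() {
  std::array<char, 16> a;
  a.fill(1);
  std::memset(a.data(), 0, a.size());
  return a;
}
```

Both helpers return identical arrays, which is why the rewrite is legal; the trouble on the PTX side is only that the backend cannot yet lower the resulting memset.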

Hopefully I'll get some time soon to work on this, and other deficiencies. Patches are always welcome, too.


Best regards

Alberto
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



--
Thanks,

Justin Holewinski
