[LLVMdev] NVPTX: __iAtomicCAS support ?

Dmitry N. Mikushin maemarcus at gmail.com
Thu May 17 10:36:16 PDT 2012


Thanks, Justin!

Clang does not implement the CUDA atomic intrinsics, but fortunately it supports inline asm:

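__inline__ __attribute__((always_inline)) __attribute__((device)) int
__iAtomicCAS(int *p, int compare, int val)
{
        int *global, result;
        // Assumes 64-bit pointers ("l" constraints, cvta.to.global.u64).
        // Outputs (%0, %1) are listed first, then inputs (%2-%4).
        asm volatile(
                "cvta.to.global.u64 %0, %2;\n\t"
                "atom.global.cas.b32 %1, [%0], %3, %4;"
                : "=l"(global), "=r"(result)
                : "l"(p), "r"(compare), "r"(val)
                : "memory");
        return result;
}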

That worked around this particular problem.
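
For reference, here is a rough host-side sketch of the semantics I expect from the wrapper: the value is stored only when the current contents equal `compare`, and the old contents are returned either way. It is only a model built on std::atomic, not device code, and the name model_atomic_cas is made up for illustration.

```cpp
#include <atomic>

// Hypothetical host-side model of compare-and-swap (not device code):
// store `val` into *p only if *p == compare, and always return the
// value *p held before the operation.
inline int model_atomic_cas(std::atomic<int> *p, int compare, int val)
{
        int expected = compare;
        // On failure, compare_exchange_strong writes the observed value
        // into `expected`; on success `expected` already equals the old
        // value. Either way it ends up holding the previous contents.
        p->compare_exchange_strong(expected, val);
        return expected;
}
```

With a lock word that starts at 0, model_atomic_cas(&lock, 1, 0) leaves it untouched and returns 0, which is the kind of return value the loop in kernelgen_monitor tests.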

- D.

2012/5/16 Justin Holewinski <jholewinski at nvidia.com>:
>> -----Original Message-----
>> From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com]
>> Sent: Wednesday, May 16, 2012 5:44 AM
>> To: LLVM-Dev
>> Cc: Justin Holewinski
>> Subject: NVPTX: __iAtomicCAS support ?
>>
>> Dear colleagues,
>>
>> I'm looking into whether we can replace nvopencc with LLVM NVPTX in our
>> project. It turns out NVPTX fails on code that nvopencc can handle
>> (please see the log below). So are atomic intrinsics not supported, or
>> am I making the call in the wrong way?
>
> There are really two issues here.
>
> First, the error you are seeing is because calls are disabled in the back-end until an outstanding LLVM core patch is committed.  Hopefully, we'll be able to push that in soon.
>
> Second, __iAtomicCAS() is a CUDA-C built-in function; the implementation is provided by a library linked with the LLVM IR before the NVPTX back-end sees it.  You will need to provide your own implementations for such functions.
>
>>
>> Thanks,
>> - Dima.
>>
>> SOURCE
>> ========
>>
>> dmikushin at hp2:~> cat kernelgen_monitor.ll
>> ; ModuleID = '/opt/kernelgen/include/kernelgen_monitor.cu'
>> target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
>> target triple = "ptx64-unknown-unknown"
>>
>> %struct.kernelgen_callback_t = type { i32, i32, %"struct.kernelgen::kernel_t"*, i32, i32, %struct.kernelgen_callback_data_t* }
>> %"struct.kernelgen::kernel_t" = type opaque
>> %struct.kernelgen_callback_data_t = type opaque
>>
>> define ptx_kernel void @_Z17kernelgen_monitorPi(i32* %callback) nounwind {
>> entry:
>>   %callback.addr = alloca i32*, align 8
>>   store i32* %callback, i32** %callback.addr, align 8
>>   %0 = load i32** %callback.addr, align 8
>>   %1 = bitcast i32* %0 to %struct.kernelgen_callback_t*
>>   %lock = getelementptr inbounds %struct.kernelgen_callback_t* %1, i32 0, i32 0
>>   %call = call ptx_device i32 @_Z12__iAtomicCASPiii(i32* %lock, i32 1, i32 0)
>>   br label %while.cond
>>
>> while.cond:                                       ; preds = %while.body, %entry
>>   %2 = load i32** %callback.addr, align 8
>>   %3 = bitcast i32* %2 to %struct.kernelgen_callback_t*
>>   %lock1 = getelementptr inbounds %struct.kernelgen_callback_t* %3, i32 0, i32 0
>>   %call2 = call ptx_device i32 @_Z12__iAtomicCASPiii(i32* %lock1, i32 1, i32 1)
>>   %tobool = icmp ne i32 %call2, 0
>>   %lnot = xor i1 %tobool, true
>>   br i1 %lnot, label %while.body, label %while.end
>>
>> while.body:                                       ; preds = %while.cond
>>   br label %while.cond
>>
>> while.end:                                        ; preds = %while.cond
>>   ret void
>> }
>>
>> declare ptx_device i32 @_Z12__iAtomicCASPiii(i32*, i32, i32)
>>
>> CODEGEN
>> =========
>>
>> dmikushin at hp2:~> llc < kernelgen_monitor.ll -march=nvptx -mcpu=sm_20
>> //
>> // Generated by LLVM NVPTX Back-End
>> //
>>
>> .version 3.0
>> .target sm_20, texmode_independent
>> .address_size 32
>>
>> .func  (.param .b32 func_retval0) _Z12__iAtomicCASPiii
>> (
>>       .param .b32 _Z12__iAtomicCASPiii_param_0,
>>       .param .b32 _Z12__iAtomicCASPiii_param_1,
>>       .param .b32 _Z12__iAtomicCASPiii_param_2
>> )
>> ;
>>
>> Not Implemented
>> UNREACHABLE executed at /tmp/rpmbuild_debug/BUILD/llvm/build/include/llvm/Target/TargetLowering.h:1249!
>> 0  libLLVM-3.2svn.so 0x00007f47738b8f5f
>> 1  libLLVM-3.2svn.so 0x00007f47738b9525
>> 2  libpthread.so.0   0x00007f47726135d0
>> 3  libc.so.6         0x00007f4771931945 gsignal + 53
>> 4  libc.so.6         0x00007f4771932f21 abort + 385
>> 5  libLLVM-3.2svn.so 0x00007f47738a24c1
>> llvm::report_fatal_error(llvm::Twine const&) + 0
>> 6  libLLVM-3.2svn.so 0x00007f47735cd390
>> 7  libLLVM-3.2svn.so 0x00007f47737fe2ba
>> llvm::TargetLowering::LowerCallTo(llvm::SDValue, llvm::Type*, bool,
>> bool, bool, bool, unsigned int, llvm::CallingConv::ID, bool, bool,
>> bool, llvm::SDValue, std::vector<llvm::TargetLowering::ArgListEntry,
>> std::allocator<llvm::TargetLowering::ArgListEntry> >&,
>> llvm::SelectionDAG&, llvm::DebugLoc) const + 2120
>> 8  libLLVM-3.2svn.so 0x00007f4773807199
>> llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite,
>> llvm::SDValue, bool, llvm::MachineBasicBlock*) + 2913
>> 9  libLLVM-3.2svn.so 0x00007f477381b3af
>> llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) + 9681
>> 10 libLLVM-3.2svn.so 0x00007f477382abee
>> llvm::SelectionDAGBuilder::visit(unsigned int, llvm::User const&) +
>> 1044
>> 11 libLLVM-3.2svn.so 0x00007f477382ad6d
>> llvm::SelectionDAGBuilder::visit(llvm::Instruction const&) + 105
>> 12 libLLVM-3.2svn.so 0x00007f4773844925
>> llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction
>> const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) + 59
>> 13 libLLVM-3.2svn.so 0x00007f477384540c
>> llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) +
>> 2620
>> 14 libLLVM-3.2svn.so 0x00007f47738459c6
>> llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) +
>> 896
>> 15 libLLVM-3.2svn.so 0x00007f4773175bfe
>> llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 82
>> 16 libLLVM-3.2svn.so 0x00007f47733ac299
>> llvm::FPPassManager::runOnFunction(llvm::Function&) + 331
>> 17 libLLVM-3.2svn.so 0x00007f47733ac474
>> llvm::FPPassManager::runOnModule(llvm::Module&) + 86
>> 18 libLLVM-3.2svn.so 0x00007f47733abf6d
>> llvm::MPPassManager::runOnModule(llvm::Module&) + 381
>> 19 libLLVM-3.2svn.so 0x00007f47733ad6eb
>> llvm::PassManagerImpl::run(llvm::Module&) + 111
>> 20 libLLVM-3.2svn.so 0x00007f47733ad74d
>> llvm::PassManager::run(llvm::Module&) + 33
>> 21 llc               0x000000000040eed6 main + 2835
>> 22 libc.so.6         0x00007f477191dbc6 __libc_start_main + 230
>> 23 llc               0x000000000040cc09
>> Stack dump:
>> 0.    Program arguments: llc -march=nvptx -mcpu=sm_20
>> 1.    Running pass 'Function Pass Manager' on module '<stdin>'.
>> 2.    Running pass 'NVPTX DAG->DAG Pattern Instruction Selection' on
>> function '@_Z17kernelgen_monitorPi'
>> Aborted
>> dmikushin at hp2:~> cd ~/rpmbuild/BUILD/llvm/ && svn info
>> Path: .
>> Working Copy Root Path: /tmp/rpmbuild_debug/BUILD/llvm
>> URL: http://llvm.org/svn/llvm-project/llvm/trunk
>> Repository Root: http://llvm.org/svn/llvm-project
>> Repository UUID: 91177308-0d34-0410-b5e6-96231b3b80d8
>> Revision: 156703
>> Node Kind: directory
>> Schedule: normal
>> Last Changed Author: foad
>> Last Changed Rev: 156703
>> Last Changed Date: 2012-05-12 12:30:16 +0400 (Sat, 12 May 2012)



