[llvm-dev] OpenCL toolset (for AMD GPU)

Leslie Zhai via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 28 00:45:53 PDT 2017


Oclgrind is awesome! https://github.com/jrprice/Oclgrind Although I do 
not have AMDGPU target real device llvm/lib/Target/AMDGPU but the 
simulator helps me being familiar with Host to communicate 
(clCreateBuffer) with Device :)


在 2017年06月28日 11:40, Leslie Zhai 写道:
> Hi Tom,
>
> I found it https://clang.llvm.org/docs/FAQ.html for 
> clang/include/clang/Driver/CC1Options.td
>
> 在 2017年06月27日 11:44, Leslie Zhai 写道:
>> Hi Tom,
>>
>> I found the correct mailing list finally :)
>>
>>
>> 在 2017年06月26日 19:21, Tom Stellard 写道:
>>> On 06/26/2017 04:20 AM, Leslie Zhai wrote:
>>>> Hi Tom,
>>>>
>>>> Thanks for your kind response!
>>>>
>>>>> clang -include /path/to/libclc/headers/clc.h -I 
>>>>> /path/to/libclc/headers -Dcl_clang_storage_class_specifiers 
>>>>> -target amdgcn--amdhsa -mcpu=carrizo $INPUT_FILE -o $OUTPUT_FILE
>>>> When I tried to build cos.cl testcase 
>>>> https://github.com/llvm-mirror/libclc/blob/master/test/cos.cl
>>>>
>>>> $ clang -include clc/clc.h -Dcl_clang_storage_class_specifiers 
>>>> -target amdgcn--amdhsa -mcpu=carrizo cos.cl -o cos.out
>>>>
>>>> cos.cl:1:27: error: pointer arguments to kernel functions must 
>>>> reside in '__global', '__constant' or
>>>>        '__local' address space
>>>> __kernel void foo(float4 *f) {
>>>>                            ^
>>>> 1 error generated.
>>>>
>>>> Then I added __local before float4 *f like this:
>>>>
>>>> Index: cos.cl
>>>> ===================================================================
>>>> --- cos.cl      (revision 306265)
>>>> +++ cos.cl      (working copy)
>>>> @@ -1,3 +1,3 @@
>>>> -__kernel void foo(float4 *f) {
>>>> +__kernel void foo(__local float4 *f) {
>>>>     *f = cos(*f);
>>>>   }
>>>>
>>>> because [OpenCL] Improve address space diagnostics 
>>>> https://reviews.llvm.org/D27671
>>>>
>>>> then rebuilt again,
>>>>
>>>> cos.cl:1:15: error: unsupported call to function _Z3cosDv4_f
>>>> __kernel void foo(__local float4 *f) {
>>>>                ^
>>>> 1 error generated.
>>>>
>>>> why not found *cos* function? please give me some hint, thanks a lot!
>>>>
>>> Hi,
>>>
>>> This is becuase you aren't linking the kernel with the libclc 
>>> bitcode library,
>>> try passing the path to the libclc library to clang using this option:
>>>   -mlink-bitcode-file
>>>
>>> -Tom
>> I should at first read 
>> http://lists.llvm.org/pipermail/libclc-dev/2016-May/002203.html
>>
>> $ clang -x cl -Dcl_clang_storage_class_specifiers -target 
>> amdgcn--amdhsa -mcpu=carrizo -B -Xclang -mlink-bitcode-file -Xclang 
>> clc/amdgcn--amdhsa.bc -include clc/clc.h -o cos.co cos.cl
>>
>> but,
>>
>> clang-5.0: error: unknown argument: '-mlink-bitcode-file'
>>
>> Allow linking multiple bitcode files https://reviews.llvm.org/D13913 
>> so buildbot and other developers had already checked the 
>> CC1Options.td for 'mlink-bitcode-file' argument, but how to use it 
>> correctly, please give me some hint, thanks a lot!
> $ clang -x cl -Dcl_clang_storage_class_specifiers -target 
> amdgcn--amdhsa -mcpu=carrizo -Xclang -mlink-bitcode-file 
> /usr/lib/clc/amdgcn--amdhsa.bc -include clc/clc.h -o cos.co cos.cl
>
> error: error reading '/tmp/amdgcn--amdhsa-ab2e8b.o'
> error: unable to open output file '': 'Permission denied'
> 2 errors generated.
>
> it failed to generate amdgcn--amdhsa-ab2e8b.o for 
> /usr/lib/clc/amdgcn--amdhsa.bc?
>
>
>>
>>
>> PS: it works as Ricardo mentioned!
>>
>> $ clang -Dcl_clang_storage_class_specifiers -isystem /usr/include 
>> -include clc/clc.h -target amdgcn--amdhsa -mcpu=carrizo -S -emit-llvm 
>> -xcl -o cos.ll cos.cl
>> $ llvm-link cos.ll /usr/lib/clc/amdgcn--amdhsa.bc -o cos.linked.bc
>> $ clang -target amdgcn--amdhsa -mcpu=carrizo cos.linked.bc -S -o 
>> cos.amdhsa.s
> works!
>
> $ clang -x assembler -target amdgcn--amdhsa -mcpu=carrizo -c -o cos.o 
> cos.amdhsa.s
> $ clang -target amdgcn--amdhsa cos.o -o cos.co
> $ llvm-readobj -sections -symbols -program-headers cos.co
I will test clCreateProgramWithBinary(... "cos.co" ...)!


>
>
> File: cos.co
> Format: ELF64-amdgpu-hsacobj
> Arch: amdgcn
> AddressSize: 64bit
> LoadName:
> Sections [
>   Section {
>     Index: 0
>     Name:  (0)
>     Type: SHT_NULL (0x0)
>     Flags [ (0x0)
>     ]
>     Address: 0x0
>     Offset: 0x0
>     Size: 0
>     Link: 0
>     Info: 0
>     AddressAlignment: 0
>     EntrySize: 0
>   }
>   Section {
>     Index: 1
>     Name: .note (1)
>     Type: SHT_NOTE (0x7)
>     Flags [ (0x2)
>       SHF_ALLOC (0x2)
>     ]
>     Address: 0x200
>     Offset: 0x200
>     Size: 1136
>     Link: 0
>     Info: 0
>     AddressAlignment: 4
>     EntrySize: 0
>   }
>   Section {
>     Index: 2
>     Name: .dynsym (7)
>     Type: SHT_DYNSYM (0xB)
>     Flags [ (0x2)
>       SHF_ALLOC (0x2)
>     ]
>     Address: 0x670
>     Offset: 0x670
>     Size: 48
>     Link: 4
>     Info: 1
>     AddressAlignment: 8
>     EntrySize: 24
>   }
>   Section {
>     Index: 3
>     Name: .hash (15)
>     Type: SHT_HASH (0x5)
>     Flags [ (0x2)
>       SHF_ALLOC (0x2)
>     ]
>     Address: 0x6A0
>     Offset: 0x6A0
>     Size: 24
>     Link: 2
>     Info: 0
>     AddressAlignment: 4
>     EntrySize: 4
>   }
>   Section {
>     Index: 4
>     Name: .dynstr (21)
>     Type: SHT_STRTAB (0x3)
>     Flags [ (0x2)
>       SHF_ALLOC (0x2)
>     ]
>     Address: 0x6B8
>     Offset: 0x6B8
>     Size: 5
>     Link: 0
>     Info: 0
>     AddressAlignment: 1
>     EntrySize: 0
>   }
>   Section {
>     Index: 5
>     Name: .text (29)
>     Type: SHT_PROGBITS (0x1)
>     Flags [ (0x6)
>       SHF_ALLOC (0x2)
>       SHF_EXECINSTR (0x4)
>     ]
>     Address: 0x1000
>     Offset: 0x1000
>     Size: 8132
>     Link: 0
>     Info: 0
>     AddressAlignment: 256
>     EntrySize: 0
>   }
>   Section {
>     Index: 6
>     Name: .dynamic (35)
>     Type: SHT_DYNAMIC (0x6)
>     Flags [ (0x3)
>       SHF_ALLOC (0x2)
>       SHF_WRITE (0x1)
>     ]
>     Address: 0x3000
>     Offset: 0x3000
>     Size: 96
>     Link: 4
>     Info: 0
>     AddressAlignment: 8
>     EntrySize: 16
>   }
>   Section {
>     Index: 7
>     Name: .AMDGPU.csdata (44)
>     Type: SHT_PROGBITS (0x1)
>     Flags [ (0x0)
>     ]
>     Address: 0x0
>     Offset: 0x3060
>     Size: 0
>     Link: 0
>     Info: 0
>     AddressAlignment: 1
>     EntrySize: 0
>   }
>   Section {
>     Index: 8
>     Name: .comment (59)
>     Type: SHT_PROGBITS (0x1)
>     Flags [ (0x30)
>       SHF_MERGE (0x10)
>       SHF_STRINGS (0x20)
>     ]
>     Address: 0x0
>     Offset: 0x3060
>     Size: 186
>     Link: 0
>     Info: 0
>     AddressAlignment: 1
>     EntrySize: 0
>   }
>   Section {
>     Index: 9
>     Name: .symtab (68)
>     Type: SHT_SYMTAB (0x2)
>     Flags [ (0x0)
>     ]
>     Address: 0x0
>     Offset: 0x3120
>     Size: 264
>     Link: 11
>     Info: 10
>     AddressAlignment: 8
>     EntrySize: 24
>   }
>   Section {
>     Index: 10
>     Name: .shstrtab (76)
>     Type: SHT_STRTAB (0x3)
>     Flags [ (0x0)
>     ]
>     Address: 0x0
>     Offset: 0x3228
>     Size: 94
>     Link: 0
>     Info: 0
>     AddressAlignment: 1
>     EntrySize: 0
>   }
>   Section {
>     Index: 11
>     Name: .strtab (86)
>     Type: SHT_STRTAB (0x3)
>     Flags [ (0x0)
>     ]
>     Address: 0x0
>     Offset: 0x3286
>     Size: 64
>     Link: 0
>     Info: 0
>     AddressAlignment: 1
>     EntrySize: 0
>   }
> ]
> Symbols [
>   Symbol {
>     Name:  (0)
>     Value: 0x0
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: Undefined (0x0)
>   }
>   Symbol {
>     Name: BB0_11 (1)
>     Value: 0x25D0
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_12 (8)
>     Value: 0x2A74
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_2 (15)
>     Value: 0x12E8
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_3 (21)
>     Value: 0x177C
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_5 (27)
>     Value: 0x191C
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_6 (33)
>     Value: 0x1DC0
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_8 (39)
>     Value: 0x1F94
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: BB0_9 (45)
>     Value: 0x2430
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other: 0
>     Section: .text (0x5)
>   }
>   Symbol {
>     Name: _DYNAMIC (55)
>     Value: 0x3000
>     Size: 0
>     Binding: Local (0x0)
>     Type: None (0x0)
>     Other [ (0x2)
>       STV_HIDDEN (0x2)
>     ]
>     Section: .dynamic (0x6)
>   }
>   Symbol {
>     Name: foo (51)
>     Value: 0x1000
>     Size: 8132
>     Binding: Global (0x1)
>     Type: AMDGPU_HSA_KERNEL (0xA)
>     Other: 0
>     Section: .text (0x5)
>   }
> ]
> ProgramHeaders [
>   ProgramHeader {
>     Type: PT_PHDR (0x6)
>     Offset: 0x40
>     VirtualAddress: 0x40
>     PhysicalAddress: 0x40
>     FileSize: 448
>     MemSize: 448
>     Flags [ (0x4)
>       PF_R (0x4)
>     ]
>     Alignment: 8
>   }
>   ProgramHeader {
>     Type: PT_LOAD (0x1)
>     Offset: 0x0
>     VirtualAddress: 0x0
>     PhysicalAddress: 0x0
>     FileSize: 1725
>     MemSize: 1725
>     Flags [ (0x4)
>       PF_R (0x4)
>     ]
>     Alignment: 4096
>   }
>   ProgramHeader {
>     Type: PT_LOAD (0x1)
>     Offset: 0x1000
>     VirtualAddress: 0x1000
>     PhysicalAddress: 0x1000
>     FileSize: 8132
>     MemSize: 8132
>     Flags [ (0x5)
>       PF_R (0x4)
>       PF_X (0x1)
>     ]
>     Alignment: 4096
>   }
>   ProgramHeader {
>     Type: PT_LOAD (0x1)
>     Offset: 0x3000
>     VirtualAddress: 0x3000
>     PhysicalAddress: 0x3000
>     FileSize: 96
>     MemSize: 96
>     Flags [ (0x6)
>       PF_R (0x4)
>       PF_W (0x2)
>     ]
>     Alignment: 4096
>   }
>   ProgramHeader {
>     Type: PT_DYNAMIC (0x2)
>     Offset: 0x3000
>     VirtualAddress: 0x3000
>     PhysicalAddress: 0x3000
>     FileSize: 96
>     MemSize: 96
>     Flags [ (0x6)
>       PF_R (0x4)
>       PF_W (0x2)
>     ]
>     Alignment: 8
>   }
>   ProgramHeader {
>     Type: PT_GNU_RELRO (0x6474E552)
>     Offset: 0x3000
>     VirtualAddress: 0x3000
>     PhysicalAddress: 0x3000
>     FileSize: 96
>     MemSize: 4096
>     Flags [ (0x4)
>       PF_R (0x4)
>     ]
>     Alignment: 1
>   }
>   ProgramHeader {
>     Type: PT_GNU_STACK (0x6474E551)
>     Offset: 0x0
>     VirtualAddress: 0x0
>     PhysicalAddress: 0x0
>     FileSize: 0
>     MemSize: 0
>     Flags [ (0x6)
>       PF_R (0x4)
>       PF_W (0x2)
>     ]
>     Alignment: 0
>   }
>   ProgramHeader {
>     Type: PT_NOTE (0x4)
>     Offset: 0x200
>     VirtualAddress: 0x200
>     PhysicalAddress: 0x200
>     FileSize: 1136
>     MemSize: 1136
>     Flags [ (0x4)
>       PF_R (0x4)
>     ]
>     Alignment: 4
>   }
> ]
>
>
>>
>>
>> $ clang --version
>> iSoft clang version 5.0.0 (trunk 305877) (based on LLVM 5.0.0svn)
>> Target: x86_64-isoft-linux
>> Thread model: posix
>> InstalledDir: /bin
>>
>>   Registered Targets:
>>     ...
>>     amdgcn     - AMD GCN GPUs
>>     ...
>>     avr        - Atmel AVR Microcontroller
>>     ...
>>     mips64el   - Mips64el [experimental]
>>     ...
>>     nvptx64    - NVIDIA PTX 64-bit
>>     ...
>>     r600       - AMD GPUs HD2XXX-HD6XXX
>>     ...
>>
>

-- 
Regards,
Leslie Zhai https://reviews.llvm.org/p/xiangzhai/





More information about the llvm-dev mailing list