[Libclc-dev] Questions about atomics
awatry at gmail.com
Sun Sep 8 07:35:55 PDT 2013
I believe that there are a few more. Usually it's things that
explicitly require hardware knowledge/support that can't be
implemented in generic terms.
This will probably include anything that depends on intrinsics for a
specific hardware back-end or that deals explicitly with address
spaces. I believe we've also taken into account that we have to
handle differing pointer sizes (I believe the Radeon SI hardware uses
different pointer sizes for different address spaces, or maybe it's
just that R600 and SI have different pointer sizes). Tom
can probably clarify anything here that I've gotten wrong, since he's
much more knowledgeable about the hardware specifics than I am.
Anyway, functions that have a prototype in the generic part but no
generic implementation:
- get_global/local_* (depends on hardware intrinsics for at least R600)
- write_mem_fence* (not yet implemented at all, but it will probably be similar)
- all of the atomic_* functions (which work with either global or
local address spaces)
- Probably others in the future
On Sat, Sep 7, 2013 at 1:10 PM, Jeroen Ketema <j.ketema at imperial.ac.uk> wrote:
> Hi Aaron,
> Thanks for your answer; it helps. I was wondering what was going on, because having just a prototype of atomic_add in the generic part without an implementation was causing some problems for us.
> This makes me wonder: Do you happen to know if there are any other prototypes in the generic part that do not have an implementation in the generic part?
> On 7 sep. 2013, at 18:05, Aaron Watry <awatry at gmail.com> wrote:
>> Hi Jeroen,
>> I actually just committed atomic_sub and atomic_dec support yesterday :)
>> The libclc support was held up by missing
>> support for the relevant instructions in the llvm R600 back-end. I
>> finally committed that support to llvm yesterday, and then immediately
>> after, I pushed the libclc support.
>> With regard to the _addr* naming: the R600 back-end uses address
>> space 1 as global, 2 is constant, 3 is local, and private is either 0
>> or 4 (I obviously haven't used that one much).
>> By defining the generic functions in terms of address space 1/2/3/4,
>> we just have to write the assembly functions once, and then we just
>> have to map which named address space is which ID on various hardware
>> back-ends. Hence the split for the implementations in generic/ and
>> the mappings defined in r600/. Theoretically, if we wanted to then
>> add Nvidia/Intel/X86 support for a given function, we'd just have to
>> map the correct implementation to the numbered address space needed.
>> Hope that helps,
>> On Fri, Sep 6, 2013 at 8:54 PM, Jeroen Ketema <j.ketema at imperial.ac.uk> wrote:
>>> I was looking at the code for atomics that was committed in the last two days and I'm wondering about two things. First, is there a reason that atomic_sub is not introduced in the r600 code (atomic_add is there)? Second, why is atomic_add only defined as the generic @__clc_atomic_add_addr… in the r600 code (since @__clc_atomic_add_addr… occurs in the generic code I expect the definition of atomic_add to be there too)?
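
The split Aaron describes (implementations named by numbered address space, plus a per-target table saying which named space has which number) can be sketched in plain C. This is only an illustration, not the libclc source: every `sketch_`/`SKETCH_` name is hypothetical, plain C has no address spaces so the number appears only in the function name, and C11 `atomic_fetch_add` stands in for the real back-end instructions. The IDs 1 (global) and 3 (local) are the R600 values quoted in the thread.

```c
#include <assert.h>
#include <stdatomic.h>

/* "generic/" side (hypothetical names): one implementation per numbered
 * address space. Like OpenCL's atomic_add, each returns the old value. */
int sketch_atomic_add_addr1(atomic_int *p, int v) { return atomic_fetch_add(p, v); }
int sketch_atomic_add_addr3(atomic_int *p, int v) { return atomic_fetch_add(p, v); }

/* "r600/" side (hypothetical names): which numbered space each named
 * space maps to. The values are the ones quoted in the thread for R600. */
#define SKETCH_GLOBAL_AS 1
#define SKETCH_LOCAL_AS  3

/* Two-level paste so the ID macro is expanded before concatenation. */
#define SKETCH_PASTE2(a, b) a##b
#define SKETCH_PASTE(a, b)  SKETCH_PASTE2(a, b)

/* The named entry points resolve to the numbered implementations. */
#define sketch_atomic_add_global SKETCH_PASTE(sketch_atomic_add_addr, SKETCH_GLOBAL_AS)
#define sketch_atomic_add_local  SKETCH_PASTE(sketch_atomic_add_addr, SKETCH_LOCAL_AS)

/* Small usage example: two adds through the named entry points. */
int sketch_demo(void) {
    atomic_int counter = 10;
    int old = sketch_atomic_add_global(&counter, 5); /* calls ..._addr1 */
    assert(old == 10);
    old = sketch_atomic_add_local(&counter, 2);      /* calls ..._addr3 */
    assert(old == 15);
    return atomic_load(&counter);
}
```

Under this sketch, supporting another back-end would only mean supplying a different pair of `SKETCH_*_AS` values; the numbered implementations stay untouched.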