[Libclc-dev] [PATCH 1/3] Implement wait_group_events builtin v2

Jeroen Ketema j.ketema at imperial.ac.uk
Mon Sep 29 08:56:04 PDT 2014


On 25 Sep 2014, at 16:02, Tom Stellard <tom at stellard.net> wrote:

> On Wed, Sep 24, 2014 at 11:00:27AM +0100, Jeroen Ketema wrote:
>> 
>> This looks good, but I think it suffices to use a barrier with an empty set of flags, as the function only requires the copy to have completed, not that every memory operation arising from the copy has been committed, as I wrote before.
>> 
> 
> I don't quite understand the difference here, can you give me an
> example?

See the top of this page:

 https://www.khronos.org/message_boards/showthread.php/5875-Profiling-Code/page2

Thinking a bit about how this might affect the assembly generated for SI, I must admit I’m not quite sure. At first I thought this implied that I could write something like the following in pseudo-SI assembly when copying from local to global memory (for brevity I’m omitting any loop that might be needed):

  ds_read
  s_waitcnt lgkmcnt(0)
  buffer_store
  s_barrier

but you might argue that a

  s_waitcnt vmcnt(0)

is needed before the s_barrier, because only then is the copy complete (in the sense that the VGPRs it used can be reused for something else). In any case, calling barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE) is definitely the safe thing to do.
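
For context, here is a minimal OpenCL C sketch of the usage pattern that wait_group_events has to support: a work-group copies a tile into local memory, waits, modifies it, and copies it back out. The kernel name scale_tile and its parameters are made up for illustration and are not part of the patch. The sketch assumes that tile has room for get_local_size(0) floats and that the global size is a multiple of the local size.

```opencl
/* Hypothetical kernel, not part of the patch. Assumes `tile` holds at least
 * get_local_size(0) floats and the global size is a multiple of the local size. */
__kernel void scale_tile(__global const float *in,
                         __global float *out,
                         __local float *tile)
{
    size_t n = get_local_size(0);
    size_t base = get_group_id(0) * n;

    /* Global -> local copy, issued by every work-item with the same arguments. */
    event_t ev = async_work_group_copy(tile, in + base, n, 0);
    /* Every work-item must reach this call; afterwards `tile` can be read. */
    wait_group_events(1, &ev);

    tile[get_local_id(0)] *= 2.0f;
    /* All work-items must finish updating `tile` before it is copied out. */
    barrier(CLK_LOCAL_MEM_FENCE);

    /* Local -> global copy, the direction of the pseudo-assembly above. */
    ev = async_work_group_copy(out + base, tile, n, 0);
    wait_group_events(1, &ev);
}
```

With the default implementation from the patch, each wait_group_events call here is a barrier with both fence flags set. Whether a weaker barrier would be enough for the second call is the question I am unsure about.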

Jeroen

> 
>> The above assumes that a barrier call with an empty set of flags is not removed completely, which I recall is what currently happens for R600. I think there was a patch for that at some point; not sure what happened to that patch though.
>> 
>> Jeroen
>> 
>> On 24 Sep 2014, at 01:42, Tom Stellard <thomas.stellard at amd.com> wrote:
>> 
>>> This is a simple default implementation which just calls barrier().
>>> 
>>> v2:
>>> - Only call barrier() once.
>>> ---
>>> generic/include/clc/async/wait_group_events.h | 1 +
>>> generic/include/clc/clc.h                     | 1 +
>>> generic/lib/SOURCES                           | 1 +
>>> generic/lib/async/wait_group_events.cl        | 5 +++++
>>> 4 files changed, 8 insertions(+)
>>> create mode 100644 generic/include/clc/async/wait_group_events.h
>>> create mode 100644 generic/lib/async/wait_group_events.cl
>>> 
>>> diff --git a/generic/include/clc/async/wait_group_events.h b/generic/include/clc/async/wait_group_events.h
>>> new file mode 100644
>>> index 0000000..799efa0
>>> --- /dev/null
>>> +++ b/generic/include/clc/async/wait_group_events.h
>>> @@ -0,0 +1 @@
>>> +void wait_group_events(int num_events, event_t *event_list);
>>> diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
>>> index b8c1cb9..0dccf53 100644
>>> --- a/generic/include/clc/clc.h
>>> +++ b/generic/include/clc/clc.h
>>> @@ -138,6 +138,7 @@
>>> 
>>> /* 6.11.10 Async Copy and Prefetch Functions */
>>> #include <clc/async/prefetch.h>
>>> +#include <clc/async/wait_group_events.h>
>>> 
>>> /* 6.11.11 Atomic Functions */
>>> #include <clc/atomic/atomic_add.h>
>>> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
>>> index e4ba1d1..cefef94 100644
>>> --- a/generic/lib/SOURCES
>>> +++ b/generic/lib/SOURCES
>>> @@ -1,4 +1,5 @@
>>> async/prefetch.cl
>>> +async/wait_group_events.cl
>>> atomic/atomic_impl.ll
>>> cl_khr_global_int32_base_atomics/atom_add.cl
>>> cl_khr_global_int32_base_atomics/atom_dec.cl
>>> diff --git a/generic/lib/async/wait_group_events.cl b/generic/lib/async/wait_group_events.cl
>>> new file mode 100644
>>> index 0000000..05c9d58
>>> --- /dev/null
>>> +++ b/generic/lib/async/wait_group_events.cl
>>> @@ -0,0 +1,5 @@
>>> +#include <clc/clc.h>
>>> +
>>> +_CLC_DEF void wait_group_events(int num_events, event_t *event_list) {
>>> +  barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
>>> +}
>>> -- 
>>> 1.8.5.5
>>> 
>>> 
>>> _______________________________________________
>>> Libclc-dev mailing list
>>> Libclc-dev at pcc.me.uk
>>> http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev




