[LLVMdev] [AMDGPU][PATCH 2/3] Stubs implementation of the new intrinsics on Evergreen

Tom Stellard tom at stellard.net
Fri Aug 22 12:05:36 PDT 2014


On Fri, Aug 22, 2014 at 11:00:41AM -0700, Matt Arsenault wrote:
> On Friday, August 22, 2014, Tom Stellard <tom at stellard.net> wrote:
> 
> > On Fri, Aug 22, 2014 at 02:33:43AM +0200, Hilloulin Damien wrote:
> > >  This patch is a first implementation of the newly added intrinsics
> > >  for barriers/memory fences on Evergreen. For ultra-simplicity, every
> > >  intrinsic is lowered to a barrier with no fence. But for real fences, we
> > >  would need to place WAIT_ACK instructions and transform surrounding
> > >  read/write instructions to make ACK.
> > >
> >
> > As I mentioned in the libclc review, the first step should be
> > to implement generic versions of memfence which just calls barrier.
> > That way we won't need to make any changes in the backend.
> 
> It is not correct to use barrier at all to implement mem_fence. barrier
> requires all work items to reach the barrier, and mem_fence does not.
> s_barrier also does not imply completed memory operations
> 
> If (get_local_id(0) == 0)
> {
>     gptr = 1;
>     mem_fence(CLK_GLOBAL_MEM_FENCE);
> }
> 
> Is a valid use of a fence but not a barrier
> 

Matt is right.  I take back what I said earlier about adding generic
implementations.  This also means that the stubs are incorrect, so you'll
need to provide real implementations, and it should be one patch per
implemented function.

-Tom

> >
> > Once the generic implementations have been committed, we can go through
> > and optimize each function one at a time.  This will make for a cleaner
> > set of changes.
> >
> > -Tom
> >
> > > Signed-off-by: Damien Hilloulin <damien.hilloulin at supelec.fr
> > <javascript:;>>
> > > ---
> > >  lib/Target/R600/EvergreenInstructions.td | 69
> > > +++++++++++++++++++++++++++++++-
> > >  1 file changed, 68 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/Target/R600/EvergreenInstructions.td
> > > b/lib/Target/R600/EvergreenInstructions.td
> > > index a83567a..c8f90ce 100644
> > > --- a/lib/Target/R600/EvergreenInstructions.td
> > > +++ b/lib/Target/R600/EvergreenInstructions.td
> > > @@ -358,8 +358,12 @@ def FLT_TO_UINT_eg : FLT_TO_UINT_Common<0x9A> {
> > >
> > >  def UINT_TO_FLT_eg : UINT_TO_FLT_Common<0x9C>;
> > >
> > >
> > +//===----------------------------------------------------------------------===//
> > > +// SYnchronization instructions
> > >
> > +//===----------------------------------------------------------------------===//
> > > +
> > >  def GROUP_BARRIER : InstR600 <
> > > -    (outs), (ins), "  GROUP_BARRIER", [(int_AMDGPU_barrier_local),
> > > (int_AMDGPU_barrier_global)], AnyALU>,
> > > +    (outs), (ins), "  GROUP_BARRIER", [], AnyALU>,
> > >      R600ALU_Word0,
> > >      R600ALU_Word1_OP2 <0x54> {
> > >
> > > @@ -389,10 +393,73 @@ def GROUP_BARRIER : InstR600 <
> > >  }
> > >
> > >  def : Pat <
> > > +    (int_AMDGPU_barrier_nofence),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +// XXX: the following patterns in the section are stubs.
> > > +// We should take care of inserting WAIT_ACK and modifying the
> > > +// read/writes instructions before the barrier and in the loop.
> > > +def : Pat <
> > > +    (int_AMDGPU_barrier_local),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > >      (int_AMDGPU_barrier_global),
> > >      (GROUP_BARRIER)
> > >  >;
> > >
> > > +def : Pat <
> > > +    (int_AMDGPU_barrier_localglobal),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_mem_fence_local),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_mem_fence_global),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_mem_fence_localglobal),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_read_mem_fence_local),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_read_mem_fence_global),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_read_mem_fence_localglobal),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_write_mem_fence_local),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_write_mem_fence_global),
> > > +    (GROUP_BARRIER)
> > > +>;
> > > +
> > > +def : Pat <
> > > +    (int_AMDGPU_write_mem_fence_localglobal),
> > > +    (GROUP_BARRIER)
> > > +>;
> > >
> > //===----------------------------------------------------------------------===//
> > >  // LDS Instructions
> > >
> > //===----------------------------------------------------------------------===//
> > > --
> > > 1.9.1
> > >
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at cs.uiuc.edu <javascript:;>
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu <javascript:;>
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >



More information about the llvm-commits mailing list