[Libclc-dev] [PATCH 2/3] amdgcn-amdhsa: Add get_num_groups implementation

Jan Vesely via Libclc-dev libclc-dev at lists.llvm.org
Fri Sep 16 17:02:47 PDT 2016


On Fri, 2016-09-16 at 18:33 -0400, Tom Stellard wrote:
> On Wed, Sep 14, 2016 at 11:18:12AM -0400, Jan Vesely via Libclc-dev
> wrote:
> > 
> > On Wed, 2016-09-14 at 10:58 -0400, Matt Arsenault via Libclc-dev
> > wrote:
> > > 
> > > > 
> > > > 
> > > > On Sep 14, 2016, at 07:22, Tom Stellard via Libclc-dev <libclc-dev@lists.llvm.org> wrote:
> > > > 
> > > > +_CLC_DEF size_t get_num_groups(uint dim) {
> > > > +  size_t global_size = get_global_size(dim);
> > > > +  size_t local_size = get_local_size(dim);
> > > > +  size_t num_groups = global_size / local_size;
> > > > +  if (global_size % local_size != 0) {
> > > > +    num_groups++;
> > > > +  }
> > > > +  return num_groups;
> > > LGTM. I hope the % does get optimized out?
> > 
> > AMDGPU implements DIVREM for 64-bit integers, so / and % are computed
> > at the same time. It's still rather expensive (~400 instructions).
> > 
> > (global_size + local_size - 1) / local_size
> > 
> > allows elimination of the REM-only parts of DIVREM (although the
> > savings are negligible).
> > 
> 
> I would really prefer to have the runtime compute this, so I think we
> can replace this with something better in the future.

What's the benefit/reason? Even if the computation did not involve
division, it would need more than 3 registers/parameters to compute.
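
For reference, here is a rough sketch of the two forms discussed above,
written as plain C rather than OpenCL C so it can be compiled on the host.
This is not the libclc source: the function names are made up for
illustration, local_size is assumed to be non-zero, and the round-up form
additionally assumes that global_size + local_size - 1 does not overflow
size_t.

```c
#include <stddef.h>

/* Form used in the patch: divide, then add one group if a remainder is left. */
static size_t num_groups_divrem(size_t global_size, size_t local_size) {
  size_t num_groups = global_size / local_size;
  if (global_size % local_size != 0) {
    num_groups++;
  }
  return num_groups;
}

/* Round-up form: a single division, the remainder is never used.
 * Assumption: global_size + local_size - 1 does not overflow size_t. */
static size_t num_groups_roundup(size_t global_size, size_t local_size) {
  return (global_size + local_size - 1) / local_size;
}
```

Both forms give 16 for global_size = 1024, local_size = 64, and 2 for
global_size = 100, local_size = 64.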

Jan

> 
> -Tom
> 
> > 
> > Jan
> > 
> > > 
> > > _______________________________________________
> > > Libclc-dev mailing list
> > > Libclc-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > -- 
> > Jan Vesely <jv356 at scarletmail.rutgers.edu>