[Libclc-dev] [PATCH 1/3] Introduce per device defines
Tom Stellard
tom at stellard.net
Tue Apr 14 17:35:13 PDT 2015
On Tue, Apr 14, 2015 at 06:28:02PM -0400, Jan Vesely wrote:
> On Mon, 2015-04-13 at 08:23 -0700, Tom Stellard wrote:
> > On Sun, Apr 12, 2015 at 06:46:52PM -0400, Jan Vesely wrote:
> > > Make cl_khr_fp64 define per-device.
> > > This patch does not change the generated Makefile
> > >
> > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > ---
> > > I've tried to find a way to make libclc more device specific.
> > > This series handles fp64, but I plan to extend it to other defines like
> > > __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
> > >
> > > The alternatives I could think of were to get the information from clang,
> > > but I don't think it should provide that kind of low-level information,
> > > or to add r600 support to librt.
> > >
> >
> > Clang is already defining cl_khr_fp64 for SI+ devices. I think these other
> > defines belong in clang too. Clang's X86 front-end defines a __FMA__ macro,
> > plus macros for other instructions, so I think we should follow the same
> > convention for r600/si.
>
> I found the clang target definitions. It does not really solve the
> problem of having the information in two different places, but I guess
> clang is a better place than libclc. I've prepared a patch to get it
> working on all asics.
> Do we want this solution for NVPTX/until 3.7 is released?
> so the process would be to add a desired define to libclc/configure, and
> clang, and remove it from configure when the respective clang version
> gets released?
>
> jan
>
I would leave NVPTX as is and let people who work on it make the
decision about what to do.
In general, I think we can remove defines from libclc ToT that exist
in clang ToT.
-Tom
> >
> > > Jan
> > >
> > > configure.py | 32 +++++++++++++++++++++-----------
> > > 1 file changed, 21 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/configure.py b/configure.py
> > > index 7d4b537..575989a 100755
> > > --- a/configure.py
> > > +++ b/configure.py
> > > @@ -88,16 +88,25 @@ if not cxx_compiler:
> > >
> > > available_targets = {
> > > 'r600--' : { 'devices' :
> > > - [{'gpu' : 'cedar', 'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
> > > - {'gpu' : 'cypress', 'aliases' : ['hemlock']},
> > > - {'gpu' : 'barts', 'aliases' : ['turks', 'caicos']},
> > > - {'gpu' : 'cayman', 'aliases' : ['aruba']}]},
> > > + [{'gpu' : 'cedar', 'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper'],
> > > + 'defines' : ['cl_khr_fp64']},
> > > + {'gpu' : 'cypress', 'aliases' : ['hemlock'],
> > > + 'defines' : ['cl_khr_fp64']},
> > > + {'gpu' : 'barts', 'aliases' : ['turks', 'caicos'],
> > > + 'defines' : ['cl_khr_fp64']},
> > > + {'gpu' : 'cayman', 'aliases' : ['aruba'],
> > > + 'defines' : ['cl_khr_fp64']}]},
> > > 'amdgcn--': { 'devices' :
> > > - [{'gpu' : 'tahiti', 'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins']}]},
> > > - 'nvptx--' : { 'devices' : [{'gpu' : '', 'aliases' : []}]},
> > > - 'nvptx64--' : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> > > - 'nvptx--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> > > - 'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : []}] }
> > > + [{'gpu' : 'tahiti', 'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins'],
> > > + 'defines' : ['cl_khr_fp64']}]},
> > > + 'nvptx--' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > + 'defines' : ['cl_khr_fp64']}]},
> > > + 'nvptx64--' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > + 'defines' : ['cl_khr_fp64']}]},
> > > + 'nvptx--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > + 'defines' : ['cl_khr_fp64']}]},
> > > + 'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > + 'defines' : ['cl_khr_fp64']}]}
> > > }
> > >
> > > default_targets = ['nvptx--nvidiacl', 'nvptx64--nvidiacl', 'r600--', 'amdgcn--']
> > > @@ -175,13 +184,14 @@ for target in targets:
> > >
> > > for device in available_targets[target]['devices']:
> > > # The rule for building a .bc file for the specified architecture using clang.
> > > + device_defines = ' '.join(["-D%s" % define for define in device['defines']])
> > > clang_bc_flags = "-target %s -I`dirname $in` %s " \
> > > "-fno-builtin " \
> > > "-Dcl_clang_storage_class_specifiers " \
> > > - "-Dcl_khr_fp64 " \
> > > + "%s " \
> > > "-Dcles_khr_int64 " \
> > > "-D__CLC_INTERNAL " \
> > > - "-emit-llvm" % (target, clang_cl_includes)
> > > + "-emit-llvm" % (target, clang_cl_includes, device_defines)
> > > if device['gpu'] != '':
> > > clang_bc_flags += ' -mcpu=' + device['gpu']
> > > clang_bc_rule = "CLANG_CL_BC_" + target + "_" + device['gpu']
> > > --
> > > 2.1.0
> > >
> > >
> > > _______________________________________________
> > > Libclc-dev mailing list
> > > Libclc-dev at pcc.me.uk
> > > http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
>
> --
> Jan Vesely <jan.vesely at rutgers.edu>
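
The per-device define mechanism in the quoted patch can be sketched on its own, outside configure.py. This is a minimal illustration, not the actual libclc build code: the trimmed-down table, the helper name build_clang_bc_flags, and the fallback to an empty list when a device has no 'defines' key are all assumptions made for the example.

```python
# Hypothetical, trimmed-down version of the table from the patch.
# Each device entry carries its own list of preprocessor defines.
available_targets = {
    'r600--': {'devices': [
        {'gpu': 'cedar', 'aliases': ['palm', 'sumo'],
         'defines': ['cl_khr_fp64']},
        {'gpu': 'cypress', 'aliases': ['hemlock'],
         'defines': ['cl_khr_fp64']},
    ]},
    'nvptx--': {'devices': [
        {'gpu': '', 'aliases': [], 'defines': ['cl_khr_fp64']},
    ]},
}


def build_clang_bc_flags(target, device, includes=''):
    """Assemble a clang flag string for one (target, device) pair.

    Hypothetical helper: it mirrors the idea of the patch (turn the
    device's 'defines' list into -D flags) but is not the real code.
    """
    # Assumption: a device without a 'defines' key gets no extra -D flags.
    device_defines = ' '.join('-D%s' % d for d in device.get('defines', []))
    parts = [
        '-target %s' % target,
        includes,
        '-fno-builtin',
        device_defines,
        '-D__CLC_INTERNAL',
        '-emit-llvm',
    ]
    # As in the patch, -mcpu is only added when the device names a GPU.
    if device['gpu'] != '':
        parts.append('-mcpu=' + device['gpu'])
    # Skip empty pieces so the result has no doubled spaces.
    return ' '.join(p for p in parts if p)


if __name__ == '__main__':
    for target, info in sorted(available_targets.items()):
        for device in info['devices']:
            print(target, device['gpu'] or '(generic)', '->',
                  build_clang_bc_flags(target, device))
```

With this layout, dropping a define that clang already provides for a given device means removing it from that device's 'defines' list, without touching the flag-building code.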