[Libclc-dev] [PATCH 1/3] Introduce per device defines

Jan Vesely jan.vesely at rutgers.edu
Fri May 22 13:42:27 PDT 2015


On Fri, 2015-05-22 at 21:20 +0100, Jeroen Ketema wrote:
> > On 21 May 2015, at 16:03, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > 
> > On Tue, 2015-04-14 at 17:35 -0700, Tom Stellard wrote:
> >> On Tue, Apr 14, 2015 at 06:28:02PM -0400, Jan Vesely wrote:
> >>> On Mon, 2015-04-13 at 08:23 -0700, Tom Stellard wrote:
> >>>> On Sun, Apr 12, 2015 at 06:46:52PM -0400, Jan Vesely wrote:
> >>>>> Make cl_khr_fp64 define per-device.
> >>>>> This patch does not change the generated Makefile
> >>>>> 
> >>>>> Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> >>>>> ---
> >>>>> I've tried to find a way to make libclc more device specific.
> >>>>> This series handles fp64, but I plan to extend it to other defines like
> >>>>> __CLC_HAVE_FMA, or (with Tom's recent patches) __CLC_HAVE_LDEXP.
> >>>>> 
> >>>>> Alternatives I could think of were to try to get the information from clang,
> >>>>> but I don't think it should provide that kind of low level information.
> >>>>> Or add r600 support to librt.
> >>>>> 
> >>>> 
> >>>> Clang is already defining cl_khr_fp64 for SI+ devices.  I think these other
> >>>> defines belong in clang too.  Clang's X86 front-end defines a __FMA__ macro,
> >>>> plus macros for other instructions, so I think we should follow the same
> >>>> convention for r600/si.
> >>> 
> >>> I found the clang target definitions. It does not really solve the
> >>> problem of having the information in two different places, but I guess
> >>> clang is a better place than libclc. I've prepared a patch to get it
> >>> working on all asics.
> >>> Do we want this solution for NVPTX/until 3.7 is released?
> >>> so the process would be to add a desired define to libclc/configure, and
> >>> clang, and remove it from configure when the respective clang version
> >>> gets released?
> >>> 
> >>> jan
> >>> 
> >> 
> >> I would leave NVPTX as is and let people who work on it make the
> >> decision about what to do.
> >> 
> >> In general, I think we can remove defines from libclc ToT that exist
> >> in clang ToT.
> > 
> > Should I go ahead with 1/3, and adapt 2/3 to remove cl_khr_fp64 for all
> > AMD targets, now that clang provides the define?
> > What about EdB's effort to maintain 3.6 compatibility?
> 
> To what extent do the recent patches depend on changes in clang/llvm
> (besides the cl_khr_fp64 change you mention above)? If there are quite
> a few of these, then it doesn’t seem worth keeping 3.6 compatibility, and
> I would be in favour of creating a release_36 branch for the last known
> good version for llvm 3.6; just as we did for llvm 3.5.

Right now only __HAS_LDEXPF__ is used, and only for r600.
If the define is not present (as with older llvm), the code falls back to
the __clc version of that function, so it should be safe (albeit slower) to
use llvm 3.6. My intention was to follow this pattern: provide only
positive-information defines.

cl_khr_fp64 is provided only by clang 3.7.
Changing patch 2/3 to remove -Dcl_khr_fp64 from all AMD targets would
remove the fp64 functions for AMD hw on llvm 3.6. I have no idea whether
fp64 works at all with 3.6 (I'd say it does not).

The posted versions of 1/3 and 2/3 are safe for llvm 3.6.
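
To illustrate, here is a rough sketch (not the actual configure.py) of how
the per-device 'defines' list from 1/3 ends up in the clang flags, and what
an entry without cl_khr_fp64 would produce. Only the join expression and the
flag string follow the quoted diff. The two device entries and the helper
functions device_define_flags() and clang_bc_flags() are hypothetical and
exist only for this example.

```python
# Hypothetical device entries, shaped like the ones in the quoted diff.
device_with_fp64 = {'gpu': 'tahiti', 'aliases': ['pitcairn'],
                    'defines': ['cl_khr_fp64']}
# Assumption: what an entry could look like if the define were dropped
# because clang already provides it.
device_without_fp64 = {'gpu': 'tahiti', 'aliases': ['pitcairn'],
                       'defines': []}


def device_define_flags(device):
    """Turn a device's 'defines' list into -D flags (same join as in 1/3)."""
    return ' '.join(["-D%s" % define for define in device['defines']])


def clang_bc_flags(target, device, clang_cl_includes=''):
    """Build the flag string the way the patched loop body does."""
    flags = "-target %s -I`dirname $in` %s " \
            "-fno-builtin " \
            "-Dcl_clang_storage_class_specifiers " \
            "%s " \
            "-Dcles_khr_int64 " \
            "-D__CLC_INTERNAL " \
            "-emit-llvm" % (target, clang_cl_includes,
                            device_define_flags(device))
    if device['gpu'] != '':
        flags += ' -mcpu=' + device['gpu']
    return flags


if __name__ == '__main__':
    for device in (device_with_fp64, device_without_fp64):
        print(clang_bc_flags('amdgcn--', device))
```

With an empty 'defines' list the "%s " slot simply expands to a blank, so
the fp64 define disappears from the command line and nothing else changes.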

jan


> 
> Jeroen
> 
> > 
> > jan
> > 
> >> 
> >> -Tom
> >> 
> >> 
> >> 
> >>>> 
> >>>>> Jan
> >>>>> 
> >>>>> configure.py | 32 +++++++++++++++++++++-----------
> >>>>> 1 file changed, 21 insertions(+), 11 deletions(-)
> >>>>> 
> >>>>> diff --git a/configure.py b/configure.py
> >>>>> index 7d4b537..575989a 100755
> >>>>> --- a/configure.py
> >>>>> +++ b/configure.py
> >>>>> @@ -88,16 +88,25 @@ if not cxx_compiler:
> >>>>> 
> >>>>> available_targets = {
> >>>>>   'r600--' : { 'devices' :
> >>>>> -               [{'gpu' : 'cedar',   'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
> >>>>> -                {'gpu' : 'cypress', 'aliases' : ['hemlock']},
> >>>>> -                {'gpu' : 'barts',   'aliases' : ['turks', 'caicos']},
> >>>>> -                {'gpu' : 'cayman',  'aliases' : ['aruba']}]},
> >>>>> +               [{'gpu' : 'cedar',   'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper'],
> >>>>> +                 'defines' : ['cl_khr_fp64']},
> >>>>> +                {'gpu' : 'cypress', 'aliases' : ['hemlock'],
> >>>>> +                 'defines' : ['cl_khr_fp64']},
> >>>>> +                {'gpu' : 'barts',   'aliases' : ['turks', 'caicos'],
> >>>>> +                 'defines' : ['cl_khr_fp64']},
> >>>>> +                {'gpu' : 'cayman',  'aliases' : ['aruba'],
> >>>>> +                 'defines' : ['cl_khr_fp64']}]},
> >>>>>   'amdgcn--': { 'devices' :
> >>>>> -                [{'gpu' : 'tahiti',  'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins']}]},
> >>>>> -  'nvptx--'   : { 'devices' : [{'gpu' : '', 'aliases' : []}]},
> >>>>> -  'nvptx64--'   : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> >>>>> -  'nvptx--nvidiacl'   : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> >>>>> -  'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : []}] }
> >>>>> +                [{'gpu' : 'tahiti', 'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins'],
> >>>>> +                  'defines' : ['cl_khr_fp64']}]},
> >>>>> +  'nvptx--'   : { 'devices' : [{'gpu' : '', 'aliases' : [],
> >>>>> +                                'defines' : ['cl_khr_fp64']}]},
> >>>>> +  'nvptx64--' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> >>>>> +                                'defines' : ['cl_khr_fp64']}]},
> >>>>> +  'nvptx--nvidiacl'   : { 'devices' : [{'gpu' : '', 'aliases' : [],
> >>>>> +                                        'defines' : ['cl_khr_fp64']}]},
> >>>>> +  'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> >>>>> +                                        'defines' : ['cl_khr_fp64']}]}
> >>>>> }
> >>>>> 
> >>>>> default_targets = ['nvptx--nvidiacl', 'nvptx64--nvidiacl', 'r600--', 'amdgcn--']
> >>>>> @@ -175,13 +184,14 @@ for target in targets:
> >>>>> 
> >>>>>   for device in available_targets[target]['devices']:
> >>>>>     # The rule for building a .bc file for the specified architecture using clang.
> >>>>> +    device_defines = ' '.join(["-D%s" % define for define in device['defines']])
> >>>>>     clang_bc_flags = "-target %s -I`dirname $in` %s " \
> >>>>>                      "-fno-builtin " \
> >>>>>                      "-Dcl_clang_storage_class_specifiers " \
> >>>>> -                     "-Dcl_khr_fp64 " \
> >>>>> +                     "%s " \
> >>>>>                      "-Dcles_khr_int64 " \
> >>>>>                      "-D__CLC_INTERNAL " \
> >>>>> -                     "-emit-llvm" % (target, clang_cl_includes)
> >>>>> +                     "-emit-llvm" % (target, clang_cl_includes, device_defines)
> >>>>>     if device['gpu'] != '':
> >>>>>       clang_bc_flags += ' -mcpu=' + device['gpu']
> >>>>>     clang_bc_rule = "CLANG_CL_BC_" + target + "_" + device['gpu']
> >>>>> -- 
> >>>>> 2.1.0
> >>>>> 
> >>>>> 
> >>>>> _______________________________________________
> >>>>> Libclc-dev mailing list
> >>>>> Libclc-dev at pcc.me.uk
> >>>>> http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
> >>> 
> >>> -- 
> >>> Jan Vesely <jan.vesely at rutgers.edu>
> >> 
> >> 
> > 
> > -- 
> > Jan Vesely <jan.vesely at rutgers.edu>
> 

-- 
Jan Vesely <jan.vesely at rutgers.edu>