[Libclc-dev] [PATCH 1/3] Introduce per device defines
EdB
edb+libclc at sigluy.net
Thu May 28 01:39:20 PDT 2015
On Wednesday 27 May 2015 14:25:40 Tom Stellard wrote:
> On Wed, May 27, 2015 at 05:23:29PM -0400, Jan Vesely wrote:
> > On Wed, 2015-05-27 at 14:18 -0700, Tom Stellard wrote:
> > > On Wed, May 27, 2015 at 11:53:35AM -0400, Jan Vesely wrote:
> > > > Hi Tom, Jeroen,
> > > >
> > > > any objections to pushing 1/3?
> > > > it never got review
> > >
> > > Sorry for the delay. Why is this still needed if the defines are in
> > > clang?
> > > Is it for supporting older versions of llvm?
> >
> > yes. I planned to send v2 of 2/3 dropping all defines for amd hw, but
> > since then EdB's patch with llvm3.6 support got in, so I guessed that
> > ppl care about llvm 3.6.
>
> Is it possible to only add the defines when compiling with llvm 3.6?
Jan, thanks for caring about llvm 3.6.
However, when I made the patch to retain compatibility with older
versions, I didn't want it to become a burden.
So, if there is no easy and clean way to keep those defines for older versions,
maybe, as Jeroen suggested earlier, it's time for Tom to create a release_36
branch. Then, right after creating the branch, remove the file SOURCES_LLVM3.6
and the LLVM3.6 dir and only claim ToT compatibility.
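
For what it's worth, the kind of check Tom asks about could be sketched
roughly as below. This is only an illustration: parse_llvm_version and
device_define_flags are made-up names, not functions in configure.py, and
it assumes that clang newer than 3.6 already provides the define for the
device in question (which, per the discussion, is not settled for NVPTX).

```python
# Hypothetical sketch, not code from configure.py.

def parse_llvm_version(version_string):
    """Turn a string such as '3.6.1' or '3.7.0svn' into (major, minor)."""
    parts = version_string.split('.')
    return int(parts[0]), int(parts[1])

def device_define_flags(device, llvm_version):
    """Return the -D flags to pass to clang for one device entry.

    Assumption: for LLVM newer than 3.6, clang defines these macros itself,
    so they are only passed on the command line for 3.6 and older.
    """
    if llvm_version > (3, 6):
        return ''
    return ' '.join("-D%s" % define for define in device.get('defines', []))

# Example device entry in the shape used by available_targets.
device = {'gpu': 'tahiti', 'aliases': ['pitcairn'], 'defines': ['cl_khr_fp64']}

print(device_define_flags(device, parse_llvm_version('3.6.1')))
print(repr(device_define_flags(device, parse_llvm_version('3.7.0svn'))))
```

Even a small check like this has to be kept in sync with what each clang
release defines, which is the burden I would rather avoid.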
>
> -Tom
>
> > Jan
> >
> > > -Tom
> > >
> > > > thank you,
> > > > jan
> > > >
> > > > On Fri, 2015-05-22 at 21:20 +0100, Jeroen Ketema wrote:
> > > > > > On 21 May 2015, at 16:03, Jan Vesely <jan.vesely at rutgers.edu>
> > > > > > wrote:
> > > > > >
> > > > > > On Tue, 2015-04-14 at 17:35 -0700, Tom Stellard wrote:
> > > > > > > On Tue, Apr 14, 2015 at 06:28:02PM -0400, Jan Vesely wrote:
> > > > > > > > On Mon, 2015-04-13 at 08:23 -0700, Tom Stellard wrote:
> > > > > > > > > > On Sun, Apr 12, 2015 at 06:46:52PM -0400, Jan Vesely wrote:
> > > > > > > > > > Make cl_khr_fp64 define per-device.
> > > > > > > > > > This patch does not change the generated Makefile
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > > > > > > > > ---
> > > > > > > > > > I've tried to find a way to make libclc more device
> > > > > > > > > > specific.
> > > > > > > > > > This series handles fp64, but I plan to extend it to
> > > > > > > > > > other defines like
> > > > > > > > > > __CLC_HAVE_FMA, or (with Tom's recent patches)
> > > > > > > > > > __CLC_HAVE_LDEXP.
> > > > > > > > > >
> > > > > > > > > > > Alternatives I could think of were to try to get the
> > > > > > > > > > information from clang,
> > > > > > > > > > but I don't think it should provide that kind of low
> > > > > > > > > > level information.
> > > > > > > > > > Or add r600 support to librt.
> > > > > > > > >
> > > > > > > > > Clang is already defining cl_khr_fp64 for SI+ devices. I
> > > > > > > > > think these other
> > > > > > > > > > defines belong in clang too. Clang's X86 front-end
> > > > > > > > > defines a __FMA__ macro,
> > > > > > > > > plus macros for other instructions, so I think we should
> > > > > > > > > follow the same
> > > > > > > > > convention for r600/si.
> > > > > > > >
> > > > > > > > I found the clang target definitions. It does not really
> > > > > > > > solve the
> > > > > > > > problem of having the information in two different places,
> > > > > > > > but I guess
> > > > > > > > clang is a better place than libclc. I've prepared a patch
> > > > > > > > to get it
> > > > > > > > working on all asics.
> > > > > > > > Do we want this solution for NVPTX/until 3.7 is released?
> > > > > > > > so the process would be to add a desired define to
> > > > > > > > libclc/configure, and
> > > > > > > > clang, and remove it from configure when the respective
> > > > > > > > clang version
> > > > > > > > gets released?
> > > > > > > >
> > > > > > > > jan
> > > > > > >
> > > > > > > I would leave NVPTX as is and let people who work on it make
> > > > > > > the
> > > > > > > decision about what to do.
> > > > > > >
> > > > > > > In general, I think we can remove defines from libclc ToT
> > > > > > > > that exist
> > > > > > > in clang ToT.
> > > > > >
> > > > > > should I go ahead with 1/3, and adapt 2/3 to remove cl_khr_fp64
> > > > > > for all
> > > > > > amd targets, now that clang provides the define?
> > > > > > what about EdB's effort to maintain 3.6 compatibility?
> > > > >
> > > > > To what extent do the recent patches depend on changes in
> > > > > clang/llvm
> > > > > (besides the cl_khr_fp64 change you mention above)? If there are
> > > > > quite
> > > > > a few of these, then it doesn’t seem worth keeping 3.6
> > > > > compatibility, and
> > > > > I would be in favour of creating a release_36 branch for the last
> > > > > known
> > > > > good version for llvm 3.6; just as we did for llvm 3.5.
> > > > >
> > > > > Jeroen
> > > > >
> > > > > > jan
> > > > > >
> > > > > > > -Tom
> > > > > > >
> > > > > > > > > > Jan
> > > > > > > > > >
> > > > > > > > > > configure.py | 32 +++++++++++++++++++++-----------
> > > > > > > > > > 1 file changed, 21 insertions(+), 11 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/configure.py b/configure.py
> > > > > > > > > > index 7d4b537..575989a 100755
> > > > > > > > > > --- a/configure.py
> > > > > > > > > > +++ b/configure.py
> > > > > > > > > > @@ -88,16 +88,25 @@ if not cxx_compiler:
> > > > > > > > > >
> > > > > > > > > > >  available_targets = {
> > > > > > > > > > >    'r600--' : { 'devices' :
> > > > > > > > > > > -               [{'gpu' : 'cedar', 'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper']},
> > > > > > > > > > > -                {'gpu' : 'cypress', 'aliases' : ['hemlock']},
> > > > > > > > > > > -                {'gpu' : 'barts', 'aliases' : ['turks', 'caicos']},
> > > > > > > > > > > -                {'gpu' : 'cayman', 'aliases' : ['aruba']}]},
> > > > > > > > > > > +               [{'gpu' : 'cedar', 'aliases' : ['palm', 'sumo', 'sumo2', 'redwood', 'juniper'],
> > > > > > > > > > > +                 'defines' : ['cl_khr_fp64']},
> > > > > > > > > > > +                {'gpu' : 'cypress', 'aliases' : ['hemlock'],
> > > > > > > > > > > +                 'defines' : ['cl_khr_fp64']},
> > > > > > > > > > > +                {'gpu' : 'barts', 'aliases' : ['turks', 'caicos'],
> > > > > > > > > > > +                 'defines' : ['cl_khr_fp64']},
> > > > > > > > > > > +                {'gpu' : 'cayman', 'aliases' : ['aruba'],
> > > > > > > > > > > +                 'defines' : ['cl_khr_fp64']}]},
> > > > > > > > > > >    'amdgcn--': { 'devices' :
> > > > > > > > > > > -                [{'gpu' : 'tahiti', 'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins']}]},
> > > > > > > > > > > -  'nvptx--' : { 'devices' : [{'gpu' : '', 'aliases' : []}]},
> > > > > > > > > > > -  'nvptx64--' : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> > > > > > > > > > > -  'nvptx--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : []}] },
> > > > > > > > > > > -  'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : []}] }
> > > > > > > > > > > +                [{'gpu' : 'tahiti', 'aliases' : ['pitcairn', 'verde', 'oland', 'hainan', 'bonaire', 'kabini', 'kaveri', 'hawaii','mullins'],
> > > > > > > > > > > +                  'defines' : ['cl_khr_fp64']}]},
> > > > > > > > > > > +  'nvptx--' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > > > > > > > > > +                              'defines' : ['cl_khr_fp64']}]},
> > > > > > > > > > > +  'nvptx64--' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > > > > > > > > > +                                'defines' : ['cl_khr_fp64']}]},
> > > > > > > > > > > +  'nvptx--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > > > > > > > > > +                                      'defines' : ['cl_khr_fp64']}]},
> > > > > > > > > > > +  'nvptx64--nvidiacl' : { 'devices' : [{'gpu' : '', 'aliases' : [],
> > > > > > > > > > > +                                        'defines' : ['cl_khr_fp64']}]}
> > > > > > > > > > >  }
> > > > > > > > > > >
> > > > > > > > > > >  default_targets = ['nvptx--nvidiacl', 'nvptx64--nvidiacl', 'r600--', 'amdgcn--']
> > > > > > > > > >
> > > > > > > > > > @@ -175,13 +184,14 @@ for target in targets:
> > > > > > > > > > >    for device in available_targets[target]['devices']:
> > > > > > > > > > >      # The rule for building a .bc file for the specified architecture using clang.
> > > > > > > > > > > +    device_defines = ' '.join(["-D%s" % define for define in device['defines']])
> > > > > > > > > > >      clang_bc_flags = "-target %s -I`dirname $in` %s " \
> > > > > > > > > > >                       "-fno-builtin " \
> > > > > > > > > > >                       "-Dcl_clang_storage_class_specifiers " \
> > > > > > > > > > > -                     "-Dcl_khr_fp64 " \
> > > > > > > > > > > +                     "%s " \
> > > > > > > > > > >                       "-Dcles_khr_int64 " \
> > > > > > > > > > >                       "-D__CLC_INTERNAL " \
> > > > > > > > > > > -                     "-emit-llvm" % (target, clang_cl_includes)
> > > > > > > > > > > +                     "-emit-llvm" % (target, clang_cl_includes, device_defines)
> > > > > > > > > > >      if device['gpu'] != '':
> > > > > > > > > > >        clang_bc_flags += ' -mcpu=' + device['gpu']
> > > > > > > > > > >      clang_bc_rule = "CLANG_CL_BC_" + target + "_" + device['gpu']