[cfe-dev] [RFC][OpenMP][CUDA] Unified Offloading Support in Clang Driver

Hahnfeld, Jonas via cfe-dev cfe-dev at lists.llvm.org
Thu Feb 25 12:39:15 PST 2016


Hi Samuel,

 

From: samuelfantao at gmail.com [mailto:samuelfantao at gmail.com] On Behalf Of Samuel F Antao
Sent: Thursday, February 25, 2016 8:55 PM
To: Hahnfeld, Jonas
Cc: Samuel F Antao; cfe-dev at lists.llvm.org; Alexey Bataev; John McCall
Subject: Re: [cfe-dev] [RFC][OpenMP][CUDA] Unified Offloading Support in Clang Driver

 

Hi Jonas,

 

 

 

2016-02-25 10:51 GMT-05:00 Hahnfeld, Jonas via cfe-dev <cfe-dev at lists.llvm.org>:

Hi Samuel,

 

Thanks for your work, I would really much like to see OpenMP offloading implemented in clang!

I can’t judge the necessary changes to the driver infrastructure but will rather focus on an end-user view…

 

From: samuelfantao at gmail.com [mailto:samuelfantao at gmail.com] On Behalf Of Samuel F Antao
Sent: Thursday, February 25, 2016 1:02 AM
To: cfe-dev at lists.llvm.org
Cc: Alexey Bataev; Hal Finkel; John McCall; echristo at gmail.com; tra at google.com; Hahnfeld, Jonas
Subject: [RFC][OpenMP][CUDA] Unified Offloading Support in Clang Driver

 

[…]

g) Reflect the offloading programming model in the naming of the save-temps files.

Given that the same action is interpreted by different toolchains, when using -save-temps the resulting file names could have the programming model name and the target triple appended so that files don’t get overwritten.

 

E.g. for OpenMP one would get a.bc and a-openmp-<triple>.bc if the driver is invoked with 'clang -c -save-temps a.c’.

So these temporary files would all be unbundled so that they can be worked with as before? Or do you intend to save the bundled file as well?
a-openmp-<triple>.bc might get you into trouble if you are using the same triple for host and device…

 

All the intermediate files would be saved as today, only with the extra suffix to avoid conflicts. The bundled file is the output requested by the user so it would be saved regardless of the save-temps option.

 

You are right, we should also use "host" and "device" to avoid problems when the same triple is used. I think I was proposing that already in D9888, just forgot to mention it here. Sorry about that.
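To make the agreed-upon naming scheme concrete, here is a small runnable sketch of the file names it could produce. The triple, the "device" label, and the exact name layout are assumptions for illustration, not the final Clang implementation:

```shell
# Illustrative sketch of the -save-temps naming discussed above.
# The triple and the "device" label are assumptions, not final behavior.
device_triple=x86_64-pc-linux-gnu

# The host intermediate keeps its usual name:
host_bc="a.bc"
# The device intermediate gets the programming model, a host/device
# label, and the target triple appended, so names stay unique even
# when host and device share the same triple:
device_bc="a-openmp-device-${device_triple}.bc"

echo "$host_bc" "$device_bc"
```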

 

 

h) Use a special option -target-offload=<triple> to specify offloading targets and delimit options meant for a toolchain.

To avoid the proliferation of driver (and possibly frontend) options that are specific to a programming model, I propose a new option that would specify an offloading device and have all the options following it processed for that device’s toolchain. This would allow using already existing options like -mcpu or -L/-l to tune the implementation for a given machine, or to provide linking commands that only make sense for the device.

 

As a hypothetical example, let’s assume we wanted to compile code that uses CUDA for an nvptx64 device and OpenMP for an x86_64 device, with a powerpc64le host. One could invoke the driver as:

 

clang -target powerpc64le-ibm-linux-gnu <more host options> \
    -target-offload=nvptx64-nvidia-cuda -fcuda -mcpu sm_35 <more options for the nvptx toolchain> \
    -target-offload=x86_64-pc-linux-gnu -fopenmp <more options for the x86_64 toolchain> \
    -target-offload=host <more options for the host> \
    -target-offload=all <options for all toolchains>

 

-fcuda or -fopenmp (or any other flag specifying a programming model) associated with an offload target would specify the programming model to be used for that target, and an error would be emitted if no programming-model flag is found. I am also proposing “host” and “all” as special target-offload devices to provide a convenient way for the user to pass options to the host or to all toolchains.

Would you have to specify ‘-fopenmp’ twice if you want to use OpenMP on the host as well? I’m asking because I’m interpreting your proposal such that ‘-target-offload’ consumes all following options only for that specific target…

 

Yes. However, I’m open to a behavior that infers it from the host options or the input files (like *.cu in CUDA). But if two different programming models are used in the same compilation unit, they have to be specified explicitly.

 

That’s true. I’m fine with specifying it explicitly in the first implementation; maybe later on we can think about some “magic guessing” if only one model is provided.
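To make the exchange above concrete: under the proposed scheme, using OpenMP on both the host and an offload target would repeat -fopenmp, since everything after a -target-offload applies only to that toolchain. Note that -target-offload is the option proposed in this RFC, not an existing clang flag, so the command below is only built as a string, not executed:

```shell
# Hypothetical invocation under the scheme proposed in this RFC.
# -fopenmp appears twice: once for the host, and once after
# -target-offload, which consumes all options that follow it for the
# x86_64 offload toolchain.
cmd="clang -fopenmp -target powerpc64le-ibm-linux-gnu \
-target-offload=x86_64-pc-linux-gnu -fopenmp t.c"
echo "$cmd"
```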

 

Thanks,

Jonas

 

 

Thanks!

Samuel

 

[…]


_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

 
