[llvm-dev] NVPTX codegen for llvm.sin (and friends)
Johannes Doerfert via llvm-dev
llvm-dev at lists.llvm.org
Wed Mar 10 12:57:14 PST 2021
Right. We could keep the definition of __nv_cos and friends
around. Right now, -ffast-math might just crash on the user,
which is arguably a bad thing. I can also see us benefiting
in various other ways from llvm.cos uses instead of __nv_cos
(assuming precision is according to the user requirements but
that is always a condition).
It could be as simple as introducing __nv_cos into
"llvm.used" and a backend matching/rewrite pass.
If the backend knew the libdevice location it could even pick
the definitions from there. Maybe we could link libdevice late
instead of eager?
Trying to figure out a good way to have the cake and eat it too.
~ Johannes
On 3/10/21 2:49 PM, William Moses wrote:
> Since clang (and arguably any other frontend that uses) should link in
> libdevice, could we lower these intrinsics to the libdevice code?
>
> For example, consider compiling the simple device function below:
>
> ```
> // /mnt/sabrent/wmoses/llvm13/build/bin/clang tmp.cu -S -emit-llvm
> --cuda-path=/usr/local/cuda-11.0 -L/usr/local/cuda-11.0/lib64
> --cuda-gpu-arch=sm_37
> __device__ double f(double x) {
> return cos(x);
> }
> ```
>
> The LLVM module for it is as follows:
>
> ```
> ...
> define dso_local double @_Z1fd(double %x) #0 {
> entry:
> %__a.addr.i = alloca double, align 8
> %x.addr = alloca double, align 8
> store double %x, double* %x.addr, align 8
> %0 = load double, double* %x.addr, align 8
> store double %0, double* %__a.addr.i, align 8
> %1 = load double, double* %__a.addr.i, align 8
> %call.i = call contract double @__nv_cos(double %1) #7
> ret double %call.i
> }
>
> define internal double @__nv_cos(double %a) #1 {
> %q.i = alloca i32, align 4
> ```
>
> Obviously we would need to do something to ensure these functions don't get
> deleted prior to their use in lowering from intrinsic to libdevice.
> ...
>
>
> On Wed, Mar 10, 2021 at 3:39 PM Artem Belevich <tra at google.com> wrote:
>
>> On Wed, Mar 10, 2021 at 11:41 AM Johannes Doerfert <
>> johannesdoerfert at gmail.com> wrote:
>>
>>> Artem, Justin,
>>>
>>> I am running into a problem and I'm curious if I'm missing something or
>>> if the support is simply missing.
>>> Am I correct to assume the NVPTX backend does not deal with `llvm.sin`
>>> and friends?
>>>
>> Correct. It can't deal with anything that may need to lower to a standard
>> library call.
>>
>>> This is what I see, with some variations: https://godbolt.org/z/PxsEWs
>>>
>>> If this is missing in the backend, is there a plan to get this working,
>>> I'd really like to have the
>>> intrinsics in the middle end rather than __nv_cos, not to mention that
>>> -ffast-math does emit intrinsics
>>> and crashes.
>>>
>> It all boils down to the fact that PTX does not have the standard
>> libc/libm which LLVM could lower the calls to, nor does it have a 'linking'
>> phase where we could link such a library in, if we had it.
>>
>> Libdevice bitcode does provide the implementations for some of the
>> functions (though with a __nv_ prefix) and clang links it in in order to
>> avoid generating IR that LLVM can't handle, but that's a workaround that
>> does not help LLVM itself.
>>
>> --Artem
>>
>>
>>
>>> ~ Johannes
>>>
>>>
>>> --
>>> ───────────────────
>>> ∽ Johannes (he/his)
>>>
>>>
>> --
>> --Artem Belevich
>>
More information about the llvm-dev
mailing list