[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls
Jonas Hahnfeld via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jun 7 01:34:08 PDT 2018
Hahnfeld added a comment.
IMO this goes in the right direction: we should use the fast implementations in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's OK to use header wrappers, as CUDA already does.
Two questions:
1. Can you explain where this is important for "correctness"? Yesterday I compiled code that uses `sqrt`, and it seems to produce the correct results. Maybe that's relevant for other functions?
2. Incidentally, I ran into a closely related problem: I can't `#include <math.h>` in translation units compiled for offloading; Clang complains about inline assembly for x86 (see below). Does that work for you?
In file included from /usr/include/math.h:413:
/usr/include/bits/mathinline.h:131:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
/usr/include/bits/mathinline.h:143:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
2 errors generated.
================
Comment at: lib/Headers/__clang_cuda_device_functions.h:65
}
+#if defined(__cplusplus)
__DEVICE__ void __brkpt() { asm volatile("brkpt;"); }
----------------
Why is that only valid for C++?
Repository:
rC Clang
https://reviews.llvm.org/D47849