[PATCH] D93638: [hip] Enable HIP compilation with `<complex`> on MSVC.

Michael Liao via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 7 01:46:14 PST 2021


hliao added inline comments.


================
Comment at: clang/lib/Headers/__clang_hip_runtime_wrapper.h:73-74
+#define __HOST_DEVICE__                                                        \
+  static __host__ __device__ inline __attribute__((always_inline))
+__HOST_DEVICE__ double _Cosh(double x, double y) { return cosh(x) * y; }
+__HOST_DEVICE__ float _FCosh(float x, float y) { return coshf(x) * y; }
----------------
hliao wrote:
> hliao wrote:
> > tra wrote:
> > > hliao wrote:
> > > > tra wrote:
> > > > > hliao wrote:
> > > > > > tra wrote:
> > > > > > > hliao wrote:
> > > > > > > > tra wrote:
> > > > > > > > > I don't think we want to provide a HD implementation.
> > > > > > > > > This will potentially change the meaning of these functions on the host side vs what they do in a C++ compilation.
> > > > > > > > > It should probably be just `__device__`.
> > > > > > > > > 
> > > > > > > > > Next question is -- do we want to provide the definitions, or would just declarations be sufficient?
> > > > > > > > > In other words -- are these functions needed for some standard C++ library functionality that we expect to work on the GPU?
> > > > > > > > > If it's just to aid with overload resolution, declarations should do. 
> > > > > > > > > 
> > > > > > > > These functions are declared in ymath.h and, in the host compilation, are resolved by linking MSVC RT libraries. For the device function, as we already mark all prototypes in ymath.h as HD, we have to implement them as HD to match the declaration. But, those definitions should be only available in the device compilation to avoid conflicts in the host compilation.
> > > > > > > You don't need the definitions to avoid conflicting declarations.
> > > > > > > 
> > > > > > > I'm still not convinced that HD is needed.
> > > > > > > Did you try just making these declarations `__device__` and remove the `ymath.h` wrapper?
> > > > > > > Basically I'm trying to find the bare minimum we need to do to make it work.
> > > > > > We don't call these functions directly. They are called in MSVC's <complex>. As functions in <complex> are marked as HD, we need to mark these functions in ymath.h as HD to pass the compilation.
> > > > > I assume that we're dealing with this file:
> > > > > https://github.com/microsoft/STL/blob/master/stl/inc/ymath.h
> > > > > 
> > > > > I don't think we need a wrapper for ymath.
> > > > > It may be sufficient to define or declare  `__device__ _Cosh()` and other functions and let overload resolution pick the right function.
> > > > > I think it would be a better approach than providing an `inline __host__ __device__` definition for those functions and effectively replacing MSVC-provided host-side implementation of those functions.
> > > > > 
> > > > `ymath.h` could be included before `<complex>`. That implies `_Cosh` could be declared as H only and results in the compilation failure.
> > > > BTW, I don't think replacing host-side implementation is a good idea as the host compilation should be kept consistent with the host compiler as much as possible.
> > > How? Isn't __clang_hip_runtime_wrapper.h included before anything in the user source file is processed? If the __clang_hip_runtime_wrapper.h is not included, first, then the ymath.h wrapper will not work either as it needs `__device__` macros.
> > > 
> > > > replacing host-side implementation is a good idea 
> > > 
> > > While consistency between host/device is good, we should not introduce a possible inconsistency between host-side TUs.
> > > Considering vastly larger amounts of host-side code compiled as C++ (e.g. TF has way more C++ code than HIP/CUDA) and correspondingly more [[ https://www.hyrumslaw.com/ | reliance on every possible detail of the implementation ]], I would err on the side of not changing host-side behavior in any way at all, if possible.
> > > 
> > > It's reasonably safe to add an overload (it may still be observable, but it's usually possible to add it in a way that does not affect the host). Replacing host-side things is more risky, as it greatly expands the opportunities for things to go wrong.
> > > 
> > > 
> > > 
> > `<ymath.h>` is also included in other headers, which is not wrapped. If we don't wrap `<ymath.h>`, there's a chance that it's included as it is. That's why we have to wrap `<ymath.h>` to ensure all functions marked with HD. Do I miss anything?
> I am wondering whether we could assume `<ymath.h>` is an internal header *only*.
It's turned out that `<ymath.h>` (an internal header) is included in other headers, which is not wrapped like `<complex>`. The sequence including `<ymath.h>` using MSVC 2017 is from `<algorithm>`, `<xmemory>`, `<xmemory0>`, `<limits>`, and then `<ymath.h>`. As `<algorithm>` is included before `<complex>`, without wrapping `<ymath.h>`, we cannot overload `_Cosh` (pure C function.).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93638/new/

https://reviews.llvm.org/D93638



More information about the cfe-commits mailing list