[PATCH] D62738: [HIP] Support device_shadow variable

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jun 20 09:38:41 PDT 2019


yaxunl marked an inline comment as done.
yaxunl added a comment.

In D62738#1538900 <https://reviews.llvm.org/D62738#1538900>, @tra wrote:

> So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility.
>  If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently used __device__ vars with host-side shadows. Only now the real variable lives on the host side and it's device side that gets the 'shadow' copy. Do I understand it correctly?
>
> If so, then this functionality seems to be fairly general-purpose to me. I.e. it has literally nothing to do with textures other than the name.
>
> Perhaps it would make more sense to rename this attribute to something along the lines of 'device_referenceable' and bring its implementation to somewhat more complete shape.
>
> By 'complete' I mean that it would be great to flesh out what can and can't use the attribute. Does it have to be a type attribute, or can it be applied to variables? 
>  The example in the patch suggests that it's the *variable* that's affected by the attribute.
>
> Once it works, HIP's texture support can use it for its purposes.
>
> E.g. your example could look like this:
>
>   #define  __attribute__((device_builtin_texture_type)) __texture__
>  
>   template <class T, int texType, enum hipTextureReadMode>
>       struct  texture
>         : public textureReference { ... }
>  
>   __texture__ texture<float, 2, hipReadModeElementType> tex;
>  
>
>
> This way compiler does not need to deal with the details of texture implementation on the HIP side.
>  Host/device visibility of the variables is easy to see in the source (similar to __device__, __shared__, etc) and there will be no need to dig into template defined somewhere else to become aware of this.
>  It will be potentially useful beyond HIP-only texture implementation.
>
> What do you think?


The problem is that we do not see generic usage of 
Although there is no texture specific handling on the compiler side, there is texture specific handling of symbols



================
Comment at: include/clang/Basic/AttrDocs.td:4164-4171
+The GNU style attribute __attribute__((device_shadow)) or MSVC style attribute
+__declspec(device_shadow) can be added to the definition of a global variable
+to indicate it is a HIP device shadow variable. A device shadow variable can
+be accessed on both device side and host side. It has external linkage and is
+not initialized on device side. It has internal linkage and is initialized by
+the initializer on host side.
+
----------------
tra wrote:
> just `device shadow variable` would do. It's no longer, generally speaking, HIP-specific. :-)
> 
> Only address and size of such variables should be used on device side.
> 
> I'd rephrase the use constraint. Currently it's `!(CUDA || !CUDA)` which is always false.
> `Currently enabled for HIP only.` would be closer to reality.
> 
If only address and size of such variables should be used on device side, such variables will not be very useful.

To implement texture reference, we need to be able to load the device side shadow variable. In general, it is desirable to load and store device side shadow variables, since users have no other way to synch with the corresponding host variable in device code.

This is different from host side shadow variable. On host side, users can use hipMemcpyToSymbol and hipMemcpyFromSymbol to force synchronization between the host side shadow variable and the corresponding device variable.

Therefore the implementation of the device side shadow variable requires special handling in HIP runtime. Basically HIP runtime needs to pin the host variable and use it to resolve the device side shadow variable (as an undefined elf symbol). This way, the host variable and device side shadow variable are sharing the same memory. This is also why it is HIP specific since CUDA runtime may not have such handling.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62738/new/

https://reviews.llvm.org/D62738





More information about the cfe-commits mailing list