[PATCH] D69826: [hip] Enable pointer argument lowering through coercing type.

Michael Liao via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 5 10:38:20 PST 2019


hliao added a comment.

In D69826#1734296 <https://reviews.llvm.org/D69826#1734296>, @yaxunl wrote:

> I am a little bit concerned that user may have such code:
>
>   struct A { int *p; }
>   __global__ kernel(A a) {
>     int x;
>     a.p = &x;
>     f(a);
>   }
>
>
> @arsenm what happens if a private pointer is mis-used as a global pointer?
>
> I am wondering if we should coerce byval struct kernel arg to global only if they are const, e.g.
>
>   __global__ kernel(const A a);
>
>
> I understand this may lose performance. Or should we introduce an option to let user disable coerce of non-const struct kernel arg to global.


This should not be a concern. The coercing is only applied to the parameter itself. Within the function body, we still use the original `struct A`. The preparation in function prolog will copy that coerced argument into the original one (alloca-ed.) The modification of that parameter later will be applied to the original one due to the by-val nature.

A modified version of your code is compiled into the following code at O0:

  define protected amdgpu_kernel void @_Z3foo1A(%struct.A.coerce %a.coerce) #0 {
  entry:
    %a = alloca %struct.A, align 8, addrspace(5)
    %a1 = addrspacecast %struct.A addrspace(5)* %a to %struct.A*
    %x = alloca i32, align 4, addrspace(5)
    %x.ascast = addrspacecast i32 addrspace(5)* %x to i32*
    %agg.tmp = alloca %struct.A, align 8, addrspace(5)
    %agg.tmp.ascast = addrspacecast %struct.A addrspace(5)* %agg.tmp to %struct.A*
    %0 = bitcast %struct.A* %a1 to %struct.A.coerce*
    %1 = getelementptr inbounds %struct.A.coerce, %struct.A.coerce* %0, i32 0, i32 0
    %2 = extractvalue %struct.A.coerce %a.coerce, 0
    store i32 addrspace(1)* %2, i32 addrspace(1)** %1, align 8
    %3 = getelementptr inbounds %struct.A.coerce, %struct.A.coerce* %0, i32 0, i32 1
    %4 = extractvalue %struct.A.coerce %a.coerce, 1
    store i32 addrspace(1)* %4, i32 addrspace(1)** %3, align 8
    %p = getelementptr inbounds %struct.A, %struct.A* %a1, i32 0, i32 0
    store i32* %x.ascast, i32** %p, align 8
    %5 = bitcast %struct.A* %agg.tmp.ascast to i8*
    %6 = bitcast %struct.A* %a1 to i8*
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %5, i8* align 8 %6, i64 16, i1 false)
    %7 = getelementptr inbounds %struct.A, %struct.A* %agg.tmp.ascast, i32 0, i32 0
    %8 = load i32*, i32** %7, align 8
    %9 = getelementptr inbounds %struct.A, %struct.A* %agg.tmp.ascast, i32 0, i32 1
    %10 = load i32*, i32** %9, align 8
    call void @_Z1f1A(i32* %8, i32* %10) #3
    ret void
  }

The modification of parameter `a` is applied the alloca-ed one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69826/new/

https://reviews.llvm.org/D69826





More information about the cfe-commits mailing list