[PATCH] D34367: CodeGen: Fix address space of indirect function argument

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 25 17:19:51 PDT 2017


yaxunl added inline comments.


================
Comment at: lib/CodeGen/CGCall.cpp:3861
                < Align.getQuantity()) ||
             (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) {
           // Create an aligned temporary, and copy to it.
----------------
rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > This should be comparing AST address spaces.
> > The AST address space of RV cannot be obtained through `CGFunctionInfo::const_arg_iterator it` and `it->type`, since `it->type` takes the type of
> > 
> > 
> > ```
> > ImplicitCastExpr 0x60a9ff0 <col:5> 'struct S':'struct S' <LValueToRValue>
> >     `-DeclRefExpr 0x60a9f28 <col:5> '__global struct S':'__global struct S' lvalue Var 0x607efb0
> > ```
> > 
> > and the original addr space is lost due to the LValueToRValue cast.
> > 
> > To get the AST addr space of RV, it seems I need to save the argument Expr in CallArgList and get it from Expr.
> > 
> I think your last two comments are related.  I'm not sure why we haven't copied into a temporary here, and if we had, the assumption of LangAS::Default would be fine.  Would you mind doing the investigation there?
It seems the backend inserts a temporary copy for byval arguments, so normally the caller does not need to create a temporary copy in LLVM IR for a byval argument. An explicit temporary copy is needed only in special cases, e.g. when the argument's alignment does not match what the ABI requires.

For example, the following C program,


```
struct S {
  long x[100];
};

struct S g_s;

void f(struct S s);

void g() {
  f(g_s);
}

```

will generate the following IR on x86_64:


```
target triple = "x86_64-unknown-linux-gnu"

%struct.S = type { [100 x i64] }

@g_s = common global %struct.S zeroinitializer, align 8

; Function Attrs: noinline nounwind optnone
define void @g() #0 {
entry:
  call void @f(%struct.S* byval align 8 @g_s)
  ret void
}

declare void @f(%struct.S* byval align 8) #1

```

However, the following C program


```
struct S {
  int x[100];
};

struct S g_s;

void f(struct S s);

void g() {
  f(g_s);
}

```

will generate the following IR:


```
target triple = "x86_64-unknown-linux-gnu"

%struct.S = type { [100 x i32] }

@g_s = common global %struct.S zeroinitializer, align 4

; Function Attrs: noinline nounwind optnone
define void @g() #0 {
entry:
  %byval-temp = alloca %struct.S, align 8
  %0 = bitcast %struct.S* %byval-temp to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* bitcast (%struct.S* @g_s to i8*), i64 400, i32 4, i1 false)
  call void @f(%struct.S* byval align 8 %byval-temp)
  ret void
}

declare void @f(%struct.S* byval align 8) #1

```

The temporary is generated at line 3863. Control flow reaches line 3863 because the argument's alignment is 4 but the ABI requires 8, so a temporary is created to satisfy the ABI alignment requirement.
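As a rough source-level analogue of the second IR listing, the caller's work can be sketched in C as below. This is only an illustration: `f_indirect` and `call_f_with_temp` are hypothetical names standing in for the lowered callee and the call site in `g()`, and the real copy is emitted by Clang in IR rather than written in source.

```c
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

struct S {
  int x[100];   /* the IR above gives the global holding this 'align 4' */
};

struct S g_s;

/* Hypothetical stand-in for the lowered callee: it receives a pointer to
   the caller-made copy and reports whether that copy is 8-byte aligned. */
static int f_indirect(const struct S *s) {
  return ((uintptr_t)s % 8) == 0;
}

/* Hypothetical stand-in for the call site in g(). */
static int call_f_with_temp(void) {
  /* Corresponds to: %byval-temp = alloca %struct.S, align 8 */
  alignas(8) struct S byval_temp;
  /* Corresponds to the llvm.memcpy from @g_s into the temporary. */
  memcpy(&byval_temp, &g_s, sizeof byval_temp);
  /* Corresponds to: call void @f(%struct.S* byval align 8 %byval-temp) */
  return f_indirect(&byval_temp);
}
```

The point of the sketch is that the callee only ever sees the over-aligned temporary, never `g_s` itself, whose alignment of 4 would not meet the `align 8` promised on the byval parameter.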

That means that in the OpenCL example it is normal for no temporary to be generated at line 3848. The temporary is supposed to be generated at line 3863 instead, as in the C example.
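The condition quoted at the top of this comment can be modelled with a small, simplified C predicate. This is a sketch under assumptions, not the code in CGCall.cpp: the struct, its field names, and `needs_temp_copy` are hypothetical, the first clause of the real condition is only partly visible in the quoted context, and the address-space numbers used with it are arbitrary placeholders.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of one indirect argument. */
struct IndirectArg {
  unsigned known_align;     /* alignment of the existing object, in bytes */
  unsigned required_align;  /* alignment the ABI requires for the argument */
  bool by_val;              /* argument is passed byval */
  unsigned rv_addr_space;   /* address space of the value being passed */
  unsigned arg_addr_space;  /* address space the callee expects */
};

/* A temporary is needed when the existing object is under-aligned, or when
   a byval argument lives in a different address space than the callee
   expects. */
static bool needs_temp_copy(const struct IndirectArg *a) {
  return a->known_align < a->required_align ||
         (a->by_val && a->rv_addr_space != a->arg_addr_space);
}
```

With these assumptions, the `int x[100]` case (alignment 4, required 8) needs a copy, the `long x[100]` case (alignment 8, required 8) does not, and an argument whose address space differs from the expected one needs a copy even when the alignment already matches.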


https://reviews.llvm.org/D34367




