[PATCH] D34367: CodeGen: Fix address space of indirect function argument
Yaxun Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Aug 25 17:19:51 PDT 2017
yaxunl added inline comments.
================
Comment at: lib/CodeGen/CGCall.cpp:3861
< Align.getQuantity()) ||
(ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) {
// Create an aligned temporary, and copy to it.
----------------
rjmccall wrote:
> yaxunl wrote:
> > rjmccall wrote:
> > > This should be comparing AST address spaces.
> > The AST address space of RV cannot be obtained through `CGFunctionInfo::const_arg_iterator it` and `it->type` since `it->type` takes type of
> >
> >
> > ```
> > ImplicitCastExpr 0x60a9ff0 <col:5> 'struct S':'struct S' <LValueToRValue>
> > `-DeclRefExpr 0x60a9f28 <col:5> '__global struct S':'__global struct S' lvalue Var 0x607efb0
> > ```
> >
> > and the original addr space is lost due to LValueToRValue cast.
> >
> > To get the AST addr space of RV, it seems I need to save the argument Expr in CallArgList and get it from Expr.
> >
> I think your last two comments are related. I'm not sure why we haven't copied into a temporary here, and if we had, the assumption of LangAS::Default would be fine. Would you mind doing the investigation there?
It seems the backend will insert a temp copy for byval arguments, therefore normally a byval argument does not need caller to create a temp copy in LLVM IR. An explicit temp copy is only needed for special cases, e.g. alignment mismatch with ABI.
For example, the following C program,
```
struct S {
long x[100];
};
struct S g_s;
void f(struct S s);
void g() {
f(g_s);
}
```
will generate the following IR on x86_64:
```
target triple = "x86_64-unknown-linux-gnu"
%struct.S = type { [100 x i64] }
@g_s = common global %struct.S zeroinitializer, align 8
; Function Attrs: noinline nounwind optnone
define void @g() #0 {
entry:
call void @f(%struct.S* byval align 8 @g_s)
ret void
}
declare void @f(%struct.S* byval align 8) #1
```
However, the following C program
```
struct S {
int x[100];
};
struct S g_s;
void f(struct S s);
void g() {
f(g_s);
}
```
will generate the following IR
```
target triple = "x86_64-unknown-linux-gnu"
%struct.S = type { [100 x i32] }
@g_s = common global %struct.S zeroinitializer, align 4
; Function Attrs: noinline nounwind optnone
define void @g() #0 {
entry:
%byval-temp = alloca %struct.S, align 8
%0 = bitcast %struct.S* %byval-temp to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* bitcast (%struct.S* @g_s to i8*), i64 400, i32 4, i1 false)
call void @f(%struct.S* byval align 8 %byval-temp)
ret void
}
declare void @f(%struct.S* byval align 8) #1
```
The temp var is generated by line 3863. The control flow reaches line 3863 because the alignment of the argument is 4 but the ABI requires it to be 8, so a temp is created to match the ABI align requirement.
That means, in the OpenCL example, it is normal that a temp var is not generated at line 3848. The temp is supposed to be generated at line 3863 too, like the C example.
https://reviews.llvm.org/D34367
More information about the cfe-commits
mailing list