[PATCH] D34367: CodeGen: Fix address space of indirect function argument

Yaxun Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 31 10:15:02 PDT 2017


yaxunl added inline comments.


================
Comment at: lib/CodeGen/CGCall.cpp:3861
                < Align.getQuantity()) ||
             (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) {
           // Create an aligned temporary, and copy to it.
----------------
yaxunl wrote:
> rjmccall wrote:
> > yaxunl wrote:
> > > rjmccall wrote:
> > > > This should be comparing AST address spaces.
> > > The AST address space of RV cannot be obtained through `CGFunctionInfo::const_arg_iterator it` and `it->type` since `it->type` takes type of 
> > > 
> > > 
> > > ```
> > > ImplicitCastExpr 0x60a9ff0 <col:5> 'struct S':'struct S' <LValueToRValue>
> > >     `-DeclRefExpr 0x60a9f28 <col:5> '__global struct S':'__global struct S' lvalue Var 0x607efb0
> > > ```
> > > 
> > > and the original addr space is lost due to LValueToRValue cast.
> > > 
> > > To get the AST addr space of RV, it seems I need to save the argument Expr in CallArgList and get it from Expr.
> > > 
> > I think your last two comments are related.  I'm not sure why we haven't copied into a temporary here, and if we had, the assumption of LangAS::Default would be fine.  Would you mind doing the investigation there?
> It seems the backend will insert a temp copy for byval arguments, therefore normally a byval argument does not need caller to create a temp copy in LLVM IR. An explicit temp copy is only needed for special cases, e.g. alignment mismatch with ABI.
> 
> For example, the following C program,
> 
> 
> ```
> struct S {
>   long x[100];
> };
> 
> struct S g_s;
> 
> void f(struct S s);
> 
> void g() {
>   f(g_s);
> }
> 
> ```
> 
> will generate the following IR on x86_64:
> 
> 
> ```
> target triple = "x86_64-unknown-linux-gnu"
> 
> %struct.S = type { [100 x i64] }
> 
> @g_s = common global %struct.S zeroinitializer, align 8
> 
> ; Function Attrs: noinline nounwind optnone
> define void @g() #0 {
> entry:
>   call void @f(%struct.S* byval align 8 @g_s)
>   ret void
> }
> 
> declare void @f(%struct.S* byval align 8) #1
> 
> ```
> 
> However, the following C program
> 
> 
> ```
> struct S {
>   int x[100];
> };
> 
> struct S g_s;
> 
> void f(struct S s);
> 
> void g() {
>   f(g_s);
> }
> 
> ```
> 
> will generate the following IR
> 
> 
> ```
> target triple = "x86_64-unknown-linux-gnu"
> 
> %struct.S = type { [100 x i32] }
> 
> @g_s = common global %struct.S zeroinitializer, align 4
> 
> ; Function Attrs: noinline nounwind optnone
> define void @g() #0 {
> entry:
>   %byval-temp = alloca %struct.S, align 8
>   %0 = bitcast %struct.S* %byval-temp to i8*
>   call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* bitcast (%struct.S* @g_s to i8*), i64 400, i32 4, i1 false)
>   call void @f(%struct.S* byval align 8 %byval-temp)
>   ret void
> }
> 
> declare void @f(%struct.S* byval align 8) #1
> 
> ```
> 
> The temp var is generated by line 3863. The control flow reaches line 3863 because the alignment of the argument is 4 but the ABI requires it to be 8, so a temp is created to match the ABI align requirement.
> 
> That means, in the OpenCL example, it is normal that a temp var is not generated at line 3848. The temp is supposed to be generated at line 3863 too, like the C example.
For C++, it is different story, the AST is

```
`-FunctionDecl 0x64abee0 <line:9:1, line:11:1> line:9:6 g 'void (void)'
  `-CompoundStmt 0x64ac388 <col:10, line:11:1>
    `-CallExpr 0x64ac060 <line:10:3, col:8> 'void'
      |-ImplicitCastExpr 0x64ac048 <col:3> 'void (*)(struct S)' <FunctionToPointerDecay>
      | `-DeclRefExpr 0x64abff8 <col:3> 'void (struct S)' lvalue Function 0x64abe20 'f' 'void (struct S)'
      `-CXXConstructExpr 0x64ac278 <col:5> 'struct S' 'void (const struct S &) throw()'
        `-ImplicitCastExpr 0x64ac090 <col:5> 'const struct S' lvalue <NoOp>
          `-DeclRefExpr 0x64abfd0 <col:5> 'struct S' lvalue Var 0x64ab938 'g_s' 'struct S'

```

The argument is a copy constructor expr, therefore a temporary copy is always created, which is in alloca addr space.

For C and OpenCL, since there is no copy constructor, the temporary copy is not created.


https://reviews.llvm.org/D34367





More information about the cfe-commits mailing list