r326946 - CodeGen: Fix address space of indirect function argument

Richard Smith via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 9 17:51:29 PST 2018


Hi,

This change dramatically increases the size of a CallArg, and thus that of
a CallArgList: an LValue is *much* larger than an RValue and, unlike an
RValue, does not appear to be optimized for size at all. As a result,
CallArgList (which contains inline storage for 16 CallArgs) balloons from
~500 bytes to ~2KB, which causes stack overflows on programs with deep
ASTs.
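
To illustrate the size effect, here is a minimal sketch. SmallValue,
LargeValue, OldArg, NewArg and InlineList are hypothetical stand-ins, and
their field layouts are assumptions, not Clang's real RValue, LValue,
CallArg or CallArgList. It only shows that a union is sized by its largest
member and that inline storage multiplies that growth by the element count.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; these layouts are NOT Clang's real ones.
struct SmallValue { void *a; void *b; };   // plays the role of RValue
struct LargeValue { void *fields[12]; };   // plays the role of LValue

// Before the change: one small value, a type handle and a flag.
struct OldArg {
  SmallValue rv;
  void *ty;
  bool needsCopy;
};

// After the change: a union of both representations plus two flags.
struct NewArg {
  union {
    SmallValue rv;
    LargeValue lv;
  };
  bool hasLV;
  bool isUsed;
  void *ty;
};

// Minimal model of a list with inline storage for N elements.
template <typename T, std::size_t N = 16>
struct InlineList {
  T inlineStorage[N];
  std::size_t size;
};

// A union can never be smaller than its largest member.
static_assert(sizeof(NewArg) >= sizeof(LargeValue),
              "union is sized by its largest member");
static_assert(sizeof(InlineList<OldArg>) >= 16 * sizeof(OldArg),
              "inline storage holds all 16 elements in the object itself");
```

Every extra byte in the element type costs 16 bytes in each list object,
and each list object lives in a stack frame.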

Since this introduces a regression (the stack overflow described above),
I've reverted it in r327195. Sorry about that.

The program it broke looked something like this:

struct VectorBuilder {
  VectorBuilder &operator,(int);
};
void f() {
  VectorBuilder(),

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,

1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0;
}

It uses Eigen's somewhat-ridiculous comma initializer technique
(https://eigen.tuxfamily.org/dox/group__TutorialAdvancedInitialization.html)
to build an approximately 40 x 40 array.
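
For readers unfamiliar with the technique, a minimal sketch follows. The
`values` member and the `buildThree` helper are hypothetical additions for
illustration; the reduced test case above only declares the operator. The
point is that `b, 1, 2, 3` parses as `((b, 1), 2), 3`, so each comma adds
one more level of nested operator call to the AST.

```cpp
#include <cassert>
#include <vector>

// Hypothetical builder in the spirit of the reduced test case.
struct VectorBuilder {
  std::vector<int> values;

  // Overloaded comma: append the value and return *this so that the
  // next comma in the chain applies to the same builder.
  VectorBuilder &operator,(int v) {
    values.push_back(v);
    return *this;
  }
};

// The comma operator is left-associative, so this statement is three
// nested operator, calls: ((b, 1), 2), 3.
inline std::vector<int> buildThree() {
  VectorBuilder b;
  b, 1, 2, 3;
  return b.values;
}
```

A chain with N commas therefore yields an expression tree N calls deep,
which is where the depth of the failing program comes from.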

An AST 1600 levels deep is somewhat extreme, but it still seems like
something we should be able to handle within our default 8MB stack limit.
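
As a rough back-of-the-envelope check, the sketch below multiplies the
depth by the CallArgList size. It assumes one live CallArgList per nesting
level and rounds "~500 bytes" to 512; both are assumptions, and the rest of
each stack frame is not counted because its size is not stated here.
`callArgListBytes` is a hypothetical helper, not anything in Clang.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kDepth = 1600;                    // AST depth
constexpr std::uint64_t kStackLimit = 8ull * 1024 * 1024; // 8MB default

// Stack bytes taken by CallArgList objects alone, assuming one live
// list per nesting level (an assumption, not a measured figure).
constexpr std::uint64_t callArgListBytes(std::uint64_t depth,
                                         std::uint64_t bytesPerList) {
  return depth * bytesPerList;
}

// Before: roughly 0.8 MiB. After: roughly 3.1 MiB of the 8 MiB limit.
static_assert(callArgListBytes(kDepth, 512) == 819200, "before the change");
static_assert(callArgListBytes(kDepth, 2048) == 3276800, "after the change");
static_assert(callArgListBytes(kDepth, 2048) < kStackLimit,
              "the lists alone do not exhaust the stack");
```

Under these assumptions the lists alone stay below the limit, so the
overflow has to come from this growth on top of whatever else each
recursion level already places on the stack.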

On 7 March 2018 at 13:45, Yaxun Liu via cfe-commits <
cfe-commits at lists.llvm.org> wrote:

> Author: yaxunl
> Date: Wed Mar  7 13:45:40 2018
> New Revision: 326946
>
> URL: http://llvm.org/viewvc/llvm-project?rev=326946&view=rev
> Log:
> CodeGen: Fix address space of indirect function argument
>
> The indirect function argument is in alloca address space in LLVM IR.
> However, during Clang codegen for C++, the address space of indirect
> function argument should match its address space in the source code,
> i.e., default addr space, even for indirect argument. This is because
> destructor of the indirect argument may be called in the caller function,
> and address of the indirect argument may be taken, in either case the
> indirect function argument is expected to be in default addr space, not
> the alloca address space.
>
> Therefore, the indirect function argument should be mapped to the temp
> var casted to default address space. The caller will cast it to alloca
> addr space when passing it to the callee. In the callee, the argument is
> also casted to the default address space and used.
>
> CallArg is refactored to facilitate this fix.
>
> Differential Revision: https://reviews.llvm.org/D34367
>
> Added:
>     cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp
> Modified:
>     cfe/trunk/lib/CodeGen/CGAtomic.cpp
>     cfe/trunk/lib/CodeGen/CGCall.cpp
>     cfe/trunk/lib/CodeGen/CGCall.h
>     cfe/trunk/lib/CodeGen/CGClass.cpp
>     cfe/trunk/lib/CodeGen/CGDecl.cpp
>     cfe/trunk/lib/CodeGen/CGExprCXX.cpp
>     cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp
>     cfe/trunk/lib/CodeGen/CGObjCGNU.cpp
>     cfe/trunk/lib/CodeGen/CGObjCMac.cpp
>     cfe/trunk/lib/CodeGen/CodeGenFunction.h
>     cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp
>     cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp
>     cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl
>     cfe/trunk/test/CodeGenOpenCL/byval.cl
>
> Modified: cfe/trunk/lib/CodeGen/CGAtomic.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGAtomic.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGAtomic.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGAtomic.cpp Wed Mar  7 13:45:40 2018
> @@ -1160,7 +1160,7 @@ RValue CodeGenFunction::EmitAtomicExpr(A
>      if (UseOptimizedLibcall && Res.getScalarVal()) {
>        llvm::Value *ResVal = Res.getScalarVal();
>        if (PostOp) {
> -        llvm::Value *LoadVal1 = Args[1].RV.getScalarVal();
> +        llvm::Value *LoadVal1 = Args[1].getRValue(*this).getScalarVal();
>          ResVal = Builder.CreateBinOp(PostOp, ResVal, LoadVal1);
>        }
>        if (E->getOp() == AtomicExpr::AO__atomic_nand_fetch)
>
> Modified: cfe/trunk/lib/CodeGen/CGCall.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGCall.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGCall.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGCall.cpp Wed Mar  7 13:45:40 2018
> @@ -1040,42 +1040,49 @@ void CodeGenFunction::ExpandTypeFromArgs
>  }
>
>  void CodeGenFunction::ExpandTypeToArgs(
> -    QualType Ty, RValue RV, llvm::FunctionType *IRFuncTy,
> +    QualType Ty, CallArg Arg, llvm::FunctionType *IRFuncTy,
>      SmallVectorImpl<llvm::Value *> &IRCallArgs, unsigned &IRCallArgPos) {
>    auto Exp = getTypeExpansion(Ty, getContext());
>    if (auto CAExp = dyn_cast<ConstantArrayExpansion>(Exp.get())) {
> -    forConstantArrayExpansion(*this, CAExp, RV.getAggregateAddress(),
> -                              [&](Address EltAddr) {
> -      RValue EltRV =
> -          convertTempToRValue(EltAddr, CAExp->EltTy, SourceLocation());
> -      ExpandTypeToArgs(CAExp->EltTy, EltRV, IRFuncTy, IRCallArgs,
> IRCallArgPos);
> -    });
> +    Address Addr = Arg.hasLValue() ? Arg.getKnownLValue().getAddress()
> +                                   : Arg.getKnownRValue().
> getAggregateAddress();
> +    forConstantArrayExpansion(
> +        *this, CAExp, Addr, [&](Address EltAddr) {
> +          CallArg EltArg = CallArg(
> +              convertTempToRValue(EltAddr, CAExp->EltTy,
> SourceLocation()),
> +              CAExp->EltTy);
> +          ExpandTypeToArgs(CAExp->EltTy, EltArg, IRFuncTy, IRCallArgs,
> +                           IRCallArgPos);
> +        });
>    } else if (auto RExp = dyn_cast<RecordExpansion>(Exp.get())) {
> -    Address This = RV.getAggregateAddress();
> +    Address This = Arg.hasLValue() ? Arg.getKnownLValue().getAddress()
> +                                   : Arg.getKnownRValue().
> getAggregateAddress();
>      for (const CXXBaseSpecifier *BS : RExp->Bases) {
>        // Perform a single step derived-to-base conversion.
>        Address Base =
>            GetAddressOfBaseClass(This, Ty->getAsCXXRecordDecl(), &BS, &BS
> + 1,
>                                  /*NullCheckValue=*/false,
> SourceLocation());
> -      RValue BaseRV = RValue::getAggregate(Base);
> +      CallArg BaseArg = CallArg(RValue::getAggregate(Base),
> BS->getType());
>
>        // Recurse onto bases.
> -      ExpandTypeToArgs(BS->getType(), BaseRV, IRFuncTy, IRCallArgs,
> +      ExpandTypeToArgs(BS->getType(), BaseArg, IRFuncTy, IRCallArgs,
>                         IRCallArgPos);
>      }
>
>      LValue LV = MakeAddrLValue(This, Ty);
>      for (auto FD : RExp->Fields) {
> -      RValue FldRV = EmitRValueForField(LV, FD, SourceLocation());
> -      ExpandTypeToArgs(FD->getType(), FldRV, IRFuncTy, IRCallArgs,
> +      CallArg FldArg =
> +          CallArg(EmitRValueForField(LV, FD, SourceLocation()),
> FD->getType());
> +      ExpandTypeToArgs(FD->getType(), FldArg, IRFuncTy, IRCallArgs,
>                         IRCallArgPos);
>      }
>    } else if (isa<ComplexExpansion>(Exp.get())) {
> -    ComplexPairTy CV = RV.getComplexVal();
> +    ComplexPairTy CV = Arg.getKnownRValue().getComplexVal();
>      IRCallArgs[IRCallArgPos++] = CV.first;
>      IRCallArgs[IRCallArgPos++] = CV.second;
>    } else {
>      assert(isa<NoExpansion>(Exp.get()));
> +    auto RV = Arg.getKnownRValue();
>      assert(RV.isScalar() &&
>             "Unexpected non-scalar rvalue during struct expansion.");
>
> @@ -3418,13 +3425,17 @@ void CodeGenFunction::EmitCallArgs(
>      assert(InitialArgSize + 1 == Args.size() &&
>             "The code below depends on only adding one arg per
> EmitCallArg");
>      (void)InitialArgSize;
> -    RValue RVArg = Args.back().RV;
> -    EmitNonNullArgCheck(RVArg, ArgTypes[Idx], (*Arg)->getExprLoc(), AC,
> -                        ParamsToSkip + Idx);
> -    // @llvm.objectsize should never have side-effects and shouldn't need
> -    // destruction/cleanups, so we can safely "emit" it after its arg,
> -    // regardless of right-to-leftness
> -    MaybeEmitImplicitObjectSize(Idx, *Arg, RVArg);
> +    // Since pointer argument are never emitted as LValue, it is safe to
> emit
> +    // non-null argument check for r-value only.
> +    if (!Args.back().hasLValue()) {
> +      RValue RVArg = Args.back().getKnownRValue();
> +      EmitNonNullArgCheck(RVArg, ArgTypes[Idx], (*Arg)->getExprLoc(), AC,
> +                          ParamsToSkip + Idx);
> +      // @llvm.objectsize should never have side-effects and shouldn't
> need
> +      // destruction/cleanups, so we can safely "emit" it after its arg,
> +      // regardless of right-to-leftness
> +      MaybeEmitImplicitObjectSize(Idx, *Arg, RVArg);
> +    }
>    }
>
>    if (!LeftToRight) {
> @@ -3471,6 +3482,31 @@ struct DisableDebugLocationUpdates {
>
>  } // end anonymous namespace
>
> +RValue CallArg::getRValue(CodeGenFunction &CGF) const {
> +  if (!HasLV)
> +    return RV;
> +  LValue Copy = CGF.MakeAddrLValue(CGF.CreateMemTemp(Ty), Ty);
> +  CGF.EmitAggregateCopy(Copy, LV, Ty, LV.isVolatile());
> +  IsUsed = true;
> +  return RValue::getAggregate(Copy.getAddress());
> +}
> +
> +void CallArg::copyInto(CodeGenFunction &CGF, Address Addr) const {
> +  LValue Dst = CGF.MakeAddrLValue(Addr, Ty);
> +  if (!HasLV && RV.isScalar())
> +    CGF.EmitStoreOfScalar(RV.getScalarVal(), Dst, /*init=*/true);
> +  else if (!HasLV && RV.isComplex())
> +    CGF.EmitStoreOfComplex(RV.getComplexVal(), Dst, /*init=*/true);
> +  else {
> +    auto Addr = HasLV ? LV.getAddress() : RV.getAggregateAddress();
> +    LValue SrcLV = CGF.MakeAddrLValue(Addr, Ty);
> +    CGF.EmitAggregateCopy(Dst, SrcLV, Ty,
> +                          HasLV ? LV.isVolatileQualified()
> +                                : RV.isVolatileQualified());
> +  }
> +  IsUsed = true;
> +}
> +
>  void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
>                                    QualType type) {
>    DisableDebugLocationUpdates Dis(*this, E);
> @@ -3536,15 +3572,7 @@ void CodeGenFunction::EmitCallArg(CallAr
>        cast<CastExpr>(E)->getCastKind() == CK_LValueToRValue) {
>      LValue L = EmitLValue(cast<CastExpr>(E)->getSubExpr());
>      assert(L.isSimple());
> -    if (L.getAlignment() >= getContext().getTypeAlignInChars(type)) {
> -      args.add(L.asAggregateRValue(), type, /*NeedsCopy*/true);
> -    } else {
> -      // We can't represent a misaligned lvalue in the CallArgList, so
> copy
> -      // to an aligned temporary now.
> -      LValue Dest = MakeAddrLValue(CreateMemTemp(type), type);
> -      EmitAggregateCopy(Dest, L, type, L.isVolatile());
> -      args.add(RValue::getAggregate(Dest.getAddress()), type);
> -    }
> +    args.addUncopiedAggregate(L, type);
>      return;
>    }
>
> @@ -3702,16 +3730,6 @@ CodeGenFunction::EmitCallOrInvoke(llvm::
>    return llvm::CallSite(Inst);
>  }
>
> -/// \brief Store a non-aggregate value to an address to initialize it.
> For
> -/// initialization, a non-atomic store will be used.
> -static void EmitInitStoreOfNonAggregate(CodeGenFunction &CGF, RValue Src,
> -                                        LValue Dst) {
> -  if (Src.isScalar())
> -    CGF.EmitStoreOfScalar(Src.getScalarVal(), Dst, /*init=*/true);
> -  else
> -    CGF.EmitStoreOfComplex(Src.getComplexVal(), Dst, /*init=*/true);
> -}
> -
>  void CodeGenFunction::deferPlaceholderReplacement(llvm::Instruction *Old,
>                                                    llvm::Value *New) {
>    DeferredReplacements.push_back(std::make_pair(Old, New));
> @@ -3804,7 +3822,6 @@ RValue CodeGenFunction::EmitCall(const C
>    for (CallArgList::const_iterator I = CallArgs.begin(), E =
> CallArgs.end();
>         I != E; ++I, ++info_it, ++ArgNo) {
>      const ABIArgInfo &ArgInfo = info_it->info;
> -    RValue RV = I->RV;
>
>      // Insert a padding argument to ensure proper alignment.
>      if (IRFunctionArgs.hasPaddingArg(ArgNo))
> @@ -3818,13 +3835,16 @@ RValue CodeGenFunction::EmitCall(const C
>      case ABIArgInfo::InAlloca: {
>        assert(NumIRArgs == 0);
>        assert(getTarget().getTriple().getArch() == llvm::Triple::x86);
> -      if (RV.isAggregate()) {
> +      if (I->isAggregate()) {
>          // Replace the placeholder with the appropriate argument slot GEP.
> +        Address Addr = I->hasLValue()
> +                           ? I->getKnownLValue().getAddress()
> +                           : I->getKnownRValue().getAggregateAddress();
>          llvm::Instruction *Placeholder =
> -            cast<llvm::Instruction>(RV.getAggregatePointer());
> +            cast<llvm::Instruction>(Addr.getPointer());
>          CGBuilderTy::InsertPoint IP = Builder.saveIP();
>          Builder.SetInsertPoint(Placeholder);
> -        Address Addr = createInAllocaStructGEP(
> ArgInfo.getInAllocaFieldIndex());
> +        Addr = createInAllocaStructGEP(ArgInfo.getInAllocaFieldIndex());
>          Builder.restoreIP(IP);
>          deferPlaceholderReplacement(Placeholder, Addr.getPointer());
>        } else {
> @@ -3837,22 +3857,20 @@ RValue CodeGenFunction::EmitCall(const C
>          // from {}* to (%struct.foo*)*.
>          if (Addr.getType() != MemType)
>            Addr = Builder.CreateBitCast(Addr, MemType);
> -        LValue argLV = MakeAddrLValue(Addr, I->Ty);
> -        EmitInitStoreOfNonAggregate(*this, RV, argLV);
> +        I->copyInto(*this, Addr);
>        }
>        break;
>      }
>
>      case ABIArgInfo::Indirect: {
>        assert(NumIRArgs == 1);
> -      if (RV.isScalar() || RV.isComplex()) {
> +      if (!I->isAggregate()) {
>          // Make a temporary alloca to pass the argument.
>          Address Addr = CreateMemTemp(I->Ty, ArgInfo.getIndirectAlign(),
>                                       "indirect-arg-temp", false);
>          IRCallArgs[FirstIRArg] = Addr.getPointer();
>
> -        LValue argLV = MakeAddrLValue(Addr, I->Ty);
> -        EmitInitStoreOfNonAggregate(*this, RV, argLV);
> +        I->copyInto(*this, Addr);
>        } else {
>          // We want to avoid creating an unnecessary temporary+copy here;
>          // however, we need one in three cases:
> @@ -3860,32 +3878,51 @@ RValue CodeGenFunction::EmitCall(const C
>          //    source.  (This case doesn't occur on any common
> architecture.)
>          // 2. If the argument is byval, RV is not sufficiently aligned,
> and
>          //    we cannot force it to be sufficiently aligned.
> -        // 3. If the argument is byval, but RV is located in an address
> space
> -        //    different than that of the argument (0).
> -        Address Addr = RV.getAggregateAddress();
> +        // 3. If the argument is byval, but RV is not located in default
> +        //    or alloca address space.
> +        Address Addr = I->hasLValue()
> +                           ? I->getKnownLValue().getAddress()
> +                           : I->getKnownRValue().getAggregateAddress();
> +        llvm::Value *V = Addr.getPointer();
>          CharUnits Align = ArgInfo.getIndirectAlign();
>          const llvm::DataLayout *TD = &CGM.getDataLayout();
> -        const unsigned RVAddrSpace = Addr.getType()->getAddressSpace();
> -        const unsigned ArgAddrSpace =
> -            (FirstIRArg < IRFuncTy->getNumParams()
> -                 ? IRFuncTy->getParamType(FirstIRArg)->
> getPointerAddressSpace()
> -                 : 0);
> -        if ((!ArgInfo.getIndirectByVal() && I->NeedsCopy) ||
> -            (ArgInfo.getIndirectByVal() && Addr.getAlignment() < Align &&
> -             llvm::getOrEnforceKnownAlignment(Addr.getPointer(),
> -                                              Align.getQuantity(), *TD)
> -               < Align.getQuantity()) ||
> -            (ArgInfo.getIndirectByVal() && (RVAddrSpace !=
> ArgAddrSpace))) {
> +
> +        assert((FirstIRArg >= IRFuncTy->getNumParams() ||
> +                IRFuncTy->getParamType(FirstIRArg)->getPointerAddressSpace()
> ==
> +                    TD->getAllocaAddrSpace()) &&
> +               "indirect argument must be in alloca address space");
> +
> +        bool NeedCopy = false;
> +
> +        if (Addr.getAlignment() < Align &&
> +            llvm::getOrEnforceKnownAlignment(V, Align.getQuantity(),
> *TD) <
> +                Align.getQuantity()) {
> +          NeedCopy = true;
> +        } else if (I->hasLValue()) {
> +          auto LV = I->getKnownLValue();
> +          auto AS = LV.getAddressSpace();
> +          if ((!ArgInfo.getIndirectByVal() &&
> +               (LV.getAlignment() >=
> +                getContext().getTypeAlignInChars(I->Ty))) ||
> +              (ArgInfo.getIndirectByVal() &&
> +               ((AS != LangAS::Default && AS != LangAS::opencl_private &&
> +                 AS != CGM.getASTAllocaAddressSpace())))) {
> +            NeedCopy = true;
> +          }
> +        }
> +        if (NeedCopy) {
>            // Create an aligned temporary, and copy to it.
>            Address AI = CreateMemTemp(I->Ty, ArgInfo.getIndirectAlign(),
>                                       "byval-temp", false);
>            IRCallArgs[FirstIRArg] = AI.getPointer();
> -          LValue Dest = MakeAddrLValue(AI, I->Ty);
> -          LValue Src = MakeAddrLValue(Addr, I->Ty);
> -          EmitAggregateCopy(Dest, Src, I->Ty, RV.isVolatileQualified());
> +          I->copyInto(*this, AI);
>          } else {
>            // Skip the extra memcpy call.
> -          IRCallArgs[FirstIRArg] = Addr.getPointer();
> +          auto *T = V->getType()->getPointerElementType()->getPointerTo(
> +              CGM.getDataLayout().getAllocaAddrSpace());
> +          IRCallArgs[FirstIRArg] = getTargetHooks().performAddrSpaceCast(
> +              *this, V, LangAS::Default, CGM.getASTAllocaAddressSpace(),
> T,
> +              true);
>          }
>        }
>        break;
> @@ -3902,10 +3939,12 @@ RValue CodeGenFunction::EmitCall(const C
>            ArgInfo.getDirectOffset() == 0) {
>          assert(NumIRArgs == 1);
>          llvm::Value *V;
> -        if (RV.isScalar())
> -          V = RV.getScalarVal();
> +        if (!I->isAggregate())
> +          V = I->getKnownRValue().getScalarVal();
>          else
> -          V = Builder.CreateLoad(RV.getAggregateAddress());
> +          V = Builder.CreateLoad(
> +              I->hasLValue() ? I->getKnownLValue().getAddress()
> +                             : I->getKnownRValue().
> getAggregateAddress());
>
>          // Implement swifterror by copying into a new swifterror argument.
>          // We'll write back in the normal path out of the call.
> @@ -3943,12 +3982,12 @@ RValue CodeGenFunction::EmitCall(const C
>
>        // FIXME: Avoid the conversion through memory if possible.
>        Address Src = Address::invalid();
> -      if (RV.isScalar() || RV.isComplex()) {
> +      if (!I->isAggregate()) {
>          Src = CreateMemTemp(I->Ty, "coerce");
> -        LValue SrcLV = MakeAddrLValue(Src, I->Ty);
> -        EmitInitStoreOfNonAggregate(*this, RV, SrcLV);
> +        I->copyInto(*this, Src);
>        } else {
> -        Src = RV.getAggregateAddress();
> +        Src = I->hasLValue() ? I->getKnownLValue().getAddress()
> +                             : I->getKnownRValue().getAggregateAddress();
>        }
>
>        // If the value is offset in memory, apply the offset now.
> @@ -4002,9 +4041,12 @@ RValue CodeGenFunction::EmitCall(const C
>
>        llvm::Value *tempSize = nullptr;
>        Address addr = Address::invalid();
> -      if (RV.isAggregate()) {
> -        addr = RV.getAggregateAddress();
> +      if (I->isAggregate()) {
> +        addr = I->hasLValue() ? I->getKnownLValue().getAddress()
> +                              : I->getKnownRValue().
> getAggregateAddress();
> +
>        } else {
> +        RValue RV = I->getKnownRValue();
>          assert(RV.isScalar()); // complex should always just be direct
>
>          llvm::Type *scalarType = RV.getScalarVal()->getType();
> @@ -4041,7 +4083,7 @@ RValue CodeGenFunction::EmitCall(const C
>
>      case ABIArgInfo::Expand:
>        unsigned IRArgPos = FirstIRArg;
> -      ExpandTypeToArgs(I->Ty, RV, IRFuncTy, IRCallArgs, IRArgPos);
> +      ExpandTypeToArgs(I->Ty, *I, IRFuncTy, IRCallArgs, IRArgPos);
>        assert(IRArgPos == FirstIRArg + NumIRArgs);
>        break;
>      }
> @@ -4393,7 +4435,7 @@ RValue CodeGenFunction::EmitCall(const C
>                                OffsetValue);
>      } else if (const auto *AA = TargetDecl->getAttr<AllocAlignAttr>()) {
>        llvm::Value *ParamVal =
> -          CallArgs[AA->getParamIndex() - 1].RV.getScalarVal();
> +          CallArgs[AA->getParamIndex() - 1].getRValue(*this).
> getScalarVal();
>        EmitAlignmentAssumption(Ret.getScalarVal(), ParamVal);
>      }
>    }
>
> Modified: cfe/trunk/lib/CodeGen/CGCall.h
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGCall.h?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGCall.h (original)
> +++ cfe/trunk/lib/CodeGen/CGCall.h Wed Mar  7 13:45:40 2018
> @@ -213,12 +213,46 @@ public:
>    };
>
>    struct CallArg {
> -    RValue RV;
> +  private:
> +    union {
> +      RValue RV;
> +      LValue LV; /// The argument is semantically a load from this
> l-value.
> +    };
> +    bool HasLV;
> +
> +    /// A data-flow flag to make sure getRValue and/or copyInto are not
> +    /// called twice for duplicated IR emission.
> +    mutable bool IsUsed;
> +
> +  public:
>      QualType Ty;
> -    bool NeedsCopy;
> -    CallArg(RValue rv, QualType ty, bool needscopy)
> -    : RV(rv), Ty(ty), NeedsCopy(needscopy)
> -    { }
> +    CallArg(RValue rv, QualType ty)
> +        : RV(rv), HasLV(false), IsUsed(false), Ty(ty) {}
> +    CallArg(LValue lv, QualType ty)
> +        : LV(lv), HasLV(true), IsUsed(false), Ty(ty) {}
> +    bool hasLValue() const { return HasLV; }
> +    QualType getType() const { return Ty; }
> +
> +    /// \returns an independent RValue. If the CallArg contains an LValue,
> +    /// a temporary copy is returned.
> +    RValue getRValue(CodeGenFunction &CGF) const;
> +
> +    LValue getKnownLValue() const {
> +      assert(HasLV && !IsUsed);
> +      return LV;
> +    }
> +    RValue getKnownRValue() const {
> +      assert(!HasLV && !IsUsed);
> +      return RV;
> +    }
> +    void setRValue(RValue _RV) {
> +      assert(!HasLV);
> +      RV = _RV;
> +    }
> +
> +    bool isAggregate() const { return HasLV || RV.isAggregate(); }
> +
> +    void copyInto(CodeGenFunction &CGF, Address A) const;
>    };
>
>    /// CallArgList - Type for representing both the value and type of
> @@ -248,8 +282,10 @@ public:
>        llvm::Instruction *IsActiveIP;
>      };
>
> -    void add(RValue rvalue, QualType type, bool needscopy = false) {
> -      push_back(CallArg(rvalue, type, needscopy));
> +    void add(RValue rvalue, QualType type) { push_back(CallArg(rvalue,
> type)); }
> +
> +    void addUncopiedAggregate(LValue LV, QualType type) {
> +      push_back(CallArg(LV, type));
>      }
>
>      /// Add all the arguments from another CallArgList to this one. After
> doing
>
> Modified: cfe/trunk/lib/CodeGen/CGClass.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGClass.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGClass.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGClass.cpp Wed Mar  7 13:45:40 2018
> @@ -2077,7 +2077,8 @@ void CodeGenFunction::EmitCXXConstructor
>      assert(Args.size() == 2 && "unexpected argcount for trivial ctor");
>
>      QualType SrcTy = D->getParamDecl(0)->getType().getNonReferenceType();
> -    Address Src(Args[1].RV.getScalarVal(), getNaturalTypeAlignment(SrcTy)
> );
> +    Address Src(Args[1].getRValue(*this).getScalarVal(),
> +                getNaturalTypeAlignment(SrcTy));
>      LValue SrcLVal = MakeAddrLValue(Src, SrcTy);
>      QualType DestTy = getContext().getTypeDeclType(ClassDecl);
>      LValue DestLVal = MakeAddrLValue(This, DestTy);
> @@ -2131,8 +2132,7 @@ void CodeGenFunction::EmitInheritedCXXCo
>      const CXXConstructorDecl *D, bool ForVirtualBase, Address This,
>      bool InheritedFromVBase, const CXXInheritedCtorInitExpr *E) {
>    CallArgList Args;
> -  CallArg ThisArg(RValue::get(This.getPointer()),
> D->getThisType(getContext()),
> -                  /*NeedsCopy=*/false);
> +  CallArg ThisArg(RValue::get(This.getPointer()),
> D->getThisType(getContext()));
>
>    // Forward the parameters.
>    if (InheritedFromVBase &&
> @@ -2196,7 +2196,7 @@ void CodeGenFunction::EmitInlinedInherit
>    assert(Args.size() >= Params.size() && "too few arguments for call");
>    for (unsigned I = 0, N = Args.size(); I != N; ++I) {
>      if (I < Params.size() && isa<ImplicitParamDecl>(Params[I])) {
> -      const RValue &RV = Args[I].RV;
> +      const RValue &RV = Args[I].getRValue(*this);
>        assert(!RV.isComplex() && "complex indirect params not supported");
>        ParamValue Val = RV.isScalar()
>                             ? ParamValue::forDirect(RV.getScalarVal())
>
> Modified: cfe/trunk/lib/CodeGen/CGDecl.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGDecl.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGDecl.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGDecl.cpp Wed Mar  7 13:45:40 2018
> @@ -1882,6 +1882,22 @@ void CodeGenFunction::EmitParmDecl(const
>      llvm::Type *IRTy = ConvertTypeForMem(Ty)->getPointerTo(AS);
>      if (DeclPtr.getType() != IRTy)
>        DeclPtr = Builder.CreateBitCast(DeclPtr, IRTy, D.getName());
> +    // Indirect argument is in alloca address space, which may be
> different
> +    // from the default address space.
> +    auto AllocaAS = CGM.getASTAllocaAddressSpace();
> +    auto *V = DeclPtr.getPointer();
> +    auto SrcLangAS = getLangOpts().OpenCL ? LangAS::opencl_private :
> AllocaAS;
> +    auto DestLangAS =
> +        getLangOpts().OpenCL ? LangAS::opencl_private : LangAS::Default;
> +    if (SrcLangAS != DestLangAS) {
> +      assert(getContext().getTargetAddressSpace(SrcLangAS) ==
> +             CGM.getDataLayout().getAllocaAddrSpace());
> +      auto DestAS = getContext().getTargetAddressSpace(DestLangAS);
> +      auto *T = V->getType()->getPointerElementType()->
> getPointerTo(DestAS);
> +      DeclPtr = Address(getTargetHooks().performAddrSpaceCast(
> +                            *this, V, SrcLangAS, DestLangAS, T, true),
> +                        DeclPtr.getAlignment());
> +    }
>
>      // Push a destructor cleanup for this parameter if the ABI requires
> it.
>      // Don't push a cleanup in a thunk for a method that will also emit a
>
> Modified: cfe/trunk/lib/CodeGen/CGExprCXX.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGExprCXX.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGExprCXX.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGExprCXX.cpp Wed Mar  7 13:45:40 2018
> @@ -265,7 +265,7 @@ RValue CodeGenFunction::EmitCXXMemberOrO
>          // when it isn't necessary; just produce the proper effect here.
>          LValue RHS = isa<CXXOperatorCallExpr>(CE)
>                           ? MakeNaturalAlignAddrLValue(
> -                               (*RtlArgs)[0].RV.getScalarVal(),
> +                               (*RtlArgs)[0].getRValue(*this)
> .getScalarVal(),
>                                 (*(CE->arg_begin() + 1))->getType())
>                           : EmitLValue(*CE->arg_begin());
>          EmitAggregateAssign(This, RHS, CE->getType());
> @@ -1490,7 +1490,7 @@ static void EnterNewDeleteCleanup(CodeGe
>                                             AllocAlign);
>      for (unsigned I = 0, N = E->getNumPlacementArgs(); I != N; ++I) {
>        auto &Arg = NewArgs[I + NumNonPlacementArgs];
> -      Cleanup->setPlacementArg(I, Arg.RV, Arg.Ty);
> +      Cleanup->setPlacementArg(I, Arg.getRValue(CGF), Arg.Ty);
>      }
>
>      return;
> @@ -1521,8 +1521,8 @@ static void EnterNewDeleteCleanup(CodeGe
>                                                AllocAlign);
>    for (unsigned I = 0, N = E->getNumPlacementArgs(); I != N; ++I) {
>      auto &Arg = NewArgs[I + NumNonPlacementArgs];
> -    Cleanup->setPlacementArg(I, DominatingValue<RValue>::save(CGF,
> Arg.RV),
> -                             Arg.Ty);
> +    Cleanup->setPlacementArg(
> +        I, DominatingValue<RValue>::save(CGF, Arg.getRValue(CGF)),
> Arg.Ty);
>    }
>
>    CGF.initFullExprCleanup();
>
> Modified: cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/
> CGGPUBuiltin.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ============================================================
> ==================
> --- cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp Wed Mar  7 13:45:40 2018
> @@ -83,8 +83,9 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa
>                 /* ParamsToSkip = */ 0);
>
>    // We don't know how to emit non-scalar varargs.
> -  if (std::any_of(Args.begin() + 1, Args.end(),
> -                  [](const CallArg &A) { return !A.RV.isScalar(); })) {
> +  if (std::any_of(Args.begin() + 1, Args.end(), [&](const CallArg &A) {
> +        return !A.getRValue(*this).isScalar();
> +      })) {
>      CGM.ErrorUnsupported(E, "non-scalar arg to printf");
>      return RValue::get(llvm::ConstantInt::get(IntTy, 0));
>    }
> @@ -97,7 +98,7 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa
>    } else {
>      llvm::SmallVector<llvm::Type *, 8> ArgTypes;
>      for (unsigned I = 1, NumArgs = Args.size(); I < NumArgs; ++I)
> -      ArgTypes.push_back(Args[I].RV.getScalarVal()->getType());
> +      ArgTypes.push_back(Args[I].getRValue(*this).getScalarVal()->getType());
>
>      // Using llvm::StructType is correct only because printf doesn't accept
>      // aggregates.  If we had to handle aggregates here, we'd have to manually
> @@ -109,7 +110,7 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa
>
>      for (unsigned I = 1, NumArgs = Args.size(); I < NumArgs; ++I) {
>        llvm::Value *P = Builder.CreateStructGEP(AllocaTy, Alloca, I - 1);
> -      llvm::Value *Arg = Args[I].RV.getScalarVal();
> +      llvm::Value *Arg = Args[I].getRValue(*this).getScalarVal();
>        Builder.CreateAlignedStore(Arg, P, DL.getPrefTypeAlignment(Arg->getType()));
>      }
>      BufferPtr = Builder.CreatePointerCast(Alloca, llvm::Type::getInt8PtrTy(Ctx));
> @@ -117,6 +118,6 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa
>
>    // Invoke vprintf and return.
>    llvm::Function* VprintfFunc = GetVprintfDeclaration(CGM.getModule());
> -  return RValue::get(
> -      Builder.CreateCall(VprintfFunc, {Args[0].RV.getScalarVal(), BufferPtr}));
> +  return RValue::get(Builder.CreateCall(
> +      VprintfFunc, {Args[0].getRValue(*this).getScalarVal(), BufferPtr}));
>  }
>
> Modified: cfe/trunk/lib/CodeGen/CGObjCGNU.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGObjCGNU.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CGObjCGNU.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGObjCGNU.cpp Wed Mar  7 13:45:40 2018
> @@ -1441,7 +1441,7 @@ CGObjCGNU::GenerateMessageSend(CodeGenFu
>    }
>
>    // Reset the receiver in case the lookup modified it
> -  ActualArgs[0] = CallArg(RValue::get(Receiver), ASTIdTy, false);
> +  ActualArgs[0] = CallArg(RValue::get(Receiver), ASTIdTy);
>
>    imp = EnforceType(Builder, imp, MSI.MessengerType);
>
>
> Modified: cfe/trunk/lib/CodeGen/CGObjCMac.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGObjCMac.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CGObjCMac.cpp (original)
> +++ cfe/trunk/lib/CodeGen/CGObjCMac.cpp Wed Mar  7 13:45:40 2018
> @@ -1708,7 +1708,7 @@ struct NullReturnState {
>             e = Method->param_end(); i != e; ++i, ++I) {
>          const ParmVarDecl *ParamDecl = (*i);
>          if (ParamDecl->hasAttr<NSConsumedAttr>()) {
> -          RValue RV = I->RV;
> +          RValue RV = I->getRValue(CGF);
>            assert(RV.isScalar() &&
>                   "NullReturnState::complete - arg not on object");
>            CGF.EmitARCRelease(RV.getScalarVal(), ARCImpreciseLifetime);
> @@ -7071,7 +7071,7 @@ CGObjCNonFragileABIMac::EmitVTableMessag
>              CGF.getPointerAlign());
>
>    // Update the message ref argument.
> -  args[1].RV = RValue::get(mref.getPointer());
> +  args[1].setRValue(RValue::get(mref.getPointer()));
>
>    // Load the function to call from the message ref table.
>    Address calleeAddr =
>
> Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.h
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenFunction.h?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/CodeGenFunction.h (original)
> +++ cfe/trunk/lib/CodeGen/CodeGenFunction.h Wed Mar  7 13:45:40 2018
> @@ -3899,10 +3899,10 @@ private:
>    void ExpandTypeFromArgs(QualType Ty, LValue Dst,
>                            SmallVectorImpl<llvm::Value *>::iterator &AI);
>
> -  /// ExpandTypeToArgs - Expand an RValue \arg RV, with the LLVM type for \arg
> +  /// ExpandTypeToArgs - Expand an CallArg \arg Arg, with the LLVM type for \arg
>    /// Ty, into individual arguments on the provided vector \arg IRCallArgs,
>    /// starting at index \arg IRCallArgPos. See ABIArgInfo::Expand.
> -  void ExpandTypeToArgs(QualType Ty, RValue RV, llvm::FunctionType *IRFuncTy,
> +  void ExpandTypeToArgs(QualType Ty, CallArg Arg, llvm::FunctionType *IRFuncTy,
>                          SmallVectorImpl<llvm::Value *> &IRCallArgs,
>                          unsigned &IRCallArgPos);
>
>
> Modified: cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp (original)
> +++ cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp Wed Mar  7 13:45:40 2018
> @@ -1474,8 +1474,7 @@ CGCXXABI::AddedStructorArgs ItaniumCXXAB
>    llvm::Value *VTT =
>        CGF.GetVTTParameter(GlobalDecl(D, Type), ForVirtualBase, Delegating);
>    QualType VTTTy = getContext().getPointerType(getContext().VoidPtrTy);
> -  Args.insert(Args.begin() + 1,
> -              CallArg(RValue::get(VTT), VTTTy, /*needscopy=*/false));
> +  Args.insert(Args.begin() + 1, CallArg(RValue::get(VTT), VTTTy));
>    return AddedStructorArgs::prefix(1);  // Added one arg.
>  }
>
>
> Modified: cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp (original)
> +++ cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp Wed Mar  7 13:45:40 2018
> @@ -1537,8 +1537,7 @@ CGCXXABI::AddedStructorArgs MicrosoftCXX
>    }
>    RValue RV = RValue::get(MostDerivedArg);
>    if (FPT->isVariadic()) {
> -    Args.insert(Args.begin() + 1,
> -                CallArg(RV, getContext().IntTy, /*needscopy=*/false));
> +    Args.insert(Args.begin() + 1, CallArg(RV, getContext().IntTy));
>      return AddedStructorArgs::prefix(1);
>    }
>    Args.add(RV, getContext().IntTy);
>
> Added: cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp?rev=326946&view=auto
> ==============================================================================
> --- cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp (added)
> +++ cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp Wed Mar  7 13:45:40 2018
> @@ -0,0 +1,94 @@
> +// RUN: %clang_cc1 -O0 -triple amdgcn -emit-llvm %s -o - | FileCheck %s
> +
> +class A {
> +public:
> +  int x;
> +  A():x(0) {}
> +  ~A() {}
> +};
> +
> +class B {
> +int x[100];
> +};
> +
> +A g_a;
> +B g_b;
> +
> +void func_with_ref_arg(A &a);
> +void func_with_ref_arg(B &b);
> +
> +// CHECK-LABEL: define void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %a)
> +// CHECK:  %p = alloca %class.A*, align 8, addrspace(5)
> +// CHECK:  %[[r1:.+]] = addrspacecast %class.A* addrspace(5)* %p to %class.A**
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %a to %class.A*
> +// CHECK:  store %class.A* %[[r0]], %class.A** %[[r1]], align 8
> +void func_with_indirect_arg(A a) {
> +  A *p = &a;
> +}
> +
> +// CHECK-LABEL: define void @_Z22test_indirect_arg_autov()
> +// CHECK:  %a = alloca %class.A, align 4, addrspace(5)
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %a to %class.A*
> +// CHECK:  %agg.tmp = alloca %class.A, align 4, addrspace(5)
> +// CHECK:  %[[r1:.+]] = addrspacecast %class.A addrspace(5)* %agg.tmp to %class.A*
> +// CHECK:  call void @_ZN1AC1Ev(%class.A* %[[r0]])
> +// CHECK:  call void @llvm.memcpy.p0i8.p0i8.i64
> +// CHECK:  %[[r4:.+]] = addrspacecast %class.A* %[[r1]] to %class.A addrspace(5)*
> +// CHECK:  call void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %[[r4]])
> +// CHECK:  call void @_ZN1AD1Ev(%class.A* %[[r1]])
> +// CHECK:  call void @_Z17func_with_ref_argR1A(%class.A* dereferenceable(4) %[[r0]])
> +// CHECK:  call void @_ZN1AD1Ev(%class.A* %[[r0]])
> +void test_indirect_arg_auto() {
> +  A a;
> +  func_with_indirect_arg(a);
> +  func_with_ref_arg(a);
> +}
> +
> +// CHECK: define void @_Z24test_indirect_arg_globalv()
> +// CHECK:  %agg.tmp = alloca %class.A, align 4, addrspace(5)
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %agg.tmp to %class.A*
> +// CHECK:  call void @llvm.memcpy.p0i8.p0i8.i64
> +// CHECK:  %[[r2:.+]] = addrspacecast %class.A* %[[r0]] to %class.A addrspace(5)*
> +// CHECK:  call void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %[[r2]])
> +// CHECK:  call void @_ZN1AD1Ev(%class.A* %[[r0]])
> +// CHECK:  call void @_Z17func_with_ref_argR1A(%class.A* dereferenceable(4) addrspacecast (%class.A addrspace(1)* @g_a to %class.A*))
> +void test_indirect_arg_global() {
> +  func_with_indirect_arg(g_a);
> +  func_with_ref_arg(g_a);
> +}
> +
> +// CHECK-LABEL: define void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %b)
> +// CHECK:  %p = alloca %class.B*, align 8, addrspace(5)
> +// CHECK:  %[[r1:.+]] = addrspacecast %class.B* addrspace(5)* %p to %class.B**
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %b to %class.B*
> +// CHECK:  store %class.B* %[[r0]], %class.B** %[[r1]], align 8
> +void func_with_byval_arg(B b) {
> +  B *p = &b;
> +}
> +
> +// CHECK-LABEL: define void @_Z19test_byval_arg_autov()
> +// CHECK:  %b = alloca %class.B, align 4, addrspace(5)
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %b to %class.B*
> +// CHECK:  %agg.tmp = alloca %class.B, align 4, addrspace(5)
> +// CHECK:  %[[r1:.+]] = addrspacecast %class.B addrspace(5)* %agg.tmp to %class.B*
> +// CHECK:  call void @llvm.memcpy.p0i8.p0i8.i64
> +// CHECK:  %[[r4:.+]] = addrspacecast %class.B* %[[r1]] to %class.B addrspace(5)*
> +// CHECK:  call void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %[[r4]])
> +// CHECK:  call void @_Z17func_with_ref_argR1B(%class.B* dereferenceable(400) %[[r0]])
> +void test_byval_arg_auto() {
> +  B b;
> +  func_with_byval_arg(b);
> +  func_with_ref_arg(b);
> +}
> +
> +// CHECK-LABEL: define void @_Z21test_byval_arg_globalv()
> +// CHECK:  %agg.tmp = alloca %class.B, align 4, addrspace(5)
> +// CHECK:  %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %agg.tmp to %class.B*
> +// CHECK:  call void @llvm.memcpy.p0i8.p0i8.i64
> +// CHECK:  %[[r2:.+]] = addrspacecast %class.B* %[[r0]] to %class.B addrspace(5)*
> +// CHECK:  call void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %[[r2]])
> +// CHECK:  call void @_Z17func_with_ref_argR1B(%class.B* dereferenceable(400) addrspacecast (%class.B addrspace(1)* @g_b to %class.B*))
> +void test_byval_arg_global() {
> +  func_with_byval_arg(g_b);
> +  func_with_ref_arg(g_b);
> +}
>
> Modified: cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl (original)
> +++ cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl Wed Mar  7 13:45:40 2018
> @@ -1,5 +1,6 @@
>  // RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -ffake-address-space-map -triple i686-pc-darwin | FileCheck -enable-var-scope -check-prefixes=COM,X86 %s
> -// RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -triple amdgcn-amdhsa-amd | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN %s
> +// RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -triple amdgcn | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN %s
> +// RUN: %clang_cc1 %s -emit-llvm -o - -cl-std=CL2.0 -O0 -finclude-default-header -triple amdgcn | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN,AMDGCN20 %s
>
>  typedef struct {
>    int cells[9];
> @@ -35,6 +36,9 @@ struct LargeStructTwoMember {
>    int2 y[20];
>  };
>
> +#if __OPENCL_C_VERSION__ >= 200
> +struct LargeStructOneMember g_s;
> +#endif
>
>  // X86-LABEL: define void @foo(%struct.Mat4X4* noalias sret %agg.result, %struct.Mat3X3* byval align 4 %in)
>  // AMDGCN-LABEL: define %struct.Mat4X4 @foo([9 x i32] %in.coerce)
> @@ -80,10 +84,42 @@ void FuncOneMember(struct StructOneMembe
>  }
>
>  // AMDGCN-LABEL: define void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %u)
> +// AMDGCN-NOT: addrspacecast
> +// AMDGCN:   store <2 x i32> %{{.*}}, <2 x i32> addrspace(5)*
>  void FuncOneLargeMember(struct LargeStructOneMember u) {
>    u.x[0] = (int2)(0, 0);
>  }
>
> +// AMDGCN20-LABEL: define void @test_indirect_arg_globl()
> +// AMDGCN20:  %[[byval_temp:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5)
> +// AMDGCN20:  %[[r0:.*]] = bitcast %struct.LargeStructOneMember addrspace(5)* %[[byval_temp]] to i8 addrspace(5)*
> +// AMDGCN20:  call void @llvm.memcpy.p5i8.p1i8.i64(i8 addrspace(5)* align 8 %[[r0]], i8 addrspace(1)* align 8 bitcast (%struct.LargeStructOneMember addrspace(1)* @g_s to i8 addrspace(1)*), i64 800, i1 false)
> +// AMDGCN20:  call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[byval_temp]])
> +#if __OPENCL_C_VERSION__ >= 200
> +void test_indirect_arg_globl(void) {
> +  FuncOneLargeMember(g_s);
> +}
> +#endif
> +
> +// AMDGCN-LABEL: define amdgpu_kernel void @test_indirect_arg_local()
> +// AMDGCN: %[[byval_temp:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5)
> +// AMDGCN: %[[r0:.*]] = bitcast %struct.LargeStructOneMember addrspace(5)* %[[byval_temp]] to i8 addrspace(5)*
> +// AMDGCN: call void @llvm.memcpy.p5i8.p3i8.i64(i8 addrspace(5)* align 8 %[[r0]], i8 addrspace(3)* align 8 bitcast (%struct.LargeStructOneMember addrspace(3)* @test_indirect_arg_local.l_s to i8 addrspace(3)*), i64 800, i1 false)
> +// AMDGCN: call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[byval_temp]])
> +kernel void test_indirect_arg_local(void) {
> +  local struct LargeStructOneMember l_s;
> +  FuncOneLargeMember(l_s);
> +}
> +
> +// AMDGCN-LABEL: define void @test_indirect_arg_private()
> +// AMDGCN: %[[p_s:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5)
> +// AMDGCN-NOT: @llvm.memcpy
> +// AMDGCN-NEXT: call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[p_s]])
> +void test_indirect_arg_private(void) {
> +  struct LargeStructOneMember p_s;
> +  FuncOneLargeMember(p_s);
> +}
> +
>  // AMDGCN-LABEL: define amdgpu_kernel void @KernelOneMember
>  // AMDGCN-SAME:  (<2 x i32> %[[u_coerce:.*]])
>  // AMDGCN:  %[[u:.*]] = alloca %struct.StructOneMember, align 8, addrspace(5)
> @@ -112,7 +148,6 @@ void FuncLargeTwoMember(struct LargeStru
>    u.y[0] = (int2)(0, 0);
>  }
>
> -
>  // AMDGCN-LABEL: define amdgpu_kernel void @KernelTwoMember
>  // AMDGCN-SAME:  (%struct.StructTwoMember %[[u_coerce:.*]])
>  // AMDGCN:  %[[u:.*]] = alloca %struct.StructTwoMember, align 8, addrspace(5)
>
> Modified: cfe/trunk/test/CodeGenOpenCL/byval.cl
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenOpenCL/byval.cl?rev=326946&r1=326945&r2=326946&view=diff
> ==============================================================================
> --- cfe/trunk/test/CodeGenOpenCL/byval.cl (original)
> +++ cfe/trunk/test/CodeGenOpenCL/byval.cl Wed Mar  7 13:45:40 2018
> @@ -1,5 +1,4 @@
>  // RUN: %clang_cc1 -emit-llvm -o - -triple amdgcn %s | FileCheck %s
> -// RUN: %clang_cc1 -emit-llvm -o - -triple amdgcn---opencl %s | FileCheck %s
>
>  struct A {
>    int x[100];
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
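
For readers skimming the quoted patch: nearly every call-site change follows one pattern. Direct reads of `Arg.RV` become `Arg.getRValue(CGF)`, and direct writes become `setRValue(...)`. Below is a minimal, self-contained sketch of that accessor pattern. All `Toy*` names are hypothetical stand-ins invented for illustration; they are not the clang::CodeGen types, and the sketch only assumes that an argument can hold either a ready value or a location to load from.

```cpp
#include <cassert>

// Hypothetical stand-ins, NOT the real clang::CodeGen::RValue / LValue.
struct ToyRValue { int Scalar; };        // an already-computed value
struct ToyLValue { const int *Addr; };   // "a load from this address"

// Hypothetical stand-in for the emitting function; it counts emitted loads.
struct ToyCodeGen {
  int LoadsEmitted = 0;
  ToyRValue load(ToyLValue LV) {
    ++LoadsEmitted;
    return ToyRValue{*LV.Addr};
  }
};

// An argument that stores either form and converts on demand.
class ToyCallArg {
  union {
    ToyRValue RV;
    ToyLValue LV;
  };
  bool HasLV;

public:
  ToyCallArg(ToyRValue R) : RV(R), HasLV(false) {}
  ToyCallArg(ToyLValue L) : LV(L), HasLV(true) {}

  bool hasLValue() const { return HasLV; }

  // Callers can no longer read a public RV field; they must pass the
  // emitting context because a load may have to be emitted first.
  ToyRValue getRValue(ToyCodeGen &CGF) const {
    return HasLV ? CGF.load(LV) : RV;
  }

  // In this sketch, overwriting is only allowed for the value form.
  void setRValue(ToyRValue R) {
    assert(!HasLV && "sketch: cannot overwrite an l-value argument");
    RV = R;
  }
};
```

This also shows why the `std::any_of` lambda in CGGPUBuiltin.cpp changed its capture from `[]` to `[&]`: the accessor needs the emitting function (`*this`) as context, so the lambda has to capture it.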