[cfe-commits] [PATCH] atomic operation builtins, part 1
Jeffrey Yasskin
jyasskin at google.com
Fri Oct 7 15:58:41 PDT 2011
Oh, tests and additions to
http://clang.llvm.org/docs/LanguageExtensions.html#builtins would also
be appreciated.
On Fri, Oct 7, 2011 at 3:53 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> On Thu, Oct 6, 2011 at 6:36 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> Atomic ops; similar to the patch I submitted earlier, but this version
>> adds the restriction that the first operand to the __atomic_*
>> operations must be an _Atomic(T)*. I'm sending this to double-check
>> that I haven't missed any serious issues.
>>
>> After this is committed, I'll put together a patch for
>> __atomic_is_lock_free, including the necessary machinery to come up
>> with the right answers.
>>
>
> Comments/questions inline:
>
>> Index: include/clang/Basic/Builtins.def
>> ===================================================================
>> --- include/clang/Basic/Builtins.def (revision 141335)
>> +++ include/clang/Basic/Builtins.def (working copy)
>> @@ -585,8 +585,19 @@
>> BUILTIN(__sync_swap_8, "LLiLLiD*LLi.", "n")
>> BUILTIN(__sync_swap_16, "LLLiLLLiD*LLLi.", "n")
>>
>> +BUILTIN(__atomic_load, "v.", "t")
>> +BUILTIN(__atomic_store, "v.", "t")
>> +BUILTIN(__atomic_exchange, "v.", "t")
>> +BUILTIN(__atomic_compare_exchange_strong, "v.", "t")
>> +BUILTIN(__atomic_compare_exchange_weak, "v.", "t")
>> +BUILTIN(__atomic_fetch_add, "v.", "t")
>> +BUILTIN(__atomic_fetch_sub, "v.", "t")
>> +BUILTIN(__atomic_fetch_and, "v.", "t")
>> +BUILTIN(__atomic_fetch_or, "v.", "t")
>> +BUILTIN(__atomic_fetch_xor, "v.", "t")
>> +BUILTIN(__atomic_thread_fence, "vi", "t")
>> +BUILTIN(__atomic_signal_fence, "vi", "t")
>
> I believe gcc is using __sync_mem_* instead of __atomic_* for their
> builtins, although they're using __atomic_* for the library that
> handles maybe-not-lock-free calls:
> http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary. I've cc'ed Andrew
> MacLeod who's actually deciding this, in the hope that both compilers
> can use the same intrinsics.
>
> There's also some chance we'll eventually want to provide builtins to
> access the singlethread synchronization scope (or other
> synchronization scopes we haven't thought of a need for yet).
> (http://llvm.org/docs/LangRef.html#ordering for Andrew.) Do you want
> to include that parameter in these builtins, or wait, and maybe need
> to add another set?
>
>>
>> -
>> // Non-overloaded atomic builtins.
>> BUILTIN(__sync_synchronize, "v.", "n")
>> // GCC does not support these, they are a Clang extension.
>> Index: include/clang/Basic/DiagnosticSemaKinds.td
>> ===================================================================
>> --- include/clang/Basic/DiagnosticSemaKinds.td (revision 141335)
>> +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
>> @@ -4000,6 +4000,15 @@
>> def err_atomic_builtin_pointer_size : Error<
>> "first argument to atomic builtin must be a pointer to 1,2,4,8 or 16 byte "
>> "type (%0 invalid)">;
>> +def err_atomic_builtin_needs_atomic : Error<
>> + "first argument to atomic builtin must be a pointer to atomic "
>> + " type (%0 invalid)">;
>
> Maybe s/atomic/_Atomic/. Otherwise C++ programmers who see this are
> likely to try passing std::atomic<T> instances.
>
>> +def err_atomic_builtin_needs_atomic_int_or_ptr : Error<
>> + "first argument to atomic builtin must be a pointer to atomic "
>> + " integer or pointer (%0 invalid)">;
>> +def err_atomic_builtin_logical_needs_atomic_int : Error<
>> + "first argument to logical atomic builtin must be a pointer to atomic "
>> + " integer (%0 invalid)">;
>>
>> def err_deleted_function_use : Error<"attempt to use a deleted function">;
>>
>> Index: include/clang/AST/Expr.h
>> ===================================================================
>> --- include/clang/AST/Expr.h (revision 141335)
>> +++ include/clang/AST/Expr.h (working copy)
>> @@ -4157,6 +4157,128 @@
>> // Iterators
>> child_range children() { return child_range(&SrcExpr, &SrcExpr+1); }
>> };
>> +
>> +/// AtomicExpr - Variadic atomic builtins: __atomic_exchange, __atomic_fetch_*,
>> +/// __atomic_load, __atomic_store, and __atomic_compare_exchange_*, for the
>> +/// similarly-named C++0x instructions. All of these instructions take one
>> +/// primary pointer and at least one memory order.
>> +class AtomicExpr : public Expr {
>> +public:
>> + enum AtomicOp { Load, Store, CmpXchgStrong, CmpXchgWeak, Xchg,
>> + Add, Sub, And, Or, Xor };
>> +private:
>> + enum { PTR, ORDER, VAL1, ORDER_FAIL, VAL2, END_EXPR };
>> + Stmt* SubExprs[END_EXPR];
>> + unsigned NumSubExprs;
>> + SourceLocation BuiltinLoc, RParenLoc;
>> + AtomicOp Op;
>> +
>> +public:
>> + // Constructor for Load
>> + AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *order, QualType t,
>> + AtomicOp op, SourceLocation RP,
>> + bool TypeDependent, bool ValueDependent)
>> + : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> + TypeDependent, ValueDependent,
>> + ptr->isInstantiationDependent(),
>> + ptr->containsUnexpandedParameterPack()),
>> + BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> + SubExprs[PTR] = ptr;
>> + SubExprs[ORDER] = order;
>> + NumSubExprs = 2;
>> + }
>
> Hmm, normally I'd ask for a CreateLoad() static method that forwarded
> to an appropriate private constructor, since it may be hard to
> remember which constructor goes with which operation, but it looks
> like there's exactly one call site for each of these constructors, so
> it's probably not worth it. Maybe add an assert on 'op' instead?
>
>> +
>> + // Constructor for Store, Xchg, Add, Sub, And, Or, Xor
>> + AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *val, Expr *order,
>> + QualType t, AtomicOp op, SourceLocation RP,
>> + bool TypeDependent, bool ValueDependent)
>> + : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> + TypeDependent, ValueDependent,
>> + (ptr->isInstantiationDependent() ||
>> + val->isInstantiationDependent()),
>> + (ptr->containsUnexpandedParameterPack() ||
>> + val->containsUnexpandedParameterPack())),
>> + BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> + SubExprs[PTR] = ptr;
>> + SubExprs[ORDER] = order;
>> + SubExprs[VAL1] = val;
>> + NumSubExprs = 3;
>> + }
>> +
>> + // Constructor for CmpXchgStrong, CmpXchgWeak
>> + AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *val1, Expr *val2,
>> + Expr *order, Expr *order_fail, QualType t, AtomicOp op,
>> + SourceLocation RP, bool TypeDependent, bool ValueDependent)
>> + : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> + TypeDependent, ValueDependent,
>> + (ptr->isInstantiationDependent() ||
>> + val1->isInstantiationDependent() ||
>> + val2->isInstantiationDependent()),
>> + (ptr->containsUnexpandedParameterPack() ||
>> + val1->containsUnexpandedParameterPack() ||
>> + val2->containsUnexpandedParameterPack())),
>> + BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> + SubExprs[PTR] = ptr;
>> + SubExprs[VAL1] = val1;
>> + SubExprs[ORDER] = order;
>> + SubExprs[VAL2] = val2;
>> + SubExprs[ORDER_FAIL] = order_fail;
>> + NumSubExprs = 5;
>> + }
>> +
>> + /// \brief Build an empty __builtin_choose_expr.
>
> Is this a mis-copy? If not, I don't understand the comment.
>
>> + explicit AtomicExpr(EmptyShell Empty) : Expr(AtomicExprClass, Empty) { }
>> +
>> + Expr *getPtr() const { return cast<Expr>(SubExprs[PTR]); }
>> + void setPtr(Expr *E) { SubExprs[PTR] = E; }
>> + Expr *getOrder() const { return cast<Expr>(SubExprs[ORDER]); }
>> + void setOrder(Expr *E) { SubExprs[ORDER] = E; }
>> + Expr *getVal1() const { return cast<Expr>(SubExprs[VAL1]); }
>> + void setVal1(Expr *E) { SubExprs[VAL1] = E; }
>> + Expr *getOrderFail() const { return cast<Expr>(SubExprs[ORDER_FAIL]); }
>> + void setOrderFail(Expr *E) { SubExprs[ORDER_FAIL] = E; }
>> + Expr *getVal2() const { return cast<Expr>(SubExprs[VAL2]); }
>> + void setVal2(Expr *E) { SubExprs[VAL2] = E; }
>> +
>> + AtomicOp getOp() const { return Op; }
>> + void setOp(AtomicOp op) { Op = op; }
>> + unsigned getNumSubExprs() { return NumSubExprs; }
>> + void setNumSubExprs(unsigned num) { NumSubExprs = num; }
>> +
>> + int getOrderVal(ASTContext &Ctx) const {
>> + return getOrder()->EvaluateAsInt(Ctx).getZExtValue();
>> + }
>> + int getOrderFailVal(ASTContext &Ctx) const {
>
> Want to assert that this is a cmpxchg call?
>
>> + return getOrderFail()->EvaluateAsInt(Ctx).getZExtValue();
>> + }
>> + bool isVolatile() const {
>> + return getPtr()->getType()->getPointeeType().isVolatileQualified();
>> + }
>> +
>> + bool isCmpXChg() const {
>> + return getOp() == AtomicExpr::CmpXchgStrong ||
>> + getOp() == AtomicExpr::CmpXchgWeak;
>> + }
>> +
>> + SourceLocation getBuiltinLoc() const { return BuiltinLoc; }
>> + void setBuiltinLoc(SourceLocation L) { BuiltinLoc = L; }
>> +
>> + SourceLocation getRParenLoc() const { return RParenLoc; }
>> + void setRParenLoc(SourceLocation L) { RParenLoc = L; }
>> +
>> + SourceRange getSourceRange() const {
>> + return SourceRange(BuiltinLoc, RParenLoc);
>> + }
>> + static bool classof(const Stmt *T) {
>> + return T->getStmtClass() == AtomicExprClass;
>> + }
>> + static bool classof(const AtomicExpr *) { return true; }
>> +
>> + // Iterators
>> + child_range children() {
>> + return child_range(SubExprs, SubExprs+NumSubExprs);
>> + }
>> +};
>> } // end namespace clang
>>
>> #endif
>> Index: lib/Sema/SemaChecking.cpp
>> ===================================================================
>> --- lib/Sema/SemaChecking.cpp (revision 141335)
>> +++ lib/Sema/SemaChecking.cpp (working copy)
>> @@ -414,6 +437,141 @@
>> return false;
>> }
>>
>> +ExprResult
>> +Sema::SemaAtomicOpsOverloaded(ExprResult TheCallResult, AtomicExpr::AtomicOp Op) {
>> + CallExpr *TheCall = (CallExpr *)TheCallResult.get();
>
> Shouldn't you use cast<CallExpr>?
>
>> + DeclRefExpr *DRE =cast<DeclRefExpr>(TheCall->getCallee()->IgnoreParenCasts());
>> + Expr *Ptr, *Order, *Val1, *Val2, *OrderFail;
>> +
>> + unsigned NumVals = 1;
>> + unsigned NumOrders = 1;
>
> These probably deserve a comment about the layout of the builtin call. Like:
>
> // __atomic_call(_Atomic(T)* [, U]*NumVals [, int]*NumOrders);
> // where the 'U's are determined by the call and T, and the ints
> // represent memory_order parameters.
>
>> + if (Op == AtomicExpr::Load) {
>> + NumVals = 0;
>> + } else if (Op == AtomicExpr::CmpXchgWeak || Op == AtomicExpr::CmpXchgStrong) {
>> + NumVals = 2;
>> + NumOrders = 2;
>> + }
>> +
>> + if (TheCall->getNumArgs() < NumVals+NumOrders+1) {
>> + Diag(TheCall->getLocEnd(), diag::err_typecheck_call_too_few_args)
>> + << 0 << NumVals+NumOrders+1 << TheCall->getNumArgs()
>> + << TheCall->getCallee()->getSourceRange();
>> + return ExprError();
>> + } else if (TheCall->getNumArgs() > NumVals+NumOrders+1) {
>> + Diag(TheCall->getArg(2)->getLocStart(),
>
> Why "getArg(2)"? Should this be getArg(NumVals+NumOrders+1)?
>
>> + diag::err_typecheck_call_too_many_args)
>> + << 0 << NumVals+NumOrders+1 << TheCall->getNumArgs()
>> + << TheCall->getCallee()->getSourceRange();
>> + return ExprError();
>> + }
>> +
>> + // Inspect the first argument of the atomic builtin. This should always be
>> + // a pointer type, whose element is an integral scalar or pointer type.
>
> The pointer's element should be an atomic type whose element may be
> constrained by the operation, right?
>
>> + Ptr = TheCall->getArg(0);
>> + Ptr = DefaultFunctionArrayLvalueConversion(Ptr).get();
>> + const PointerType *pointerType = Ptr->getType()->getAs<PointerType>();
>> + if (!pointerType) {
>> + Diag(DRE->getLocStart(), diag::err_atomic_builtin_must_be_pointer)
>> + << Ptr->getType() << Ptr->getSourceRange();
>> + return ExprError();
>> + }
>> +
>> + QualType AtomTy = pointerType->getPointeeType();
>> + if (!AtomTy->isAtomicType()) {
>> + Diag(DRE->getLocStart(), diag::err_atomic_builtin_needs_atomic)
>> + << Ptr->getType() << Ptr->getSourceRange();
>> + return ExprError();
>> + }
>> + QualType ValType = cast<AtomicType>(AtomTy)->getValueType();
>> +
>> + if ((Op == AtomicExpr::Add || Op == AtomicExpr::Sub) &&
>> + !ValType->isIntegerType() && !ValType->isAnyPointerType() &&
>> + !ValType->isBlockPointerType()) {
>
> Your indentation's weird here.
>
>> + Diag(DRE->getLocStart(), diag::err_atomic_builtin_needs_atomic_int_or_ptr)
>> + << Ptr->getType() << Ptr->getSourceRange();
>> + return ExprError();
>> + }
>> +
>> + if (!ValType->isIntegerType() &&
>> + (Op == AtomicExpr::And || Op == AtomicExpr::Or || Op == AtomicExpr::Xor)){
>> + Diag(DRE->getLocStart(), diag::err_atomic_builtin_logical_needs_atomic_int)
>> + << Ptr->getType() << Ptr->getSourceRange();
>> + return ExprError();
>> + }
>> +
>> + switch (ValType.getObjCLifetime()) {
>> + case Qualifiers::OCL_None:
>> + case Qualifiers::OCL_ExplicitNone:
>> + // okay
>> + break;
>> +
>> + case Qualifiers::OCL_Weak:
>> + case Qualifiers::OCL_Strong:
>> + case Qualifiers::OCL_Autoreleasing:
>> + Diag(DRE->getLocStart(), diag::err_arc_atomic_ownership)
>> + << ValType << Ptr->getSourceRange();
>> + return ExprError();
>> + }
>> +
>> + QualType ResultType = ValType;
>> + if (Op == AtomicExpr::Store)
>> + ResultType = Context.VoidTy;
>> + else if (Op == AtomicExpr::CmpXchgWeak || Op == AtomicExpr::CmpXchgStrong)
>> + ResultType = Context.BoolTy;
>> +
>> + // The first argument --- the pointer --- has a fixed type; we
>> + // deduce the types of the rest of the arguments accordingly. Walk
>> + // the remaining arguments, converting them to the deduced value type.
>> + for (unsigned i = 0; i != NumVals+NumOrders; ++i) {
>> + ExprResult Arg = TheCall->getArg(i+1);
>> + QualType Ty;
>> + if (i < NumVals) {
>> + if (i == 0 && (Op == AtomicExpr::CmpXchgWeak ||
>
> It's a little confusing for 'i' to not represent the i'th parameter.
>
>> + Op == AtomicExpr::CmpXchgStrong))
>> + Ty = Context.getPointerType(ValType.getUnqualifiedType());
>> + else if (!ValType->isIntegerType() &&
>> + (Op == AtomicExpr::Add || Op == AtomicExpr::Sub))
>> + Ty = Context.getPointerDiffType();
>> + else
>> + Ty = ValType;
>> + } else {
>
> Could you add a comment like "// The remaining parameters are
> memory_orders, which are represented by ints."?
>
> Something should check the constraints on memory orders, like that
> cmpxchg's failure order has to be weaker than its success order, or
> that you can't have an acquire store. Is that this code, the <atomic>
> header, or codegen? You might also comment here that it's codegen
> that maps from consume to acquire.
>
>> + Ty = Context.IntTy;
>> + }
>> + InitializedEntity Entity =
>> + InitializedEntity::InitializeParameter(Context, Ty, false);
>> + Arg = PerformCopyInitialization(Entity, SourceLocation(), Arg);
>> + if (Arg.isInvalid())
>> + return true;
>> + TheCall->setArg(i+1, Arg.get());
>> + }
>> +
>> + if (Op == AtomicExpr::Load) {
>> + Order = TheCall->getArg(1);
>> + return Owned(new (Context) AtomicExpr(TheCall->getCallee()->getLocStart(),
>> + Ptr, Order, ResultType, Op,
>> + TheCall->getRParenLoc(), false,
>> + false));
>
> If AtomicExprs are never type- or value-dependent, why have those two
> parameters?
>
>> + } else if (Op != AtomicExpr::CmpXchgWeak && Op != AtomicExpr::CmpXchgStrong) {
>> + Val1 = TheCall->getArg(1);
>> + Order = TheCall->getArg(2);
>> + return Owned(new (Context) AtomicExpr(TheCall->getCallee()->getLocStart(),
>> + Ptr, Val1, Order, ResultType, Op,
>> + TheCall->getRParenLoc(), false,
>> + false));
>> + } else {
>> + Val1 = TheCall->getArg(1);
>> + Val2 = TheCall->getArg(2);
>> + Order = TheCall->getArg(3);
>> + OrderFail = TheCall->getArg(4);
>> + return Owned(new (Context) AtomicExpr(TheCall->getCallee()->getLocStart(),
>> + Ptr, Val1, Val2, Order, OrderFail,
>> + ResultType, Op,
>> + TheCall->getRParenLoc(), false,
>> + false));
>> + }
>> +}
>> +
>> +
>> /// checkBuiltinArgument - Given a call to a builtin function, perform
>> /// normal type-checking on the given argument, updating the call in
>> /// place. This is useful when a builtin function requires custom
>> Index: lib/CodeGen/CGExpr.cpp
>> ===================================================================
>> --- lib/CodeGen/CGExpr.cpp (revision 141335)
>> +++ lib/CodeGen/CGExpr.cpp (working copy)
>> @@ -2478,3 +2478,266 @@
>>
>> return MakeAddrLValue(AddV, MPT->getPointeeType());
>> }
>> +
>> +static void
>> +EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, llvm::Value *Dest,
>> + llvm::Value *Ptr, llvm::Value *Val1, llvm::Value *Val2,
>> + uint64_t Size, unsigned Align, llvm::AtomicOrdering Order) {
>> + if (E->isCmpXChg()) {
>> + // Note that cmpxchg only supports specifying one ordering and
>> + // doesn't support weak cmpxchg, at least at the moment.
>> + llvm::LoadInst *LoadVal1 = CGF.Builder.CreateLoad(Val1);
>> + LoadVal1->setAlignment(Align);
>> + llvm::LoadInst *LoadVal2 = CGF.Builder.CreateLoad(Val2);
>> + LoadVal2->setAlignment(Align);
>> + llvm::AtomicCmpXchgInst *CXI =
>> + CGF.Builder.CreateAtomicCmpXchg(Ptr, LoadVal1, LoadVal2, Order);
>> + CXI->setVolatile(E->isVolatile());
>> + llvm::StoreInst *StoreVal1 = CGF.Builder.CreateStore(CXI, Val1);
>> + StoreVal1->setAlignment(Align);
>> + llvm::Value *Cmp = CGF.Builder.CreateICmpEQ(CXI, LoadVal1);
>> + CGF.EmitStoreOfScalar(Cmp, CGF.MakeAddrLValue(Dest, E->getType()));
>> + return;
>> + }
>> +
>> + if (E->getOp() == AtomicExpr::Load) {
>> + llvm::LoadInst *Load = CGF.Builder.CreateLoad(Ptr);
>> + Load->setAtomic(Order);
>> + Load->setAlignment(Size);
>> + Load->setVolatile(E->isVolatile());
>> + llvm::StoreInst *StoreDest = CGF.Builder.CreateStore(Load, Dest);
>> + StoreDest->setAlignment(Align);
>> + return;
>> + }
>> +
>> + if (E->getOp() == AtomicExpr::Store) {
>> + assert(!Dest && "Store does not return a value");
>> + llvm::LoadInst *LoadVal1 = CGF.Builder.CreateLoad(Val1);
>> + LoadVal1->setAlignment(Align);
>> + llvm::StoreInst *Store = CGF.Builder.CreateStore(LoadVal1, Ptr);
>> + Store->setAtomic(Order);
>> + Store->setAlignment(Size);
>> + Store->setVolatile(E->isVolatile());
>> + return;
>> + }
>> +
>> + llvm::AtomicRMWInst::BinOp Op = llvm::AtomicRMWInst::Add;
>
> This is an odd default.
>
>> + switch (E->getOp()) {
>> + case AtomicExpr::CmpXchgWeak:
>> + case AtomicExpr::CmpXchgStrong:
>> + case AtomicExpr::Store:
>> + case AtomicExpr::Load: assert(0 && "Already handled!");
>> + case AtomicExpr::Add: Op = llvm::AtomicRMWInst::Add; break;
>> + case AtomicExpr::Sub: Op = llvm::AtomicRMWInst::Sub; break;
>> + case AtomicExpr::And: Op = llvm::AtomicRMWInst::And; break;
>> + case AtomicExpr::Or: Op = llvm::AtomicRMWInst::Or; break;
>> + case AtomicExpr::Xor: Op = llvm::AtomicRMWInst::Xor; break;
>> + case AtomicExpr::Xchg: Op = llvm::AtomicRMWInst::Xchg; break;
>> + }
>> + llvm::LoadInst *LoadVal1 = CGF.Builder.CreateLoad(Val1);
>> + LoadVal1->setAlignment(Align);
>> + llvm::AtomicRMWInst *RMWI =
>> + CGF.Builder.CreateAtomicRMW(Op, Ptr, LoadVal1, Order);
>> + RMWI->setVolatile(E->isVolatile());
>> + llvm::StoreInst *StoreDest = CGF.Builder.CreateStore(RMWI, Dest);
>> + StoreDest->setAlignment(Align);
>> +}
>> +
>> +static llvm::Value *
>> +EmitValToTemp(CodeGenFunction &CGF, Expr *E) {
>
> Please comment how this is different from EmitScalarExpr.
>
>> + llvm::Value *DeclPtr = CGF.CreateMemTemp(E->getType(), ".atomictmp");
>> + CGF.EmitAnyExprToMem(E, DeclPtr, E->getType().getQualifiers(),
>> + /*Init*/ true);
>> + return DeclPtr;
>> +}
>> +
>> +static RValue ConvertTempToRValue(CodeGenFunction &CGF, QualType Ty,
>> + llvm::Value *Dest) {
>> + if (Ty->isAnyComplexType())
>> + return RValue::getComplex(CGF.LoadComplexFromAddr(Dest, false));
>> + if (CGF.hasAggregateLLVMType(Ty))
>> + return RValue::getAggregate(Dest);
>> + return RValue::get(CGF.EmitLoadOfScalar(CGF.MakeAddrLValue(Dest, Ty)));
>> +}
>> +
>> +RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E, llvm::Value *Dest) {
>> + QualType PointeeTy = E->getPtr()->getType()->getPointeeType();
>> + CharUnits sizeChars = getContext().getTypeSizeInChars(PointeeTy);
>> + uint64_t Size = sizeChars.getQuantity();
>> + CharUnits alignChars = getContext().getTypeSizeInChars(PointeeTy);
>
> Why call this twice?
>
>> + unsigned Align = alignChars.getQuantity();
>> + // FIXME: Bound on Size should not be hardcoded.
>> + bool UseLibcall = (!llvm::isPowerOf2_64(Size) || Size > 8);
>> +
>> + llvm::Value *Ptr, *Order, *OrderFail = 0, *Val1 = 0, *Val2 = 0;
>> + Ptr = EmitScalarExpr(E->getPtr());
>> + Order = EmitScalarExpr(E->getOrder());
>> + if (E->isCmpXChg()) {
>> + Val1 = EmitScalarExpr(E->getVal1());
>> + Val2 = EmitValToTemp(*this, E->getVal2());
>> + OrderFail = EmitScalarExpr(E->getOrderFail());
>> + (void)OrderFail; // OrderFail is unused at the moment
>> + } else if (E->getOp() != AtomicExpr::Load) {
>> + Val1 = EmitValToTemp(*this, E->getVal1());
>> + }
>> +
>> + llvm::Type *PointeeLLVMTy =
>> + cast<llvm::PointerType>(Ptr->getType())->getElementType();
>> +
>> + llvm::Type *InstrReturnTy = PointeeLLVMTy;
>
> s/InstrReturnTy/CallReturnTy/ or something similar, since this is not
> the return type of the cmpxchg instructions.
>
>> + if (E->isCmpXChg())
>> + InstrReturnTy = Builder.getInt1Ty();
>> + else if (E->getOp() == AtomicExpr::Store)
>> + InstrReturnTy = 0;
>> +
>> + if (InstrReturnTy && !Dest)
>> + Dest = CreateMemTemp(E->getType(), ".atomicdst");
>> +
>> + if (UseLibcall) {
>> + // FIXME: Finalize what the libcalls are actually supposed to look like.
>
> Possibly link to http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary here.
>
>> + return EmitUnsupportedRValue(E, "atomic library call");
>> + }
>> +#if 0
>> + if (UseLibcall) {
>> + const char* LibCallName;
>> + switch (E->getOp()) {
>> + case AtomicExpr::CmpXchgWeak:
>> + LibCallName = "__atomic_compare_exchange_generic"; break;
>> + case AtomicExpr::CmpXchgStrong:
>> + LibCallName = "__atomic_compare_exchange_generic"; break;
>> + case AtomicExpr::Add: LibCallName = "__atomic_fetch_add_generic"; break;
>> + case AtomicExpr::Sub: LibCallName = "__atomic_fetch_sub_generic"; break;
>> + case AtomicExpr::And: LibCallName = "__atomic_fetch_and_generic"; break;
>> + case AtomicExpr::Or: LibCallName = "__atomic_fetch_or_generic"; break;
>> + case AtomicExpr::Xor: LibCallName = "__atomic_fetch_xor_generic"; break;
>> + case AtomicExpr::Xchg: LibCallName = "__atomic_exchange_generic"; break;
>> + case AtomicExpr::Store: LibCallName = "__atomic_store_generic"; break;
>> + case AtomicExpr::Load: LibCallName = "__atomic_load_generic"; break;
>> + }
>> + llvm::SmallVector<QualType, 4> Params;
>> + CallArgList Args;
>> + QualType RetTy = getContext().VoidTy;
>> + if (E->getOp() != AtomicExpr::Store && !E->isCmpXChg())
>> + Args.add(RValue::get(EmitCastToVoidPtr(Dest)),
>> + getContext().VoidPtrTy);
>> + Args.add(RValue::get(EmitCastToVoidPtr(Ptr)),
>> + getContext().VoidPtrTy);
>> + if (E->getOp() != AtomicExpr::Load)
>> + Args.add(RValue::get(EmitCastToVoidPtr(Val1)),
>> + getContext().VoidPtrTy);
>> + if (E->isCmpXChg()) {
>> + Args.add(RValue::get(EmitCastToVoidPtr(Val2)),
>> + getContext().VoidPtrTy);
>> + RetTy = getContext().IntTy;
>> + }
>> + Args.add(RValue::get(llvm::ConstantInt::get(SizeTy, Size)),
>> + getContext().getSizeType());
>> + const CGFunctionInfo &FuncInfo =
>> + CGM.getTypes().getFunctionInfo(RetTy, Args, FunctionType::ExtInfo());
>> + llvm::FunctionType *FTy = CGM.getTypes().GetFunctionType(FuncInfo, false);
>> + llvm::Constant *Func = CGM.CreateRuntimeFunction(FTy, LibCallName);
>> + RValue Res = EmitCall(FuncInfo, Func, ReturnValueSlot(), Args);
>> + if (E->isCmpXChg())
>> + return Res;
>> + if (E->getOp() == AtomicExpr::Store)
>> + return RValue::get(0);
>> + return ConvertTempToRValue(*this, E->getType(), Dest);
>> + }
>> +#endif
>> + llvm::Type *IPtrTy =
>> + llvm::IntegerType::get(getLLVMContext(), Size * 8)->getPointerTo();
>> + llvm::Value *OrigDest = Dest;
>> + Ptr = Builder.CreateBitCast(Ptr, IPtrTy);
>> + if (Val1) Val1 = Builder.CreateBitCast(Val1, IPtrTy);
>> + if (Val2) Val2 = Builder.CreateBitCast(Val2, IPtrTy);
>> + if (Dest && !E->isCmpXChg()) Dest = Builder.CreateBitCast(Dest, IPtrTy);
>> +
>> + if (isa<llvm::ConstantInt>(Order)) {
>> + int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();
>> + switch (ord) {
>> + case 0: // memory_order_relaxed
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Monotonic);
>> + break;
>> + case 1: // memory_order_consume
>> + case 2: // memory_order_acquire
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Acquire);
>> + break;
>> + case 3: // memory_order_release
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Release);
>> + break;
>> + case 4: // memory_order_acq_rel
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::AcquireRelease);
>> + break;
>> + case 5: // memory_order_seq_cst
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::SequentiallyConsistent);
>> + break;
>> + default: // invalid order
>
> This should probably have triggered an error earlier in the frontend.
> If that's impossible, comment why?
>
>> + llvm::UndefValue::get(InstrReturnTy);
>> + }
>> + if (E->getOp() == AtomicExpr::Store)
>> + RValue::get(0);
>> + return ConvertTempToRValue(*this, E->getType(), OrigDest);
>> + }
>> +
>> + // Long case, when Order isn't obviously constant.
>> +
>> + // Create all the relevant BB's
>> + llvm::BasicBlock *MonotonicBB, *AcquireBB, *ReleaseBB, *AcqRelBB, *SeqCstBB;
>> + MonotonicBB = createBasicBlock("monotonic", CurFn);
>> + if (E->getOp() != AtomicExpr::Store)
>> + AcquireBB = createBasicBlock("acquire", CurFn);
>> + if (E->getOp() != AtomicExpr::Load)
>> + ReleaseBB = createBasicBlock("release", CurFn);
>> + if (E->getOp() != AtomicExpr::Load && E->getOp() != AtomicExpr::Store)
>> + AcqRelBB = createBasicBlock("acqrel", CurFn);
>> + SeqCstBB = createBasicBlock("seqcst", CurFn);
>> + llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);
>> +
>> + // Create the switch and PHI for the split
>
> I don't see a PHI.
>
>> + Order = Builder.CreateIntCast(Order, Builder.getInt32Ty(), false);
>> + llvm::SwitchInst *SI = Builder.CreateSwitch(Order, MonotonicBB);
>
> Is it better to make 0 be the default, or is this just the simplest
> thing that works, and it's never likely to make a difference because
> nearly all uses will inline to a constant anyway? (Comment so the
> next person to come along doesn't have to ask that question.)
>
>> +
>> + // Emit all the different atomics
>> + Builder.SetInsertPoint(MonotonicBB);
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Monotonic);
>> + Builder.CreateBr(ContBB);
>> + if (E->getOp() != AtomicExpr::Store) {
>> + Builder.SetInsertPoint(AcquireBB);
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Acquire);
>> + Builder.CreateBr(ContBB);
>> + SI->addCase(Builder.getInt32(1), AcquireBB);
>
> I'd prefer symbolic values for the user-provided ordering numbers.
>
>> + SI->addCase(Builder.getInt32(2), AcquireBB);
>> + }
>> + if (E->getOp() != AtomicExpr::Load) {
>> + Builder.SetInsertPoint(ReleaseBB);
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::Release);
>> + Builder.CreateBr(ContBB);
>> + SI->addCase(Builder.getInt32(3), ReleaseBB);
>> + }
>> + if (E->getOp() != AtomicExpr::Load && E->getOp() != AtomicExpr::Store) {
>> + Builder.SetInsertPoint(AcqRelBB);
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::AcquireRelease);
>> + Builder.CreateBr(ContBB);
>> + SI->addCase(Builder.getInt32(4), AcqRelBB);
>> + }
>> + Builder.SetInsertPoint(SeqCstBB);
>> + EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> + llvm::SequentiallyConsistent);
>> + Builder.CreateBr(ContBB);
>> + SI->addCase(Builder.getInt32(5), SeqCstBB);
>> +
>> + // Cleanup and return
>> + Builder.SetInsertPoint(ContBB);
>> + if (E->getOp() == AtomicExpr::Store)
>> + RValue::get(0);
>> + return ConvertTempToRValue(*this, E->getType(), OrigDest);
>> +}
>
More information about the cfe-commits
mailing list