[cfe-commits] [PATCH] llvm.annotation support
Julien Lerouge
jlerouge at apple.com
Tue Jun 21 22:44:50 PDT 2011
Sorry for the late reply...
On Fri, Jun 17, 2011 at 04:57:22PM -0700, John McCall wrote:
> On Jun 14, 2011, at 12:14 AM, Julien Lerouge wrote:
> > Attached is a patch that brings back annotation support to the state it
> > was (or close) with llvm-gcc. The patch adds the __builtin_annotation
> > and support for annotating struct fields and local variables (support
> > for globals was already in there).
> >
> > I moved all the code related to those annotations in CGAnnotations.h/cpp
> > The other changes are minor tweaks for the annotate attribute and a
> > bunch of tests.
>
> As a general note, this looks pretty good; I just have a few organizational
> and data-structure comments.
>
> diff --git a/lib/CodeGen/CGAnnotations.cpp b/lib/CodeGen/CGAnnotations.cpp
>
> +static const char *AnnotationSection = "llvm.metadata";
>
> This should be a #define; every single place you use this in the file ends up having to call strlen completely unnecessarily.
>
> (it could be a const llvm::StringRef, except that would potentially leave a global ctor around, unfortunately)
Ok, as suggested, I'll go for the static const char AnnotationSection[].
> +llvm::Constant *CGAnnotations::EmitAnnotationString(llvm::StringRef Str) const {
> + llvm::Module &M = CGM.getModule();
> + llvm::LLVMContext &VMContext = CGM.getLLVMContext();
> +
> + llvm::Constant *s = llvm::ConstantArray::get(VMContext, Str, true);
> + for (llvm::Value::use_iterator i = s->use_begin(), e = s->use_end(); i != e;
> + ++i) {
> + llvm::GlobalVariable *gv = dyn_cast<llvm::GlobalVariable>(*i);
> + if (!gv || gv->getSection() != AnnotationSection)
> + continue;
> + return gv;
> + }
> +
> + // Not found yet, create a new global.
> + llvm::GlobalValue *gv = new llvm::GlobalVariable(M, s->getType(), true,
> + llvm::GlobalValue::PrivateLinkage, s, ".str");
> + gv->setSection(AnnotationSection);
> + gv->setUnnamedAddr(true);
> + return gv;
> +}
>
> Building a ConstantArray from a string is very expensive. Unless you
> expect annotations to almost never collide ? and since you generate
> these strings for each filename, that seems pretty unlikely ? or you
> need to make sure you unique them properly with user variables
> placed specifically in the annotations section, please use an
> llvm::StringMap<llvm::Constant*>.
Ok, I tried to avoid having some extra data structure to track
information that was already in the IR, but it certainly is much faster
to maintain another map.
>
> +llvm::Value *CGAnnotations::EmitAnnotationCall(llvm::Value *AnnotationFn,
> + llvm::Value *AnnotatedVal,
> + llvm::StringRef AnnotationStr,
> + SourceLocation Location,
> + CGBuilderTy &Builder) const {
> + llvm::Value *Args[4];
> + Args[0] = AnnotatedVal;
> + Args[1] = Builder.CreateBitCast(EmitAnnotationString(AnnotationStr),
> + CGM.Int8PtrTy);
> + Args[2] = Builder.CreateBitCast(EmitAnnotationUnit(Location), CGM.Int8PtrTy);
> + Args[3] = EmitAnnotationLineNo(Location);
> + return Builder.CreateCall(AnnotationFn, Args, Args + 4);
> +}
>
> This should probably be a method of CodeGenFunction.
Ok.
> +llvm::Value *CGAnnotations::EmitBuiltinAnnotation(CodeGenFunction &CGF,
> + const CallExpr *E) const {
>
> -> CodeGenFunction
>
> + CGBuilderTy &Builder = CGF.Builder;
> + llvm::Value *AnnVal = CGF.EmitScalarExpr(E->getArg(0));
> +
> + // Get the intrinsic.
> + const llvm::Type *ArgTy = AnnVal->getType();
> + llvm::Value *F = CGM.getIntrinsic(llvm::Intrinsic::annotation, &ArgTy, 1);
> +
> + // Get the anntotation string.
> + const Expr *AnnotationStrExpr = E->getArg(1);
> + while (const CastExpr *CE = dyn_cast<CastExpr>(AnnotationStrExpr))
> + AnnotationStrExpr = CE->getSubExpr();
>
> A better way to do this is with E->getArg(1)->IgnoreParenCasts(), unless for
> some reason you really don't want to ignore parentheses. However, the
> validation code for this in Sema rejects anything that isn't directly a
> StringLiteral, so it looks like either that should be made more lenient or
> this code can be made more strict.
String literal is just what I need, so it should be ok. Thanks.
> + llvm::StringRef Str = cast<StringLiteral>(AnnotationStrExpr)->getString();
> +
> + return EmitAnnotationCall(F, AnnVal, Str, E->getExprLoc(), Builder);
> +}
>
> +void CGAnnotations::EmitVarAnnotations(const VarDecl *D, llvm::Value *V,
> + CGBuilderTy &Builder) const {
>
> -> CodeGenFunction
Ok.
> + assert(D->hasAttr<AnnotateAttr>() && "no annotate attribute");
> + for (ann_iterator ai = ann_begin(D), ae = ann_end(D); ai != ae; ++ai)
> + EmitAnnotationCall(CGM.getIntrinsic(llvm::Intrinsic::var_annotation),
> + Builder.CreateBitCast(V, CGM.Int8PtrTy, V->getName()),
>
> Hoist this bitcast out of the loop, please.
I'd rather keep this here for now, to keep the IR consistent with
what llvm-gcc was doing. The reason is that those annotations are meant
to be consumed by a mid level optimization pass, so I don't want to
break any existing code that might expect those casts (I know we have
such code around here unfortunately).
I can add a FIXME, and I'll get back to it once my project has finished
the transition to clang. Would that be acceptable ?
>
> + (*ai)->getAnnotation(), D->getLocation(), Builder);
> +}
>
> +llvm::Value *CGAnnotations::EmitFieldAnnotations(const FieldDecl *D,
> + llvm::Value *V,
> + CGBuilderTy &Builder) const {
>
> -> CodeGenFunction
Ok.
> + assert(D->hasAttr<AnnotateAttr>() && "no annotate attribute");
> + const llvm::Type *ArgTy = CGM.Int8PtrTy, *VTy = V->getType();
> + llvm::Value *F = CGM.getIntrinsic(llvm::Intrinsic::ptr_annotation, &ArgTy, 1);
> +
> + for (ann_iterator ai = ann_begin(D), ae = ann_end(D); ai != ae; ++ai) {
> + // Always emit the cast inst so we can differentiate between annotation on
> + // the first field of a struct and annotation on the strut itself.
>
> I guess this is how this has "traditionally" been represented, but there has
> got to be a better way of encoding that in the IR.
Yes. I will most likely spend more time to improve this later on. My
goal for now was to get clang to generate the same kind of IR llvm-gcc
was doing.
> + if (V->getType() != ArgTy)
> + V = Builder.Insert(new llvm::BitCastInst(V, ArgTy));
>
> Please continue using CGF.Int8PtrTy here, it's much clearer.
>
> diff --git a/lib/CodeGen/CGAnnotations.h b/lib/CodeGen/CGAnnotations.h
>
> +class CGAnnotations {
>
> I think I've basically asked you to take almost all the methods here to
> CodeGenFunction, so you might as well put the rest (and the data
> members) on CodeGenModule, which probably kills this entire header.
> I don't really care about avoiding the initialization of a couple of data
> structures.
Ok will do.
>
> Otherwise, this all looks great, thanks!
>
> John.
Thanks a lot. I'll submit a new patch in the coming days.
--
Julien Lerouge
PGP Key Id: 0xB1964A62
PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62
PGP Public Key from: keyserver.pgp.com
More information about the cfe-commits
mailing list