[cfe-commits] [PATCH] llvm.annotation support

Julien Lerouge jlerouge at apple.com
Tue Jun 21 22:44:50 PDT 2011


Sorry for the late reply...

On Fri, Jun 17, 2011 at 04:57:22PM -0700, John McCall wrote:
> On Jun 14, 2011, at 12:14 AM, Julien Lerouge wrote:
> > Attached is a patch that brings back annotation support to the state it
> > was (or close) with llvm-gcc. The patch adds the __builtin_annotation
> > and support for annotating struct fields and local variables (support
> > for globals was already in there).
> > 
> > I moved all the code related to those annotations in CGAnnotations.h/cpp
> > The other changes are minor tweaks for the annotate attribute and a
> > bunch of tests.
> 
> As a general note, this looks pretty good;  I just have a few organizational
> and data-structure comments.
> 
> diff --git a/lib/CodeGen/CGAnnotations.cpp b/lib/CodeGen/CGAnnotations.cpp
> 
> +static const char *AnnotationSection = "llvm.metadata";
> 
> This should be a #define;  every single place you use this in the file ends up having to call strlen completely unnecessarily.
> 
> (it could be a const llvm::StringRef, except that would potentially leave a global ctor around, unfortunately)

Ok, as suggested, I'll go for the static const char AnnotationSection[].


> +llvm::Constant *CGAnnotations::EmitAnnotationString(llvm::StringRef Str) const {
> +  llvm::Module &M = CGM.getModule();
> +  llvm::LLVMContext &VMContext = CGM.getLLVMContext();
> +
> +  llvm::Constant *s = llvm::ConstantArray::get(VMContext, Str, true);
> +  for (llvm::Value::use_iterator i = s->use_begin(), e = s->use_end(); i != e;
> +    ++i) {
> +    llvm::GlobalVariable *gv = dyn_cast<llvm::GlobalVariable>(*i);
> +    if (!gv || gv->getSection() != AnnotationSection)
> +      continue;
> +    return gv;
> +  }
> +
> +  // Not found yet, create a new global.
> +  llvm::GlobalValue *gv = new llvm::GlobalVariable(M, s->getType(), true,
> +    llvm::GlobalValue::PrivateLinkage, s, ".str");
> +  gv->setSection(AnnotationSection);
> +  gv->setUnnamedAddr(true);
> +  return gv;
> +}
> 
> Building a ConstantArray from a string is very expensive.  Unless you
> expect annotations to almost never collide ? and since you generate
> these strings for each filename, that seems pretty unlikely ? or you
> need to make sure you unique them properly with user variables
> placed specifically in the annotations section, please use an
> llvm::StringMap<llvm::Constant*>.

Ok, I tried to avoid having some extra data structure to track
information that was already in the IR, but it certainly is much faster
to maintain another map.

> 
> +llvm::Value *CGAnnotations::EmitAnnotationCall(llvm::Value *AnnotationFn,
> +                                               llvm::Value *AnnotatedVal,
> +                                               llvm::StringRef AnnotationStr,
> +                                               SourceLocation Location,
> +                                               CGBuilderTy &Builder) const {
> +  llvm::Value *Args[4];
> +  Args[0] = AnnotatedVal;
> +  Args[1] = Builder.CreateBitCast(EmitAnnotationString(AnnotationStr),
> +                                  CGM.Int8PtrTy);
> +  Args[2] = Builder.CreateBitCast(EmitAnnotationUnit(Location), CGM.Int8PtrTy);
> +  Args[3] = EmitAnnotationLineNo(Location);
> +  return Builder.CreateCall(AnnotationFn, Args, Args + 4);
> +}
> 
> This should probably be a method of CodeGenFunction.

Ok.

> +llvm::Value *CGAnnotations::EmitBuiltinAnnotation(CodeGenFunction &CGF,
> +                                                  const CallExpr *E) const {
> 
> -> CodeGenFunction
> 
> +  CGBuilderTy &Builder = CGF.Builder;
> +  llvm::Value *AnnVal = CGF.EmitScalarExpr(E->getArg(0));
> +
> +  // Get the intrinsic.
> +  const llvm::Type *ArgTy = AnnVal->getType();
> +  llvm::Value *F = CGM.getIntrinsic(llvm::Intrinsic::annotation, &ArgTy, 1);
> +
> +  // Get the anntotation string.
> +  const Expr *AnnotationStrExpr = E->getArg(1);
> +  while (const CastExpr *CE = dyn_cast<CastExpr>(AnnotationStrExpr))
> +    AnnotationStrExpr = CE->getSubExpr();
> 
> A better way to do this is with E->getArg(1)->IgnoreParenCasts(), unless for
> some reason you really don't want to ignore parentheses.  However, the
> validation code for this in Sema rejects anything that isn't directly a
> StringLiteral, so it looks like either that should be made more lenient or
> this code can be made more strict.

String literal is just what I need, so it should be ok. Thanks.

> +  llvm::StringRef Str = cast<StringLiteral>(AnnotationStrExpr)->getString();
> +
> +  return EmitAnnotationCall(F, AnnVal, Str, E->getExprLoc(), Builder);
> +}
> 
> +void CGAnnotations::EmitVarAnnotations(const VarDecl *D, llvm::Value *V,
> +                                       CGBuilderTy &Builder) const {
> 
> -> CodeGenFunction

Ok.

> +  assert(D->hasAttr<AnnotateAttr>() && "no annotate attribute");
> +  for (ann_iterator ai = ann_begin(D), ae = ann_end(D); ai != ae; ++ai)
> +    EmitAnnotationCall(CGM.getIntrinsic(llvm::Intrinsic::var_annotation),
> +                       Builder.CreateBitCast(V, CGM.Int8PtrTy, V->getName()),
> 
> Hoist this bitcast out of the loop, please.

I'd rather keep this here for now, to keep the IR consistent with
what llvm-gcc was doing. The reason is that those annotations are meant
to be consumed by a mid level optimization pass, so I don't want to
break any existing code that might expect those casts (I know we have
such code around here unfortunately).

I can add a FIXME, and I'll get back to it once my project has finished
the transition to clang. Would that be acceptable ?

> 
> +                       (*ai)->getAnnotation(), D->getLocation(), Builder);
> +}
> 
> +llvm::Value *CGAnnotations::EmitFieldAnnotations(const FieldDecl *D,
> +                                                 llvm::Value *V,
> +                                                 CGBuilderTy &Builder) const {
> 
> -> CodeGenFunction

Ok.

> +  assert(D->hasAttr<AnnotateAttr>() && "no annotate attribute");
> +  const llvm::Type *ArgTy = CGM.Int8PtrTy, *VTy = V->getType();
> +  llvm::Value *F = CGM.getIntrinsic(llvm::Intrinsic::ptr_annotation, &ArgTy, 1);
> +
> +  for (ann_iterator ai = ann_begin(D), ae = ann_end(D); ai != ae; ++ai) {
> +    // Always emit the cast inst so we can differentiate between annotation on
> +    // the first field of a struct and annotation on the strut itself.
> 
> I guess this is how this has "traditionally" been represented, but there has
> got to be a better way of encoding that in the IR.

Yes. I will most likely spend more time to improve this later on. My
goal for now was to get clang to generate the same kind of IR llvm-gcc
was doing.

> +    if (V->getType() != ArgTy)
> +      V = Builder.Insert(new llvm::BitCastInst(V, ArgTy));
> 
> Please continue using CGF.Int8PtrTy here, it's much clearer.
> 
> diff --git a/lib/CodeGen/CGAnnotations.h b/lib/CodeGen/CGAnnotations.h
> 
> +class CGAnnotations {
> 
> I think I've basically asked you to take almost all the methods here to
> CodeGenFunction, so you might as well put the rest (and the data
> members) on CodeGenModule, which probably kills this entire header.
> I don't really care about avoiding the initialization of a couple of data
> structures.

Ok will do.

> 
> Otherwise, this all looks great, thanks!
> 
> John.

Thanks a lot. I'll submit a new patch in the coming days.

-- 
Julien Lerouge
PGP Key Id: 0xB1964A62
PGP Fingerprint: 392D 4BAD DB8B CE7F 4E5F FA3C 62DB 4AA7 B196 4A62
PGP Public Key from: keyserver.pgp.com



More information about the cfe-commits mailing list