[llvm-commits] embedded metadata preview

Nick Lewycky nicholas at mxc.ca
Mon Mar 16 22:29:59 PDT 2009


Dan Gohman wrote:
> Hi Nick,
> 
> Here are a few review comments.
> 
>  > Index: lib/Bitcode/Writer/BitcodeWriter.cpp
>  > ===================================================================
>  > --- lib/Bitcode/Writer/BitcodeWriter.cpp        (revision 66999)
>  > +++ lib/Bitcode/Writer/BitcodeWriter.cpp        (working copy)
>  > @@ -677,6 +677,17 @@
>  >          Record.push_back(CE->getPredicate());
>  >          break;
>  >        }
>  > +    } else if (const MDString *S = dyn_cast<MDString>(C)) {
>  > +      Code = bitc::CST_CODE_MDSTRING;
>  > +      Record.push_back(S->length());
>  > +      for (unsigned i = 0, e = S->length(); i != e; ++i)
>  > +        Record.push_back(S->begin()[i]);
> 
> It seems pretty inefficient to put each char in a uint64_t.
> Forgive me for prematurely optimizing :-).

No that's fine, I just copied this out of the handling for strings in 
inline-asm under the assumption that it must be fine.

I tried adding "AbbrevToUse = String8Abbrev;" here but that causes a 
crash ("Invalid abbrev for record"). I'm not really familiar with this 
code, do you know how it's supposed to work? (If not I'm sure I can 
figure it out...)

> 
>  > Index: lib/Bitcode/Reader/BitcodeReader.cpp
>  > ===================================================================
>  > --- lib/Bitcode/Reader/BitcodeReader.cpp        (revision 66999)
>  > +++ lib/Bitcode/Reader/BitcodeReader.cpp        (working copy)
>  > @@ -283,10 +283,12 @@
>  >                                     UserCS->getType()->isPacked());
>  >        } else if (isa<ConstantVector>(UserC)) {
>  >          NewC = ConstantVector::get(&NewOps[0], NewOps.size());
>  > -      } else {
>  > -        // Must be a constant expression.
>  > +      } else if (isa<ConstantExpr>(UserC)) {
>  >          NewC = cast<ConstantExpr>(UserC)- 
>  >getWithOperands(&NewOps[0],
>  >                                                             
> NewOps.size());
>  > +      } else {
>  > +        // Must be a metadata node.
>  > +        NewC = MDNode::get(&NewOps[0], NewOps.size());
>  >        }
>  >
> 
> Can you add an assert(isa<MDNode>) here to verify the assertion in
> the comment?

Done.

> 
>  > Index: docs/LangRef.html
>  > ===================================================================
>  > --- docs/LangRef.html   (revision 66999)
>  > +++ docs/LangRef.html   (working copy)
> 
>  > @@ -1847,6 +1848,14 @@
>  >    large arrays) and is always exactly equivalent to using explicit  
> zero
>  >    initializers.
>  >    </dd>
>  > +
>  > +  <dt><b>Metadata node</b></dt>
>  > +
>  > +  <dd>A metadata node is a structure-like constant with the type  
> of an empty
>  > +  struct.  For example: "<tt>{ } !{ i32 0, { } !"test" }</tt>".  
> Unlike other
>  > +  constants that are meant to be interpreted as part of the  
> instruction stream,
>  > +  metadata is a place to attach additional information such as  
> debug info.
> 
> It would be helpful to mention that this is also expected to be used for
> encoding information that will be used by optimizations.

I'm actually thinking of removing any mention of them here and only 
talking about them under the "Embedded Metadata" section. Thoughts?

I added this as the last paragraph in that section:
   "Optimizations may rely on metadata to provide additional information 
about the program that isn't available in the instructions, or that 
isn't easily computable. Similarly, the code generator may expect a 
certain metadata format to be used to express debugging information."

Close enough? :)

> MDString and MDNode have protected constructors.  Do you anticipate
> these classes having subclasses? If so, what would they be like?

Nope. Again, this is just copied from other classes I see around.

> Why do MDString and MDNode inherit from Constant instead of, say,  
> User? I
> see that ConstantLastVal is moved to include MDStringVal and MDNodeVal,
> presumably for the same reason.

I tried that initially, where MDNode was a User and MDString was a 
Value. In practise, they seem to have semantics much closer to that of 
Constants; they're uniqued / pointer comparable and they're global. The 
point where it really made a difference was in the .ll parser, switching 
them over to being Constant made the parser changes much simpler.

Nick

> Dan
> 
> On Mar 14, 2009, at 1:45 AM, Nick Lewycky wrote:
> 
>> I took a stab at implementing embedded metadata. The idea is to  
>> allow LLVM's user to attach meaningful data to the instruction  
>> stream that isn't really relevant to the execution of the  
>> instructions. This comes from Chris' notes:
>>
>>  http://nondot.org/sabre/LLVMNotes/EmbeddedMetadata.txt
>>
>> which I deviated a little. The major thing missing from this patch  
>> is any verifier support. My current plan is to add support to the  
>> verifier for ensuring that MDNode and MDString are only used from a)  
>> initializers to globals named starting with "llvm." b) calls to  
>> intrinsic functions, then commit it.
>>
>> The attached patch implements MDString and MDNode as new types of  
>> Constant, .ll/.bc reader/writer support and a LangRef update. I'd  
>> appreciate any review comments!
>>
>> Nick
>> <emb-md.patch>_______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 




More information about the llvm-commits mailing list