[llvm-commits] metadata.h/cpp review stuff

Devang Patel devang.patel at gmail.com
Thu Oct 22 12:53:07 PDT 2009


Chris,

Thanks for detailed review. I have covered most of the points you
brought up, except few.

On Mon, Oct 19, 2009 at 12:06 AM, Chris Lattner <clattner at apple.com> wrote:
> Hi Devang,
>
> I'm just now getting caught up on your work.  Here are some thoughts/
> questions on Metadata.h/cpp:
>
>
> The metadata hierarchy is:
>
> User
>   -> MetadataBase
>      -> MDString
>      -> MDNode
>      -> NamedMDNode
>
> Why not:
>
> Value
>   -> MetadataBase
>      -> MDString
>      -> MDNode
>      -> NamedMDNode
>
> This would shrink them significantly and reduce the games you have to
> play with hiding getNumOperands, overloading operator new, etc.  Also,
> MetadataBase doesn't have much common state, it isn't clear it needs
> to exist at all.

Done.

> There is some seriously confused code circling around
> MetadataBase::resizeOperands.  It looks like half the code thinks that
> operands can happen, but half doesn't.  Please remove all this stuff.

This code disappeared once I fixed class hierarchy.

>
> In Metadata.h:
>
>
> #include "llvm/User.h"
> #include "llvm/Type.h"
> #include "llvm/OperandTraits.h"
> #include "llvm/ADT/FoldingSet.h"
> #include "llvm/ADT/SmallVector.h"
> #include "llvm/ADT/SmallPtrSet.h"
> #include "llvm/ADT/DenseMap.h"
> #include "llvm/ADT/StringMap.h"
> #include "llvm/ADT/ilist_node.h"
> #include "llvm/Support/ErrorHandling.h"
> #include "llvm/Support/ValueHandle.h"
>
> This is including way too much.  Please reduce this as aggressively as
> possible.

Done.

> class MDString : public MetadataBase {
> ...
>   explicit MDString(LLVMContext &C, const char *begin, unsigned l)
>
> This ctor should just take a StringRef.

Done.

>
> ...
>   static MDString *get(LLVMContext &Context, const StringRef &Str);
>
> Please take StringRef's by value, not by const&.

Done.

> class MDNode : public MetadataBase, public FoldingSetNode {
> ...
>   friend class ElementVH;
>   // Use CallbackVH to hold MDNOde elements.
>   struct ElementVH : public CallbackVH {
>     MDNode *Parent;
>     ElementVH(Value *V, MDNode *P) : CallbackVH(V), Parent(P) {}
>     ~ElementVH() {}
>
>     virtual void deleted() {
>       Parent->replaceElement(this->operator Value*(), 0);
>     }
>
>     virtual void allUsesReplacedWith(Value *NV) {
>       Parent->replaceElement(this->operator Value*(), NV);
>     }
>   };
>   // Replace each instance of F from the element list of this node
> with T.
>   void replaceElement(Value *F, Value *T);
>
>   SmallVector<ElementVH, 4> Node;
>
>
> Please move all of this stuff out of line to Metadata.cpp.  If needbe,
> Node should be declared as a 'void*' in the header.  Also, there is no
> reason to use a SmallVector here: the size is fixed on node
> construction.  Just use a pointer to array and new[]/delete[] the
> array.  Keeping a SmallVector in NamedMDNode (but not MDNode) is
> completely fine.

Now, node is a pointer instead of SmallVector.

>
> Why does MDNode::dropAllReferences need to exist?  It seems *highly*
> unsafe because it clears 'Node' but doesn't update the FoldingSet.

Now dropAllReferences() does not exist. Earlier it was invoked after
FoldingSet was updated.

>
>   Value *getElement(unsigned i) const {
>     assert(getNumElements() > i && "Invalid element number!");
>
> This is fine, but please use "i < getNumElements()" because that is
> more idiomatic.  This happens multiple times in the file.

Done.

>
> /// WeakMetadataVH - a weak value handle for metadata.
> class WeakMetadataVH : public WeakVH {
> public:
>
> I think this should be removed and the use replaced with TrackingVH??

Done.
>
> NamedMDNode::setParent shouldn't be public.  Does
> NamedMDNode::dropAllReferences need to exist?
>
> I don't understand NamedMDNode::Create(const NamedMDNode *NMD, Module
> *M).  Is this a weird copy ctor?  Why does it need to exist?

It is used while linking modules.

>
>   void addElement(MetadataBase *M) {
>     resizeOperands(0);
>     OperandList[NumOperands++] = M;
>     Node.push_back(WeakMetadataVH(M));
>   }
>
> How can this work?  Isn't this an invalid access into OperandList?

resizeOperands() did some magic to make this work:) Now it is gone.

>
> /// MetadataContext -
> /// MetadataContext handles uniquing and assignment of IDs for custom
> metadata
> /// types. Custom metadata handler names do not contain spaces. And
> the name
> /// must start with an alphabet. The regular expression used to check
> name
>
> "start with an alphabet" is missing something.
>
> class MetadataContext {
> public:
>   typedef std::pair<unsigned, WeakVH> MDPairTy;
>   typedef SmallVector<MDPairTy, 2> MDMapTy;
>   typedef DenseMap<const Instruction *, MDMapTy> MDStoreTy;
>   friend class BitcodeReader;
> private:
>
>   /// MetadataStore - Collection of metadata used in this context.
>   MDStoreTy MetadataStore;
>
>   /// MDHandlerNames - Map to hold metadata handler names.
>   StringMap<unsigned> MDHandlerNames;
>
>
> These datastructures are all implementation details.  They should be
> hidden behind a void*, allowing you to avoid #include of DenseMap.h,
> StringMap.h etc.  MDPairTy should be replaced with TrackingVH<MDNode>.

These data structures are now hiding inside Metadata.cpp

>
>   /// RegisterMDKind - Register a new metadata kind and return its ID.
>   /// A metadata kind can be registered only once.
>   unsigned RegisterMDKind(const char *Name);
>
>   /// getMDKind - Return metadata kind. If the requested metadata kind
>   /// is not registered then return 0.
>   unsigned getMDKind(const char *Name);
>
>   /// validName - Return true if Name is a valid custom metadata
> handler name.
>   bool validName(const char *Name);
>
> These methods should all take StringRef for consistency.  The second
> should be const, the third should be static.  validName ->
> isValidName.  RegisterMDKind -> registerMDKind.

Done.

>   /// getMD - Get the metadata of given kind attached with an
> Instruction.
>   /// If the metadata is not found then return 0.
>   MDNode *getMD(unsigned Kind, const Instruction *Inst);
>
> "with an instruction" -> "to an instruction".   This occurs several
> times in both files.

Fixed.

>   /// removeMDs - Remove all metadata attached with an instruction.
>   void removeMDs(const Instruction *Inst);
>
> removeAllMetadata()?

ok, renamed.

>
>   /// getHandlerNames - Get handler names. This is used by bitcode
>   /// writer.
>   const StringMap<unsigned> *getHandlerNames();
>
> This is gross for several reasons: at the least, it should be const.
> It also exposes implementation details.  Finally, the bitcode writer
> is currently writing these out in iterator order which is not at all
> deterministic.  It would be better to have something like:
>
>   void getHandlerNames(SmallVectorImpl<std::pair<unsigned, StringRef>
>  > &) const;
>
> and have the implementation of getHandlerNames do the sort (with
> array_pod_sort).

Fixed.

>
>   /// ValueIsDeleted - This handler is used to update metadata store
>   /// when a value is deleted.
>   void ValueIsDeleted(const Value *) {}
>   void ValueIsDeleted(const Instruction *Inst) {
>     removeMDs(Inst);
>   }
>   void ValueIsRAUWd(Value *V1, Value *V2);
>
>   /// ValueIsCloned - This handler is used to update metadata store
>   /// when In1 is cloned to create In2.
>   void ValueIsCloned(const Instruction *In1, Instruction *In2);
>
> These should not be public and/or should not be in the MetadataContext
> class.

I am not sure why ?

>
> Metadata.cpp:
>
> void MDNode::replaceElement(Value *From, Value *To) {
> ...
>
> You don't know that MDNode::replaceElement will be called before or
> after the VH operands uses are updated.

I do. This is a private method used by VH handlers only.

>  I don't think this method in
> general is safe because a bunch of stuff depends on pointer equality
> with the "from" value.
>
>
>   // MDNode only lists metadata elements in operand list, because
> MDNode
>   // used by MDNode is considered a valid use. However on the side,
> MDNode
>   // using a non-metadata value is not considered a "use" of non-
> metadata
>   // value.
>
> Why do MD uses try to show up in the operand list?  This seems really
> complex, what benefit does it provide?

Now MD uses do not show up in the operand list :)

>
>   if (MetadataBase *MDTo = dyn_cast_or_null<MetadataBase>(To)) {
>     for (SmallVector<unsigned, 4>::iterator OI = OpIndexes.begin(),
>            OE = OpIndexes.end(); OI != OE; ++OI)
>       setOperand(*OI, MDTo);
>   } else {
>     for (SmallVector<unsigned, 4>::iterator OI = OpIndexes.begin(),
>            OE = OpIndexes.end(); OI != OE; ++OI)
>       setOperand(*OI, 0);
>   }
>
> This doesn't make sense to me.  If one MD operand is replaced, it sets
> them all to the new thing?

This code is gone now.

>
> NamedMDNode::NamedMDNode(LLVMContext &C, const Twine &N,
> ...
>   resizeOperands(NumMDs);
>
>   for (unsigned i = 0; i != NumMDs; ++i) {
>     if (MDs[i])
>       OperandList[NumOperands++] = MDs[i];
>     Node.push_back(WeakMetadataVH(MDs[i]));
>   }
>
> This is wrong unless resizeOperands does something really strange
> (likely).

It did :)

>  Why does NamedMDNode put anything in the 'User' operands
> list at all?  I think this should just be removed.

This code is gone, because now metadata is not derived from User.

>
> unsigned MetadataContext::RegisterMDKind(const char *Name) {
> ..
>   assert(MDHandlerNames.find(Name) == MDHandlerNames.end()
>          && "Already registered MDKind!");
>
> Just use: assert(MDHandlerNames.count(Name) == 0 && ...

Done.

>
>
>   MDHandlerNames[Name] = Count + 1;
>   return Count + 1;
>
> How about: return MDHandlerNames[Name] = Count + 1;

Done.

>
> unsigned MetadataContext::getMDKind(const char *Name) {
>   assert(validName(Name) && "Invalid custome metadata name!");
>   StringMap<unsigned>::iterator I = MDHandlerNames.find(Name);
>
> Random microoptimization for assert builds: the check for a valid name
> can be sunk into the case where the query fails.  If the query
> succeeds (common case) the name was already validated.
>

Done.

> void MetadataContext::addMD(unsigned MDKind, MDNode *Node, Instruction
> *Inst) {
> ...
>   MDStoreTy::iterator I = MetadataStore.find(Inst);
>   if (I == MetadataStore.end()) {
>     MDMapTy Info;
>     Info.push_back(std::make_pair(MDKind, Node));
>     MetadataStore.insert(std::make_pair(Inst, Info));
>     return;
>   }
>
> It is actually better to just rely on default construction in the
> map.  Just use:
>   MDMapTy &Info = MetadataStore[Inst];
> and delete the code above.

Done.

>
> /// removeMD - Remove metadata of given kind attached with an
> instuction.
> void MetadataContext::removeMD(unsigned Kind, Instruction *Inst) {
>   MDStoreTy::iterator I = MetadataStore.find(Inst);
>   if (I == MetadataStore.end())
>     return;
>   MDMapTy &Info = I->second;
>
> This seems wrong.  The method should only be called if the
> 'HasMetadata' bit is set on the instruction.  If it is set and there
> is no entry in MetaDataStore, then it is a logic bug (so it should
> assert).  Just do:

This is intentional, can be used by a pass to strip all debug info
where each instruction may not have debug info attached to itself.

> MDMapTy &Info = MetadataStore[Inst];
> assert(!Info.empty() && ..
>
> or whatever.
>
>
>
> /// removeMDs - Remove all metadata attached with an instruction.
> void MetadataContext::removeMDs(const Instruction *Inst) {
>   // Find Metadata handles for this instruction.
>   MDStoreTy::iterator I = MetadataStore.find(Inst);
>   assert(I != MetadataStore.end() && "Invalid custom metadata info!");
>   MDMapTy &Info = I->second;
>
>   // FIXME : Give all metadata handlers a chance to adjust.
>
>   // Remove the entries for this instruction.
>   Info.clear();
>   MetadataStore.erase(I);
> }
>
>
> This doesn't look fully implemented.   In addition to the FIXME (What
> does it mean?) this should clear the  HasMetadata bit on the
> instruction.  Also, don't use find here.  This entire method can
> currently be replaced with this, which is identical in behavior:
>
> void MetadataContext::removeMDs(const Instruction *Inst) {
>   MetadataStore.erase(Inst);
> }

Done.

>
> void MetadataContext::copyMD(Instruction *In1, Instruction *In2) {
>   assert(In1 && In2 && "Invalid instruction!");
>    MDStoreTy::iterator I = MetadataStore.find(In1);
>   if (I == MetadataStore.end())
>     return;
>
>   MDMapTy &In1Info = I->second;
>   MDMapTy In2Info;
>   for (MDMapTy::iterator I = In1Info.begin(), E = In1Info.end(); I !=
> E; ++I)
>     if (MDNode *MD = dyn_cast_or_null<MDNode>(I->second))
>       addMD(I->first, MD, In2);
>
>
> Similar comments to above.  Also, In2Info is dead here.
> dyn_cast_or_null should go away when you start using trackingvh
> instead of weakvh (not just here, check the rest of the file).

Done.

>
>
> MDNode *MetadataContext::getMD(unsigned MDKind, const Instruction
> *Inst) {
>   MDStoreTy::iterator I = MetadataStore.find(Inst);
>   if (I == MetadataStore.end())
>     return NULL;
>
> Should be an assert as above.

It is used by clients to see if metadata of given kind is attached to
this inst or not.

>
> const MetadataContext::MDMapTy *
> MetadataContext::getMDs(const Instruction *Inst) {
>
> I really don't like this method.  Exposing implementation details like
> this is bad.  IF this is for the bc writer, please follow something
> similar to the getHandlerNames approach above.

Fixed.

>  void MetadataContext::ValueIsCloned(const Instruction *In1,
> Instruction *In2) {
>   // Find Metadata handles for In1.
>   MDStoreTy::iterator I = MetadataStore.find(In1);
>   assert(I != MetadataStore.end() && "Invalid custom metadata info!");
>
>   // FIXME : Give all metadata handlers a chance to adjust.
>
>   MDMapTy &In1Info = I->second;
>   MDMapTy In2Info;
>   for (MDMapTy::iterator I = In1Info.begin(), E = In1Info.end(); I !=
> E; ++I)
>     if (MDNode *MD = dyn_cast_or_null<MDNode>(I->second))
>       addMD(I->first, MD, In2);
> }
>
> How is this different than copyMD?

ValueIsCloned() is used by VH handlers and it has assertion checks.
copyMD() is available to clients to copy metadata, if available, from
one instruction to another instruction.

> /// ValueIsRAUWd - This handler is used when V1's all uses are
> replaced by
> /// V2.
> void MetadataContext::ValueIsRAUWd(Value *V1, Value *V2) {
>   Instruction *I1 = dyn_cast<Instruction>(V1);
>   Instruction *I2 = dyn_cast<Instruction>(V2);
>   if (!I1 || !I2)
>
> This seems very very wrong.  If I do:
>
> X->RAUW(Y);
>
> Then the properties on X are not necessarily true in Y's scope.  The
> debug info (for example) should just be discarded.

I am not sure. Even if an instruction is replaced by another
instruction the new instruction still represents original source code
location. Otherwise, IMO, this will put huge burden on optimizer to
keep dbg info available through all RAUW.

> Value range info
> is another example: X may be defined lower in the dom tree where some
> property holds that doesn't hold on Y.

That is true for this use. That's why the comment
"FIXME: Give custom handlers a chance to override this".

Please let me know if I missed anything or misinterpreted your comments.
Thanks!
-
Devang




More information about the llvm-commits mailing list