[cfe-commits] [Patch] -Wduplicate-enum which fixes PR6343

Fri Aug 17 22:00:03 PDT 2012

BTW, I wrote this two days ago.  For some reason my mail client didn't send it out until now.  My apologies for the delay.

On Aug 15, 2012, at 10:11 PM, Ted Kremenek <kremenek at apple.com> wrote:

> On Aug 15, 2012, at 6:12 PM, Richard Trieu <rtrieu at google.com> wrote:
> 
>> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <kremenek at apple.com> wrote:
>> On Aug 14, 2012, at 2:32 PM, Richard Trieu <rtrieu at google.com> wrote:
>> 
>>> At a high level, I honestly find this logic to be more complicated than I would have expected.  The sorting seems unnecessary, and will report diagnostics in an unnatural order (first based on enum constant value, then on declaration order).  A straight linear pass seems more naturally to me, and DenseMap is very efficient.
>>> Is there a comparison between the different containers in LLVM and the STL containers?
>> 
>> This is a reasonable place to start:
>> 
>>   http://llvm.org/docs/ProgrammersManual.html#ds_map
>> 
>> The key with DenseMap is that it is probed hashtable.  There is one big allocation for the entire table, instead of a bunch of buckets.  When applicable, it can be very fast, and feels like the right data structure to use here.
>> 
>> Duplicate enum detection, now with DenseMap.  The DenseMap maps a int64_t to a vector pointer.  0 and 1 were special keys for the DenseMap, so two separate pointers special cased for them.   The vectors pointers are stored in another vector in declaration order.  One pass is made over the enums to find ones without initializers.  These are used to create vectors.  A second pass through the enums populates the vectors.  Finally, a pass over the vector of vectors is used to generate all the warnings and notes.
>> 
>> Run time is fairly consistent with the sorted vector implementation, which is max %3 difference against control.
>> <duplicate-enum-densemap.patch>
> 
> Thanks for working on this.  My main concern is this patch now has a lot of unnecessary malloc() traffic, which will certainly slow it down.  Comments inline:
> 
>> +
>> +static int64_t GetInt64(const llvm::APSInt& Val) {
>> +  return  Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
>> +}
>> +
>> +struct DenseMapInfoint64_t {
>> +  static int64_t getEmptyKey() { return 0; }
>> +  static int64_t getTombstoneKey() { return 1; }
>> +  static unsigned getHashValue(const int64_t Val) {
>> +    return (unsigned)(Val * 37);
>> +  }
>> +  static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
>> +    return LHS == RHS;
>> +  }
>> +};
> 
> This trait class doesn't look like it was actually used.  The DenseMap below just uses the default trait for int64_t.
> 
> I also still think we can so something a bit smarter here.  What I think we need to distinguish between is whether or not a constant has appeared more than once.  We're saving a bit of memory on the keys, but spending that savings elsewhere when we allocate the vectors unconditionally for each constant.
> 
>> +
>> +// Emits a warning when an element is implicitly set a value that
>> +// a previous element has already been set to.
>> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
>> +                                        unsigned NumElements, EnumDecl *Enum,
>> +                                        QualType EnumType) {
>> +  if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
>> +                                 Enum->getLocation()) ==
>> +      DiagnosticsEngine::Ignored)
>> +    return;
>> +  // Avoid anonymous enums
>> +  if (!Enum->getIdentifier())
>> +    return;
>> +
>> +  // Only check for small enums.
>> +  if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() > 64)
>> +    return;
>> +
>> +  typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
>> +  typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
>> +  typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
>> +  ValueToVectorMap EnumMap;
>> +  DoubleVector EnumVector;
>> +  SameValueVector *ZeroVector = 0, *OneVector = 0;
> 
> It took me a while to understand what this was doing, so I feel it could really benefit from a comment.  This also appears to result in a ton of malloc traffic below.  Here's my suggestion:
> 
>   typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
>   typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;
> 
>   typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *> DeclOrVector;
>   typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;
> 
>   DuplicatesVector DupVector;
>   ValueToVectorMap EnumMap;
> 
> The trick here is that the DenseMap maps from a constant to the first EnumConstantDecl it encounters.  Only if we encounter a second EnumConstantDecl with the same enum value do we pay the cost of allocating another vector.  This will drastically optimize in the common case, as calling malloc() is really slow.  Right now the code appears to be doing a malloc() for every enum constant, which is going to really penalize us here.
> 
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ECD) {
>> +      for (DoubleVector::iterator I = EnumVector.begin(), E = EnumVector.end();
>> +           I != E; ++I)
>> +        delete *I;
>> +      return;
>> +    }
> 
> I don't quite understand this loop through DoubleVector here, but it looks like logic in case we want to return early and cleanup.  Is there a case where the EnumConstantDecl can be null?
> 
>> +
>> +    if (ECD->getInitExpr())
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
> 
> Looks good.
> 
>> +    if (Val == 0) {
>> +      if (ZeroVector) continue;
>> +      ZeroVector = new SameValueVector();
>> +      ZeroVector->push_back(ECD);
>> +      EnumVector.push_back(ZeroVector);
>> +    } else if (Val == 1) {
>> +      if (OneVector) continue;
>> +      OneVector = new SameValueVector();
>> +      OneVector->push_back(ECD);
>> +      EnumVector.push_back(OneVector);
>> +    } else {
>> +      if (EnumMap.find(Val) != EnumMap.end())
>> +        continue;
>> +      SameValueVector *ValueVector = new SameValueVector();
>> +      ValueVector->push_back(ECD);
>> +      EnumVector.push_back(ValueVector);
>> +      EnumMap.insert(std::make_pair(Val, ValueVector));
> 
> The "find()" followed by the "insert()" is wasteful.  It results in two lookups to the hash table when we could have just used one.  More on that later.
> 
>> +    }
>> +  }
> 
> IMO, this looks like a lot of complexity just to handle the fact that 0 and 1 are special values for the DenseMap.  I don't really see this as the right tradeoff; the code is more complicated with marginal impact on memory usage or performance.
> 
> If you humor me for a bit, consider using something else for the key, e.g.:
> 
> struct DupKey {
>   int64_t val;
>   bool isTombstoneOrEmptyKey;
> };
> 
> The idea is if 'isTombStoneOrEmptyKey' is true, we can use val = 0 or val = 1 to represent empty keys or tombstone entries.  Otherwise, it's an int64_t, with the full range of values.  We can define a DenseMap trait to do the right thing.  Yes, this costs a tiny bit more in storage, but it allows the data structure to handle the complete set of values in your domain, instead of resorting to complicating the core algorithm.  What I see here now is the same code essentially duplicated twice, which makes it harder to read and more error prone.
> 
> If we use DupKey as our key for the DenseMap, we can instead do something like this:
> 
>    DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
>    // Is the first time we encountered this constant?
>    if (entry.isNull()) {
>      entry = ECD;
>      continue;
>    }
>    // Is this the second time we encountered this constant?  If so,
>    // push the previous decl encountered and the one just encountered
>    // to a vector of duplicates.
>    if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
>      ECDVector *Vec = new ECDVector();
>      Vec->push_back(D);
>      Vec->push_back(ECD);
>      
>      // Update the entry to refer to the duplicates.
>      entry = Vec;
> 
>      // Store the duplicates in a vector we can consult later for
>      // quick emission of diagnostics.
>      DupVector.push_back(Vec);
> 
>      // On to the next constant.
>      continue;
>    }
>    // Is this the third (or greater) time we encountered the constant?  If so,
>    // continue to add it to the existing vector.
>    ECDVector *Vec = entry.get<ECDVector*>();
>    Vec->push_back(ECD);
> 
> 
> With this code, we only allocate memory (beyond the DenseMap) when we encounter a duplicate that would be worth reporting.  In the common case, this savings in malloc traffic should be noticeable.
> 
> Notice also that I used:
> 
>      DeclOrVector &entry = EnumMap[Val];  // Use default construction of 'entry'.
> 
> This results in a single lookup in the hashtable.  Since we plan on adding a value for a key no matter what, by using this idiom we allow the DenseMap to default construct an entry if it doesn't exist.  This results in a single hashtable lookup, from which we can modify the value in place.  This is obviously faster than doing a hashtable lookup twice. 
> 
>> +
>> +  for (unsigned i = 0; i < NumElements; ++i) {
>> +    EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> +    if (!ValidDuplicateEnum(ECD, Enum))
>> +      continue;
>> +
>> +    int64_t Val = GetInt64(ECD->getInitVal());
>> +
>> +    if (Val == 0) {
>> +      if (!ZeroVector || *ZeroVector->begin() == ECD)
>> +        continue;
>> +      ZeroVector->push_back(ECD);
>> +    } else if (Val == 1) {
>> +      if (!OneVector || *OneVector->begin() == ECD)
>> +        continue;
>> +      OneVector->push_back(ECD);
>> +    } else {
>> +      ValueToVectorMap::iterator I = EnumMap.find(Val);
>> +      if (I == EnumMap.end())
>> +        continue;
>> +      SameValueVector *V = I->second;
>> +      if (*V->begin() == ECD)
>> +        continue;
>> +      V->push_back(ECD);
>> +    }
>> +  }
> 
> This second loop looks unnecessary.  I think we can do everything we need to count duplicates with one loop.  Of course the ValidDuplicateEnum() would need to be hoisted to the first loop.
> 
>> +
>> +  for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
>> +                              DoubleVectorEnd = EnumVector.end();
>> +       DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
>> +    SameValueVector *V = *DoubleVectorIter;
>> +    if (V->size() == 1)
>> +      continue;
>> +
>> +    SameValueVector::iterator I = V->begin();
>> +    S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
>> +      << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +      << (*I)->getSourceRange();
>> +    ++I;
>> +    for (SameValueVector::iterator E = V->end(); I != E; ++I)
>> +      S.Diag((*I)->getLocation(), diag::note_duplicate_element)
>> +        << (*I)->getName() << (*I)->getInitVal().toString(10)
>> +        << (*I)->getSourceRange();
>> +    delete V;
>> +  }
> 
> 
> This is more or less the same, essentially it becomes:
> 
> for (DuplicateVector::iterator I = DupVector.begin(), E = DupVector.end(); I != E; ++I) {
>    ECDVector *Vec = *I;
>    // do the diagnostic logic ...
>    delete *I;
> }
> 
> Note that with my suggestions the vector has size on order of the number of duplicate constants, not the number of total constants.  If there are no duplicates, no work is required (including free'ing memory).
> 
>> +}
>> +
>>  void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation LBraceLoc,
>>                           SourceLocation RBraceLoc, Decl *EnumDeclX,
>>                           Decl **Elements, unsigned NumElements,
>> @@ -10709,6 +10868,7 @@
>>      DeclsInPrototypeScope.push_back(Enum);
>>  
>>    CheckForUniqueEnumValues(*this, Elements, NumElements, Enum, EnumType);
>> +  CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum, EnumType);
>>  }
>>  
>>  Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
> 
> I know this may all be nit-picky, but I really think trying to reduce the malloc() traffic is worth looking at to get a real understanding of the performance improvement that can be found here.
> 
> Thanks for forging ahead on this.
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120817/2ded43ee/attachment.html>