[cfe-commits] [Patch] -Wduplicate-enum which fixes PR6343
Richard Trieu
rtrieu at google.com
Tue Aug 28 11:48:58 PDT 2012
New patch with PointerUnion and DenseMap is slightly faster than the
previous DenseMap patch.
On Mon, Aug 27, 2012 at 9:51 PM, Ted Kremenek <kremenek at apple.com> wrote:
> Thanks! Quick question before I review it in more details: what is the
> performance characteristics of this patch compared to the others?
>
> On Aug 27, 2012, at 11:35 AM, Richard Trieu <rtrieu at google.com> wrote:
>
> Incorporated most of the suggestions into this patch. Still using a
> double pass over the constants for the reasons outlined below.
>
> On Fri, Aug 17, 2012 at 10:00 PM, Ted Kremenek <kremenek at apple.com> wrote:
>
>> BTW, I wrote this two days ago. For some reason my mail client didn't
>> send it out until now. My apologies for the delay.
>>
> No worries. Had some issues that cropped up on template diffing that took
> my time.
>
>>
>> On Aug 15, 2012, at 10:11 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>
>> On Aug 15, 2012, at 6:12 PM, Richard Trieu <rtrieu at google.com> wrote:
>>
>> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>
>>> On Aug 14, 2012, at 2:32 PM, Richard Trieu <rtrieu at google.com> wrote:
>>>
>>> At a high level, I honestly find this logic to be more complicated
>>>> than I would have expected. The sorting seems unnecessary, and will report
>>>> diagnostics in an unnatural order (first based on enum constant value, then
>>>> on declaration order). A straight linear pass seems more naturally to me,
>>>> and DenseMap is very efficient.
>>>>
>>> Is there a comparison between the different containers in LLVM and the
>>> STL containers?
>>>
>>>
>>> This is a reasonable place to start:
>>>
>>> http://llvm.org/docs/ProgrammersManual.html#ds_map
>>>
>>> The key with DenseMap is that it is probed hashtable. There is one big
>>> allocation for the entire table, instead of a bunch of buckets. When
>>> applicable, it can be very fast, and feels like the right data structure to
>>> use here.
>>>
>>
>> Duplicate enum detection, now with DenseMap. The DenseMap maps a int64_t
>> to a vector pointer. 0 and 1 were special keys for the DenseMap, so two
>> separate pointers special cased for them. The vectors pointers are stored
>> in another vector in declaration order. One pass is made over the enums to
>> find ones without initializers. These are used to create vectors. A
>> second pass through the enums populates the vectors. Finally, a pass over
>> the vector of vectors is used to generate all the warnings and notes.
>>
>> Run time is fairly consistent with the sorted vector implementation,
>> which is max %3 difference against control.
>> <duplicate-enum-densemap.patch>
>>
>>
>> Thanks for working on this. My main concern is this patch now has a lot
>> of unnecessary malloc() traffic, which will certainly slow it down.
>> Comments inline:
>>
>> +
>> +static int64_t GetInt64(const llvm::APSInt& Val) {
>> + return Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
>> +}
>> +
>> +struct DenseMapInfoint64_t {
>> + static int64_t getEmptyKey() { return 0; }
>> + static int64_t getTombstoneKey() { return 1; }
>> + static unsigned getHashValue(const int64_t Val) {
>> + return (unsigned)(Val * 37);
>> + }
>> + static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
>> + return LHS == RHS;
>> + }
>> +};
>>
>>
>> This trait class doesn't look like it was actually used. The DenseMap
>> below just uses the default trait for int64_t.
>>
>> I also still think we can so something a bit smarter here. What I think
>> we need to distinguish between is whether or not a constant has appeared
>> more than once. We're saving a bit of memory on the keys, but spending
>> that savings elsewhere when we allocate the vectors unconditionally for
>> each constant.
>>
>> +
>> +// Emits a warning when an element is implicitly set a value that
>> +// a previous element has already been set to.
>> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
>> + unsigned NumElements, EnumDecl
>> *Enum,
>> + QualType EnumType) {
>> + if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
>> + Enum->getLocation()) ==
>> + DiagnosticsEngine::Ignored)
>> + return;
>> + // Avoid anonymous enums
>> + if (!Enum->getIdentifier())
>> + return;
>> +
>> + // Only check for small enums.
>> + if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() > 64)
>> + return;
>> +
>> + typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
>> + typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
>> + typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
>> + ValueToVectorMap EnumMap;
>> + DoubleVector EnumVector;
>> + SameValueVector *ZeroVector = 0, *OneVector = 0;
>>
>>
>> It took me a while to understand what this was doing, so I feel it could
>> really benefit from a comment. This also appears to result in a ton of
>> malloc traffic below. Here's my suggestion:
>>
>> typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
>> typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;
>>
>> typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *> DeclOrVector;
>> typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;
>>
>> DuplicatesVector DupVector;
>> ValueToVectorMap EnumMap;
>>
>> The trick here is that the DenseMap maps from a constant to the first
>> EnumConstantDecl it encounters. Only if we encounter a second
>> EnumConstantDecl with the same enum value do we pay the cost of allocating
>> another vector. This will drastically optimize in the common case, as
>> calling malloc() is really slow. Right now the code appears to be doing a
>> malloc() for every enum constant, which is going to really penalize us here.
>>
>> +
>> + for (unsigned i = 0; i < NumElements; ++i) {
>> + EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> + if (!ECD) {
>> + for (DoubleVector::iterator I = EnumVector.begin(), E =
>> EnumVector.end();
>> + I != E; ++I)
>> + delete *I;
>> + return;
>> + }
>>
>>
>> I don't quite understand this loop through DoubleVector here, but it
>> looks like logic in case we want to return early and cleanup. Is there a
>> case where the EnumConstantDecl can be null?
>>
>> According to ActOnEnumBody, EnumConstantDecl is null if a diagnostic has
> previously been emitted for the constant. Since the enum
> is possibly ill-formed, skip checking it.
>
>>
>> +
>> + if (ECD->getInitExpr())
>> + continue;
>> +
>> + int64_t Val = GetInt64(ECD->getInitVal());
>> +
>>
>>
>> Looks good.
>>
>> + if (Val == 0) {
>> + if (ZeroVector) continue;
>> + ZeroVector = new SameValueVector();
>> + ZeroVector->push_back(ECD);
>> + EnumVector.push_back(ZeroVector);
>> + } else if (Val == 1) {
>> + if (OneVector) continue;
>> + OneVector = new SameValueVector();
>> + OneVector->push_back(ECD);
>> + EnumVector.push_back(OneVector);
>> + } else {
>> + if (EnumMap.find(Val) != EnumMap.end())
>> + continue;
>> + SameValueVector *ValueVector = new SameValueVector();
>> + ValueVector->push_back(ECD);
>> + EnumVector.push_back(ValueVector);
>> + EnumMap.insert(std::make_pair(Val, ValueVector));
>>
>>
>> The "find()" followed by the "insert()" is wasteful. It results in two
>> lookups to the hash table when we could have just used one. More on that
>> later.
>>
>> + }
>> + }
>>
>>
>> IMO, this looks like a lot of complexity just to handle the fact that 0
>> and 1 are special values for the DenseMap. I don't really see this as the
>> right tradeoff; the code is more complicated with marginal impact on memory
>> usage or performance.
>>
>> If you humor me for a bit, consider using something else for the key,
>> e.g.:
>>
>> struct DupKey {
>> int64_t val;
>> bool isTombstoneOrEmptyKey;
>> };
>>
>> The idea is if 'isTombStoneOrEmptyKey' is true, we can use val = 0 or val
>> = 1 to represent empty keys or tombstone entries. Otherwise, it's an
>> int64_t, with the full range of values. We can define a DenseMap trait to
>> do the right thing. Yes, this costs a tiny bit more in storage, but it
>> allows the data structure to handle the complete set of values in your
>> domain, instead of resorting to complicating the core algorithm. What I
>> see here now is the same code essentially duplicated twice, which makes it
>> harder to read and more error prone.
>>
>> If we use DupKey as our key for the DenseMap, we can instead do something
>> like this:
>>
>> DeclOrVector &entry = EnumMap[Val]; // Use default construction of
>> 'entry'.
>> // Is the first time we encountered this constant?
>> if (entry.isNull()) {
>> entry = ECD;
>> continue;
>> }
>> // Is this the second time we encountered this constant? If so,
>> // push the previous decl encountered and the one just encountered
>> // to a vector of duplicates.
>> if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
>> ECDVector *Vec = new ECDVector();
>> Vec->push_back(D);
>> Vec->push_back(ECD);
>>
>> // Update the entry to refer to the duplicates.
>> entry = Vec;
>>
>> // Store the duplicates in a vector we can consult later for
>> // quick emission of diagnostics.
>> DupVector.push_back(Vec);
>>
>> // On to the next constant.
>> continue;
>> }
>> // Is this the third (or greater) time we encountered the constant?
>> If so,
>> // continue to add it to the existing vector.
>> ECDVector *Vec = entry.get<ECDVector*>();
>> Vec->push_back(ECD);
>>
>>
>> With this code, we only allocate memory (beyond the DenseMap) when we
>> encounter a duplicate that would be worth reporting. In the common case,
>> this savings in malloc traffic should be noticeable.
>>
>> Notice also that I used:
>>
>> DeclOrVector &entry = EnumMap[Val]; // Use default construction of
>> 'entry'.
>>
>> This results in a single lookup in the hashtable. Since we plan on
>> adding a value for a key no matter what, by using this idiom we allow the
>> DenseMap to default construct an entry if it doesn't exist. This results
>> in a single hashtable lookup, from which we can modify the value in place.
>> This is obviously faster than doing a hashtable lookup twice.
>>
>> +
>> + for (unsigned i = 0; i < NumElements; ++i) {
>> + EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>> + if (!ValidDuplicateEnum(ECD, Enum))
>> + continue;
>> +
>> + int64_t Val = GetInt64(ECD->getInitVal());
>> +
>> + if (Val == 0) {
>> + if (!ZeroVector || *ZeroVector->begin() == ECD)
>> + continue;
>> + ZeroVector->push_back(ECD);
>> + } else if (Val == 1) {
>> + if (!OneVector || *OneVector->begin() == ECD)
>> + continue;
>> + OneVector->push_back(ECD);
>> + } else {
>> + ValueToVectorMap::iterator I = EnumMap.find(Val);
>> + if (I == EnumMap.end())
>> + continue;
>> + SameValueVector *V = I->second;
>> + if (*V->begin() == ECD)
>> + continue;
>> + V->push_back(ECD);
>> + }
>> + }
>>
>>
>> This second loop looks unnecessary. I think we can do everything we need
>> to count duplicates with one loop. Of course the ValidDuplicateEnum()
>> would need to be hoisted to the first loop.
>>
>> Using two traverses allows two things to happen. One, the first element
> in the ECDVector will not have an initializer and will work with the
> warning. Otherwise, the vector needs to be searched for a proper enum
> constant to use. Two, it prevents unneeded creation of ECDVectors. If we
> have enum A { A1 = 2, A2 = 2, A3 = 1, A4 = 1, A5}; vectors for values 1 and
> 2 are created using a single pass while only a vector for 2 will be created
> using a double pass.
>
>>
>> +
>> + for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
>> + DoubleVectorEnd = EnumVector.end();
>> + DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
>> + SameValueVector *V = *DoubleVectorIter;
>> + if (V->size() == 1)
>> + continue;
>> +
>> + SameValueVector::iterator I = V->begin();
>> + S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
>> + << (*I)->getName() << (*I)->getInitVal().toString(10)
>> + << (*I)->getSourceRange();
>> + ++I;
>> + for (SameValueVector::iterator E = V->end(); I != E; ++I)
>> + S.Diag((*I)->getLocation(), diag::note_duplicate_element)
>> + << (*I)->getName() << (*I)->getInitVal().toString(10)
>> + << (*I)->getSourceRange();
>> + delete V;
>> + }
>>
>>
>>
>> This is more or less the same, essentially it becomes:
>>
>> for (DuplicateVector::iterator I = DupVector.begin(), E =
>> DupVector.end(); I != E; ++I) {
>> ECDVector *Vec = *I;
>> // do the diagnostic logic ...
>> delete *I;
>> }
>>
>> Note that with my suggestions the vector has size on order of the number
>> of duplicate constants, not the number of total constants. If there are no
>> duplicates, no work is required (including free'ing memory).
>>
>> +}
>> +
>> void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation
>> LBraceLoc,
>> SourceLocation RBraceLoc, Decl *EnumDeclX,
>> Decl **Elements, unsigned NumElements,
>> @@ -10709,6 +10868,7 @@
>> DeclsInPrototypeScope.push_back(Enum);
>>
>> CheckForUniqueEnumValues(*this, Elements, NumElements, Enum, EnumType);
>> + CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum,
>> EnumType);
>> }
>>
>> Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
>>
>>
>> I know this may all be nit-picky, but I really think trying to reduce the
>> malloc() traffic is worth looking at to get a real understanding of the
>> performance improvement that can be found here.
>>
>> Thanks for forging ahead on this.
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>
>>
>>
> <duplicate-enum-densemap2.patch>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120828/3dd38ccc/attachment.html>
More information about the cfe-commits
mailing list