[cfe-commits] [Patch] -Wduplicate-enum which fixes PR6343
Richard Trieu
rtrieu at google.com
Tue Aug 28 19:23:31 PDT 2012
On Tue, Aug 28, 2012 at 6:21 PM, Ted Kremenek <kremenek at apple.com> wrote:
> Hi Richard,
>
> If I read these numbers correctly, the hash table algorithm (with O(n)
> performance) takes about 1.6-2% percent more than the control for runs 1-3,
> and hardly anything noticeable for the clang code base. Were runs 1-3 used
> in your earlier measurements,
>
Yes, these are the same runs I have been using. I earlier did some more
runs with smaller files, but the improvements to this warning made the
differences too small to detect, so they were dropped.
> where the sorting-based approach took about ~4% longer (or is that not the
> correct number)?
>
That is correct. The fastest sorting-based reached 4% difference.
> Just wanted an idea of where we are compared to the earlier measurements.
>
> Ted
>
> On Aug 28, 2012, at 3:53 PM, Richard Trieu <rtrieu at google.com> wrote:
>
> Timing information:
> Three clangs were run, clang with no changes (control), duplicate enum
> with PointerUnion (most recent patch), duplicate enum with DenseMap without
> PointerUnion (next most recent patch). Each run with -fsyntax-only and
> -Wduplicate-enum for modified clangs. Runs 1, 2, and 3 are files with only
> enums. Run 4 is a preprocessed Clang.
>
> Key:
> name: Average (Min-Max)
>
> Run1:
> Control: 13.763 (13.66-14.14)
> PointerUnion: 14.046 (13.94-14.16)
> DenseMap: 14.304 (14.24-14.39)
>
> Run2:
> Control: 20.189 (20.1-20.31)
> PointerUnion: 20.514 (20.37-20.6)
> DenseMap: 20.635 (20.56-20.7)
>
> Run3:
> Control: 26.715 (26.66-26.8)
> PointerUnion: 26.928 (26.8-27.12)
> DenseMap: 27.13 (27.05-27.22)
>
> Run4:
> Control: 29.686 (28.98-30.39)
> PointerUnion: 29.706 (28.73-30.69)
> DenseMap: 29.952 (29.3-30.63)
>
> On Tue, Aug 28, 2012 at 2:12 PM, Ted Kremenek <kremenek at apple.com> wrote:
>
>> How fast? (particularly compared to the first implementation, and
>> relative to -fsyntax-only)
>>
>> On Aug 28, 2012, at 11:48 AM, Richard Trieu <rtrieu at google.com> wrote:
>>
>> New patch with PointerUnion and DenseMap is slightly faster than the
>> previous DenseMap patch.
>>
>> On Mon, Aug 27, 2012 at 9:51 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>
>>> Thanks! Quick question before I review it in more details: what is the
>>> performance characteristics of this patch compared to the others?
>>>
>>> On Aug 27, 2012, at 11:35 AM, Richard Trieu <rtrieu at google.com> wrote:
>>>
>>> Incorporated most of the suggestions into this patch. Still using a
>>> double pass over the constants for the reasons outlined below.
>>>
>>> On Fri, Aug 17, 2012 at 10:00 PM, Ted Kremenek <kremenek at apple.com>wrote:
>>>
>>>> BTW, I wrote this two days ago. For some reason my mail client didn't
>>>> send it out until now. My apologies for the delay.
>>>>
>>> No worries. Had some issues that cropped up on template diffing that
>>> took my time.
>>>
>>>>
>>>> On Aug 15, 2012, at 10:11 PM, Ted Kremenek <kremenek at apple.com> wrote:
>>>>
>>>> On Aug 15, 2012, at 6:12 PM, Richard Trieu <rtrieu at google.com> wrote:
>>>>
>>>> On Tue, Aug 14, 2012 at 9:48 PM, Ted Kremenek <kremenek at apple.com>wrote:
>>>>
>>>>> On Aug 14, 2012, at 2:32 PM, Richard Trieu <rtrieu at google.com> wrote:
>>>>>
>>>>> At a high level, I honestly find this logic to be more complicated
>>>>>> than I would have expected. The sorting seems unnecessary, and will report
>>>>>> diagnostics in an unnatural order (first based on enum constant value, then
>>>>>> on declaration order). A straight linear pass seems more naturally to me,
>>>>>> and DenseMap is very efficient.
>>>>>>
>>>>> Is there a comparison between the different containers in LLVM and the
>>>>> STL containers?
>>>>>
>>>>>
>>>>> This is a reasonable place to start:
>>>>>
>>>>> http://llvm.org/docs/ProgrammersManual.html#ds_map
>>>>>
>>>>> The key with DenseMap is that it is probed hashtable. There is one
>>>>> big allocation for the entire table, instead of a bunch of buckets. When
>>>>> applicable, it can be very fast, and feels like the right data structure to
>>>>> use here.
>>>>>
>>>>
>>>> Duplicate enum detection, now with DenseMap. The DenseMap maps a
>>>> int64_t to a vector pointer. 0 and 1 were special keys for the DenseMap,
>>>> so two separate pointers special cased for them. The vectors pointers are
>>>> stored in another vector in declaration order. One pass is made over the
>>>> enums to find ones without initializers. These are used to create vectors.
>>>> A second pass through the enums populates the vectors. Finally, a pass
>>>> over the vector of vectors is used to generate all the warnings and notes.
>>>>
>>>> Run time is fairly consistent with the sorted vector implementation,
>>>> which is max %3 difference against control.
>>>> <duplicate-enum-densemap.patch>
>>>>
>>>>
>>>> Thanks for working on this. My main concern is this patch now has a
>>>> lot of unnecessary malloc() traffic, which will certainly slow it down.
>>>> Comments inline:
>>>>
>>>> +
>>>> +static int64_t GetInt64(const llvm::APSInt& Val) {
>>>> + return Val.isSigned() ? Val.getSExtValue() : Val.getZExtValue();
>>>> +}
>>>> +
>>>> +struct DenseMapInfoint64_t {
>>>> + static int64_t getEmptyKey() { return 0; }
>>>> + static int64_t getTombstoneKey() { return 1; }
>>>> + static unsigned getHashValue(const int64_t Val) {
>>>> + return (unsigned)(Val * 37);
>>>> + }
>>>> + static bool isEqual(const int64_t& LHS, const int64_t& RHS) {
>>>> + return LHS == RHS;
>>>> + }
>>>> +};
>>>>
>>>>
>>>> This trait class doesn't look like it was actually used. The DenseMap
>>>> below just uses the default trait for int64_t.
>>>>
>>>> I also still think we can so something a bit smarter here. What I
>>>> think we need to distinguish between is whether or not a constant has
>>>> appeared more than once. We're saving a bit of memory on the keys, but
>>>> spending that savings elsewhere when we allocate the vectors
>>>> unconditionally for each constant.
>>>>
>>>> +
>>>> +// Emits a warning when an element is implicitly set a value that
>>>> +// a previous element has already been set to.
>>>> +static void CheckForDuplicateEnumValues(Sema &S, Decl **Elements,
>>>> + unsigned NumElements, EnumDecl
>>>> *Enum,
>>>> + QualType EnumType) {
>>>> + if (S.Diags.getDiagnosticLevel(diag::warn_duplicate_enum_values,
>>>> + Enum->getLocation()) ==
>>>> + DiagnosticsEngine::Ignored)
>>>> + return;
>>>> + // Avoid anonymous enums
>>>> + if (!Enum->getIdentifier())
>>>> + return;
>>>> +
>>>> + // Only check for small enums.
>>>> + if (Enum->getNumPositiveBits() > 63 || Enum->getNumNegativeBits() >
>>>> 64)
>>>> + return;
>>>> +
>>>> + typedef llvm::SmallVector<EnumConstantDecl*, 4> SameValueVector;
>>>> + typedef llvm::DenseMap<int64_t, SameValueVector*> ValueToVectorMap;
>>>> + typedef llvm::SmallVector<SameValueVector*, 10> DoubleVector;
>>>> + ValueToVectorMap EnumMap;
>>>> + DoubleVector EnumVector;
>>>> + SameValueVector *ZeroVector = 0, *OneVector = 0;
>>>>
>>>>
>>>> It took me a while to understand what this was doing, so I feel it
>>>> could really benefit from a comment. This also appears to result in a ton
>>>> of malloc traffic below. Here's my suggestion:
>>>>
>>>> typedef llvm::SmallVector<EnumConstantDecl*, 3> ECDVector;
>>>> typedef llvm::SmallVector<ECDVector *, 3> DuplicatesVector;
>>>>
>>>> typedef llvm::PointerUnion<EnumConstantDecl*, ECDVector *>
>>>> DeclOrVector;
>>>> typedef llvm::DenseMap<int64_t, DeclOrVector> ValueToVectorMap;
>>>>
>>>> DuplicatesVector DupVector;
>>>> ValueToVectorMap EnumMap;
>>>>
>>>> The trick here is that the DenseMap maps from a constant to the first
>>>> EnumConstantDecl it encounters. Only if we encounter a second
>>>> EnumConstantDecl with the same enum value do we pay the cost of allocating
>>>> another vector. This will drastically optimize in the common case, as
>>>> calling malloc() is really slow. Right now the code appears to be doing a
>>>> malloc() for every enum constant, which is going to really penalize us here.
>>>>
>>>> +
>>>> + for (unsigned i = 0; i < NumElements; ++i) {
>>>> + EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>>>> + if (!ECD) {
>>>> + for (DoubleVector::iterator I = EnumVector.begin(), E =
>>>> EnumVector.end();
>>>> + I != E; ++I)
>>>> + delete *I;
>>>> + return;
>>>> + }
>>>>
>>>>
>>>> I don't quite understand this loop through DoubleVector here, but it
>>>> looks like logic in case we want to return early and cleanup. Is there a
>>>> case where the EnumConstantDecl can be null?
>>>>
>>>> According to ActOnEnumBody, EnumConstantDecl is null if a diagnostic
>>> has previously been emitted for the constant. Since the enum
>>> is possibly ill-formed, skip checking it.
>>>
>>>>
>>>> +
>>>> + if (ECD->getInitExpr())
>>>> + continue;
>>>> +
>>>> + int64_t Val = GetInt64(ECD->getInitVal());
>>>> +
>>>>
>>>>
>>>> Looks good.
>>>>
>>>> + if (Val == 0) {
>>>> + if (ZeroVector) continue;
>>>> + ZeroVector = new SameValueVector();
>>>> + ZeroVector->push_back(ECD);
>>>> + EnumVector.push_back(ZeroVector);
>>>> + } else if (Val == 1) {
>>>> + if (OneVector) continue;
>>>> + OneVector = new SameValueVector();
>>>> + OneVector->push_back(ECD);
>>>> + EnumVector.push_back(OneVector);
>>>> + } else {
>>>> + if (EnumMap.find(Val) != EnumMap.end())
>>>> + continue;
>>>> + SameValueVector *ValueVector = new SameValueVector();
>>>> + ValueVector->push_back(ECD);
>>>> + EnumVector.push_back(ValueVector);
>>>> + EnumMap.insert(std::make_pair(Val, ValueVector));
>>>>
>>>>
>>>> The "find()" followed by the "insert()" is wasteful. It results in two
>>>> lookups to the hash table when we could have just used one. More on that
>>>> later.
>>>>
>>>> + }
>>>> + }
>>>>
>>>>
>>>> IMO, this looks like a lot of complexity just to handle the fact that 0
>>>> and 1 are special values for the DenseMap. I don't really see this as the
>>>> right tradeoff; the code is more complicated with marginal impact on memory
>>>> usage or performance.
>>>>
>>>> If you humor me for a bit, consider using something else for the key,
>>>> e.g.:
>>>>
>>>> struct DupKey {
>>>> int64_t val;
>>>> bool isTombstoneOrEmptyKey;
>>>> };
>>>>
>>>> The idea is if 'isTombStoneOrEmptyKey' is true, we can use val = 0 or
>>>> val = 1 to represent empty keys or tombstone entries. Otherwise, it's an
>>>> int64_t, with the full range of values. We can define a DenseMap trait to
>>>> do the right thing. Yes, this costs a tiny bit more in storage, but it
>>>> allows the data structure to handle the complete set of values in your
>>>> domain, instead of resorting to complicating the core algorithm. What I
>>>> see here now is the same code essentially duplicated twice, which makes it
>>>> harder to read and more error prone.
>>>>
>>>> If we use DupKey as our key for the DenseMap, we can instead do
>>>> something like this:
>>>>
>>>> DeclOrVector &entry = EnumMap[Val]; // Use default construction of
>>>> 'entry'.
>>>> // Is the first time we encountered this constant?
>>>> if (entry.isNull()) {
>>>> entry = ECD;
>>>> continue;
>>>> }
>>>> // Is this the second time we encountered this constant? If so,
>>>> // push the previous decl encountered and the one just encountered
>>>> // to a vector of duplicates.
>>>> if (EnumConstantDecl *D = entry.dyn_cast<EnumConstantDecl*>()) {
>>>> ECDVector *Vec = new ECDVector();
>>>> Vec->push_back(D);
>>>> Vec->push_back(ECD);
>>>>
>>>> // Update the entry to refer to the duplicates.
>>>> entry = Vec;
>>>>
>>>> // Store the duplicates in a vector we can consult later for
>>>> // quick emission of diagnostics.
>>>> DupVector.push_back(Vec);
>>>>
>>>> // On to the next constant.
>>>> continue;
>>>> }
>>>> // Is this the third (or greater) time we encountered the constant?
>>>> If so,
>>>> // continue to add it to the existing vector.
>>>> ECDVector *Vec = entry.get<ECDVector*>();
>>>> Vec->push_back(ECD);
>>>>
>>>>
>>>> With this code, we only allocate memory (beyond the DenseMap) when we
>>>> encounter a duplicate that would be worth reporting. In the common case,
>>>> this savings in malloc traffic should be noticeable.
>>>>
>>>> Notice also that I used:
>>>>
>>>> DeclOrVector &entry = EnumMap[Val]; // Use default construction
>>>> of 'entry'.
>>>>
>>>> This results in a single lookup in the hashtable. Since we plan on
>>>> adding a value for a key no matter what, by using this idiom we allow the
>>>> DenseMap to default construct an entry if it doesn't exist. This results
>>>> in a single hashtable lookup, from which we can modify the value in place.
>>>> This is obviously faster than doing a hashtable lookup twice.
>>>>
>>>> +
>>>> + for (unsigned i = 0; i < NumElements; ++i) {
>>>> + EnumConstantDecl *ECD = cast<EnumConstantDecl>(Elements[i]);
>>>> + if (!ValidDuplicateEnum(ECD, Enum))
>>>> + continue;
>>>> +
>>>> + int64_t Val = GetInt64(ECD->getInitVal());
>>>> +
>>>> + if (Val == 0) {
>>>> + if (!ZeroVector || *ZeroVector->begin() == ECD)
>>>> + continue;
>>>> + ZeroVector->push_back(ECD);
>>>> + } else if (Val == 1) {
>>>> + if (!OneVector || *OneVector->begin() == ECD)
>>>> + continue;
>>>> + OneVector->push_back(ECD);
>>>> + } else {
>>>> + ValueToVectorMap::iterator I = EnumMap.find(Val);
>>>> + if (I == EnumMap.end())
>>>> + continue;
>>>> + SameValueVector *V = I->second;
>>>> + if (*V->begin() == ECD)
>>>> + continue;
>>>> + V->push_back(ECD);
>>>> + }
>>>> + }
>>>>
>>>>
>>>> This second loop looks unnecessary. I think we can do everything we
>>>> need to count duplicates with one loop. Of course the ValidDuplicateEnum()
>>>> would need to be hoisted to the first loop.
>>>>
>>>> Using two traverses allows two things to happen. One, the first
>>> element in the ECDVector will not have an initializer and will work with
>>> the warning. Otherwise, the vector needs to be searched for a proper enum
>>> constant to use. Two, it prevents unneeded creation of ECDVectors. If we
>>> have enum A { A1 = 2, A2 = 2, A3 = 1, A4 = 1, A5}; vectors for values 1 and
>>> 2 are created using a single pass while only a vector for 2 will be created
>>> using a double pass.
>>>
>>>>
>>>> +
>>>> + for (DoubleVector::iterator DoubleVectorIter = EnumVector.begin(),
>>>> + DoubleVectorEnd = EnumVector.end();
>>>> + DoubleVectorIter != DoubleVectorEnd; ++DoubleVectorIter) {
>>>> + SameValueVector *V = *DoubleVectorIter;
>>>> + if (V->size() == 1)
>>>> + continue;
>>>> +
>>>> + SameValueVector::iterator I = V->begin();
>>>> + S.Diag((*I)->getLocation(), diag::warn_duplicate_enum_values)
>>>> + << (*I)->getName() << (*I)->getInitVal().toString(10)
>>>> + << (*I)->getSourceRange();
>>>> + ++I;
>>>> + for (SameValueVector::iterator E = V->end(); I != E; ++I)
>>>> + S.Diag((*I)->getLocation(), diag::note_duplicate_element)
>>>> + << (*I)->getName() << (*I)->getInitVal().toString(10)
>>>> + << (*I)->getSourceRange();
>>>> + delete V;
>>>> + }
>>>>
>>>>
>>>>
>>>> This is more or less the same, essentially it becomes:
>>>>
>>>> for (DuplicateVector::iterator I = DupVector.begin(), E =
>>>> DupVector.end(); I != E; ++I) {
>>>> ECDVector *Vec = *I;
>>>> // do the diagnostic logic ...
>>>> delete *I;
>>>> }
>>>>
>>>> Note that with my suggestions the vector has size on order of the
>>>> number of duplicate constants, not the number of total constants. If there
>>>> are no duplicates, no work is required (including free'ing memory).
>>>>
>>>> +}
>>>> +
>>>> void Sema::ActOnEnumBody(SourceLocation EnumLoc, SourceLocation
>>>> LBraceLoc,
>>>> SourceLocation RBraceLoc, Decl *EnumDeclX,
>>>> Decl **Elements, unsigned NumElements,
>>>> @@ -10709,6 +10868,7 @@
>>>> DeclsInPrototypeScope.push_back(Enum);
>>>>
>>>> CheckForUniqueEnumValues(*this, Elements, NumElements, Enum,
>>>> EnumType);
>>>> + CheckForDuplicateEnumValues(*this, Elements, NumElements, Enum,
>>>> EnumType);
>>>> }
>>>>
>>>> Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
>>>>
>>>>
>>>> I know this may all be nit-picky, but I really think trying to reduce
>>>> the malloc() traffic is worth looking at to get a real understanding of the
>>>> performance improvement that can be found here.
>>>>
>>>> Thanks for forging ahead on this.
>>>> _______________________________________________
>>>> cfe-commits mailing list
>>>> cfe-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>
>>>>
>>>>
>>> <duplicate-enum-densemap2.patch>
>>>
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120828/65c7f760/attachment.html>
More information about the cfe-commits
mailing list