[llvm] r284887 - Switch SmallSetVector to use DenseSet when it overflows its inline space.

Sat Oct 22 23:02:22 PDT 2016

> On Oct 22, 2016, at 11:00 PM, Quentin Colombet via llvm-commits <llvm-commits at lists.llvm.org> wrote:
> 
> I believe right now this is passed directly via a template, i.e., constant argument.
I haven’t checked though :P.

> We could probably do something statically, but I was afraid it starts to look ugly.
> 
>> On Oct 22, 2016, at 10:57 PM, escha at apple.com wrote:
>> 
>> Why not round up the initial bucket size to the nearest power of 2 when converting to a DenseMap?
>> 
>> —escha
>> 
>>> On Oct 22, 2016, at 10:01 PM, Quentin Colombet via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>> 
>>> Hi Justin,
>>> 
>>> This change has a nasty side effect that we should either document, or revert to the previous implementation.
>>> The DenseSet class has a requirement that the size of the initial bucket must be a power of two.
>>> 
>>> Therefore, if you create a SmallSectVector for a non-power of two, you get an assertion later on in the initialization of the DenseMap.
>>> In release mode, the compiler crashes in weird way, but at this point that’s expected.
>>> If you want to reproduce the problem, just create a small program with a SmallSetVector of 5.
>>> 
>>> I am not sure I like that we impose this limitation on the size but given we already do on the DenseMap I can live with that.
>>> At the very least, this change should be documented in the API.
>>> 
>>> LLVM is fine in using only power of 2 sizes, but we don’t know about external users.
>>> 
>>> Cheers,
>>> -Quentin
>>> 
>>>> On Oct 21, 2016, at 2:45 PM, Justin Lebar via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>> 
>>>> Author: jlebar
>>>> Date: Fri Oct 21 16:45:01 2016
>>>> New Revision: 284887
>>>> 
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=284887&view=rev
>>>> Log:
>>>> Switch SmallSetVector to use DenseSet when it overflows its inline space.
>>>> 
>>>> Summary:
>>>> SetVector already used DenseSet, but SmallSetVector used std::set.  This
>>>> leads to surprising performance differences.  Moreover, it means that
>>>> the set of key types accepted by SetVector and SmallSetVector are
>>>> quite different!
>>>> 
>>>> In order to make this change, we had to convert some callsites that used
>>>> SmallSetVector<std::string, N> to use SmallSetVector<CachedHashString, N>
>>>> instead.
>>>> 
>>>> Reviewers: timshen
>>>> 
>>>> Subscribers: llvm-commits
>>>> 
>>>> Differential Revision: https://reviews.llvm.org/D25648
>>>> 
>>>> Modified:
>>>> llvm/trunk/include/llvm/ADT/SetVector.h
>>>> llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp
>>>> llvm/trunk/utils/TableGen/FixedLenDecoderEmitter.cpp
>>>> 
>>>> Modified: llvm/trunk/include/llvm/ADT/SetVector.h
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/ADT/SetVector.h?rev=284887&r1=284886&r2=284887&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/include/llvm/ADT/SetVector.h (original)
>>>> +++ llvm/trunk/include/llvm/ADT/SetVector.h Fri Oct 21 16:45:01 2016
>>>> @@ -282,7 +282,8 @@ private:
>>>> /// \brief A SetVector that performs no allocations if smaller than
>>>> /// a certain size.
>>>> template <typename T, unsigned N>
>>>> -class SmallSetVector : public SetVector<T, SmallVector<T, N>, SmallSet<T, N> > {
>>>> +class SmallSetVector
>>>> +    : public SetVector<T, SmallVector<T, N>, SmallDenseSet<T, N>> {
>>>> public:
>>>> SmallSetVector() {}
>>>> 
>>>> 
>>>> Modified: llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp?rev=284887&r1=284886&r2=284887&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp (original)
>>>> +++ llvm/trunk/utils/TableGen/AsmMatcherEmitter.cpp Fri Oct 21 16:45:01 2016
>>>> @@ -97,6 +97,7 @@
>>>> //===----------------------------------------------------------------------===//
>>>> 
>>>> #include "CodeGenTarget.h"
>>>> +#include "llvm/ADT/CachedHashString.h"
>>>> #include "llvm/ADT/PointerUnion.h"
>>>> #include "llvm/ADT/STLExtras.h"
>>>> #include "llvm/ADT/SmallPtrSet.h"
>>>> @@ -1822,10 +1823,11 @@ void MatchableInfo::buildAliasResultOper
>>>> }
>>>> }
>>>> 
>>>> -static unsigned getConverterOperandID(const std::string &Name,
>>>> -                                      SmallSetVector<std::string, 16> &Table,
>>>> -                                      bool &IsNew) {
>>>> -  IsNew = Table.insert(Name);
>>>> +static unsigned
>>>> +getConverterOperandID(const std::string &Name,
>>>> +                      SmallSetVector<CachedHashString, 16> &Table,
>>>> +                      bool &IsNew) {
>>>> +  IsNew = Table.insert(CachedHashString(Name));
>>>> 
>>>> unsigned ID = IsNew ? Table.size() - 1 : find(Table, Name) - Table.begin();
>>>> 
>>>> @@ -1838,8 +1840,8 @@ static void emitConvertFuncs(CodeGenTarg
>>>>                           std::vector<std::unique_ptr<MatchableInfo>> &Infos,
>>>>                           bool HasMnemonicFirst, bool HasOptionalOperands,
>>>>                           raw_ostream &OS) {
>>>> -  SmallSetVector<std::string, 16> OperandConversionKinds;
>>>> -  SmallSetVector<std::string, 16> InstructionConversionKinds;
>>>> +  SmallSetVector<CachedHashString, 16> OperandConversionKinds;
>>>> +  SmallSetVector<CachedHashString, 16> InstructionConversionKinds;
>>>> std::vector<std::vector<uint8_t> > ConversionTable;
>>>> size_t MaxRowLength = 2; // minimum is custom converter plus terminator.
>>>> 
>>>> @@ -1911,9 +1913,9 @@ static void emitConvertFuncs(CodeGenTarg
>>>> 
>>>> // Pre-populate the operand conversion kinds with the standard always
>>>> // available entries.
>>>> -  OperandConversionKinds.insert("CVT_Done");
>>>> -  OperandConversionKinds.insert("CVT_Reg");
>>>> -  OperandConversionKinds.insert("CVT_Tied");
>>>> +  OperandConversionKinds.insert(CachedHashString("CVT_Done"));
>>>> +  OperandConversionKinds.insert(CachedHashString("CVT_Reg"));
>>>> +  OperandConversionKinds.insert(CachedHashString("CVT_Tied"));
>>>> enum { CVT_Done, CVT_Reg, CVT_Tied };
>>>> 
>>>> for (auto &II : Infos) {
>>>> @@ -1925,13 +1927,13 @@ static void emitConvertFuncs(CodeGenTarg
>>>>    II->ConversionFnKind = Signature;
>>>> 
>>>>    // Check if we have already generated this signature.
>>>> -      if (!InstructionConversionKinds.insert(Signature))
>>>> +      if (!InstructionConversionKinds.insert(CachedHashString(Signature)))
>>>>      continue;
>>>> 
>>>>    // Remember this converter for the kind enum.
>>>>    unsigned KindID = OperandConversionKinds.size();
>>>> -      OperandConversionKinds.insert("CVT_" +
>>>> -                                    getEnumNameForToken(AsmMatchConverter));
>>>> +      OperandConversionKinds.insert(
>>>> +          CachedHashString("CVT_" + getEnumNameForToken(AsmMatchConverter)));
>>>> 
>>>>    // Add the converter row for this instruction.
>>>>    ConversionTable.emplace_back();
>>>> @@ -2113,7 +2115,7 @@ static void emitConvertFuncs(CodeGenTarg
>>>> 
>>>>  // Save the signature. If we already have it, don't add a new row
>>>>  // to the table.
>>>> -    if (!InstructionConversionKinds.insert(Signature))
>>>> +    if (!InstructionConversionKinds.insert(CachedHashString(Signature)))
>>>>    continue;
>>>> 
>>>>  // Add the row to the table.
>>>> @@ -2130,14 +2132,14 @@ static void emitConvertFuncs(CodeGenTarg
>>>> 
>>>> // Output the operand conversion kind enum.
>>>> OS << "enum OperatorConversionKind {\n";
>>>> -  for (const std::string &Converter : OperandConversionKinds)
>>>> +  for (const auto &Converter : OperandConversionKinds)
>>>>  OS << "  " << Converter << ",\n";
>>>> OS << "  CVT_NUM_CONVERTERS\n";
>>>> OS << "};\n\n";
>>>> 
>>>> // Output the instruction conversion kind enum.
>>>> OS << "enum InstructionConversionKind {\n";
>>>> -  for (const std::string &Signature : InstructionConversionKinds)
>>>> +  for (const auto &Signature : InstructionConversionKinds)
>>>>  OS << "  " << Signature << ",\n";
>>>> OS << "  CVT_NUM_SIGNATURES\n";
>>>> OS << "};\n\n";
>>>> 
>>>> Modified: llvm/trunk/utils/TableGen/FixedLenDecoderEmitter.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/utils/TableGen/FixedLenDecoderEmitter.cpp?rev=284887&r1=284886&r2=284887&view=diff
>>>> ==============================================================================
>>>> --- llvm/trunk/utils/TableGen/FixedLenDecoderEmitter.cpp (original)
>>>> +++ llvm/trunk/utils/TableGen/FixedLenDecoderEmitter.cpp Fri Oct 21 16:45:01 2016
>>>> @@ -14,6 +14,7 @@
>>>> 
>>>> #include "CodeGenTarget.h"
>>>> #include "llvm/ADT/APInt.h"
>>>> +#include "llvm/ADT/CachedHashString.h"
>>>> #include "llvm/ADT/SmallString.h"
>>>> #include "llvm/ADT/StringExtras.h"
>>>> #include "llvm/ADT/StringRef.h"
>>>> @@ -66,8 +67,8 @@ typedef std::vector<uint8_t> DecoderTabl
>>>> typedef uint32_t DecoderFixup;
>>>> typedef std::vector<DecoderFixup> FixupList;
>>>> typedef std::vector<FixupList> FixupScopeList;
>>>> -typedef SmallSetVector<std::string, 16> PredicateSet;
>>>> -typedef SmallSetVector<std::string, 16> DecoderSet;
>>>> +typedef SmallSetVector<CachedHashString, 16> PredicateSet;
>>>> +typedef SmallSetVector<CachedHashString, 16> DecoderSet;
>>>> struct DecoderTableInfo {
>>>> DecoderTable Table;
>>>> FixupScopeList FixupStack;
>>>> @@ -1106,7 +1107,7 @@ unsigned FilterChooser::getDecoderIndex(
>>>> // overkill for now, though.
>>>> 
>>>> // Make sure the predicate is in the table.
>>>> -  Decoders.insert(StringRef(Decoder));
>>>> +  Decoders.insert(CachedHashString(Decoder));
>>>> // Now figure out the index for when we write out the table.
>>>> DecoderSet::const_iterator P = find(Decoders, Decoder.str());
>>>> return (unsigned)(P - Decoders.begin());
>>>> @@ -1179,9 +1180,9 @@ unsigned FilterChooser::getPredicateInde
>>>> // overkill for now, though.
>>>> 
>>>> // Make sure the predicate is in the table.
>>>> -  TableInfo.Predicates.insert(Predicate.str());
>>>> +  TableInfo.Predicates.insert(CachedHashString(Predicate));
>>>> // Now figure out the index for when we write out the table.
>>>> -  PredicateSet::const_iterator P = find(TableInfo.Predicates, Predicate.str());
>>>> +  PredicateSet::const_iterator P = find(TableInfo.Predicates, Predicate);
>>>> return (unsigned)(P - TableInfo.Predicates.begin());
>>>> }
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits