[PATCH] InstrProf: Calculate a better function hash

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed Apr 16 09:17:49 PDT 2014


On 2014-Apr-15, at 23:28, Chandler Carruth <chandlerc at google.com> wrote:

> I don't really understand the use of RAV here,

I think Bob's idea in r206039 is to avoid repeating the traversal logic.
I like the direction myself.

> or the logic of having a secondary switch, but at this point, there are too many patches flying and I feel like there is some more extensive discussion that took place somewhere other than this thread. =/

Not really.  Just coordination around how to stage changes we knew would
conflict.  In the end we reversed them.

> No fun when you get bounced back and forth in code review.

I can take it :).

> I'm happy to go with the implementation strategy as you have it. I may even write up a patch to show how I'm imagining you might simplify this. Anyways...
> 
> Some more microscopic nits. Feel free to commit whenever.

r206397

> I like where this design has ended up, and thanks for seeing it through. =]

Thanks for your help designing the hash -- much better than where this
started off!

> +/// \brief Stable hasher for PGO region counters.
> +///
> +/// PGOHash produces a stable hash of a given function's control flow.
> +///
> +/// Changing the output of this hash will invalidate all previously generated
> +/// profiles -- i.e., don't do it.
> +///
> +/// \note  When this hash does eventually change (years?), we still need to
> +/// support old hashes.  We'll need to pull in the version number from the
> +/// profile data format and use the matching hash function.
> +class PGOHash {
> +  constexpr static int NumBitsPerType = 6;
> +  constexpr static unsigned NumTypesPerWord =
> +      sizeof(uint64_t) * 8 / NumBitsPerType;
> +  constexpr static unsigned TooBig = 1u << NumBitsPerType;
> 
> I would sink these to live down with the static_assert, and sink all of it into the existing private section with member variables.

Yup, better to merge the private sections here.

Also, turns out MSVC doesn't like constexpr; I switched to static const.

> +
> +public:
> +  /// \brief Hash values for AST nodes.
> +  ///
> +  /// Distinct values for AST nodes that have region counters attached.
> +  ///
> +  /// These values must be stable.  All new members must be added at the end,
> +  /// and no members should be removed.  Changing the enumeration value for an
> +  /// AST node will affect the hash of every function that contains that node.
> +  enum HashType : unsigned char {
> +    None = 0,
> +    LabelStmt = 1,
> +    WhileStmt,
> +    DoStmt,
> +    ForStmt,
> +    CXXForRangeStmt,
> +    ObjCForCollectionStmt,
> +    SwitchStmt,
> +    CaseStmt,
> +    DefaultStmt,
> +    IfStmt,
> +    CXXTryStmt,
> +    CXXCatchStmt,
> +    ConditionalOperator,
> +    BinaryOperatorLAnd,
> +    BinaryOperatorLOr,
> +    BinaryConditionalOperator,
> +
> +    // Keep this last.  It's for the static assert that follows.
> +    LastHashType
> +  };
> +  static_assert(LastHashType <= TooBig, "Too many types in HashType");
> +
> +private:
> +  uint64_t Working;
> +  unsigned Count;
> +  llvm::MD5 MD5;
> +
> +public:
> +  // TODO: When this format changes, take in a version number here, and use the
> +  // old hash calculation for file formats that used the old hash.
> +  PGOHash() : Working(0), Count(0) {}
> +  void combine(HashType Type);
> +  uint64_t finalize();
> 
> I would just put these up with the public enum. Excepting private types or enums which are necessary to define prior to the public interface (because it is defined in terms of them), I generally prefer the public interface at the top followed by the private interface. Mild preference though.

My preference is to put member variables before functions, but I
shuffled things a little to make it cleaner.

> +};




More information about the cfe-commits mailing list