[PATCH] D40731: Integrate CHash into CLang

Christian Dietrich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 12 08:43:16 PDT 2018


stettberger added inline comments.


================
Comment at: include/clang/AST/CHashVisitor.h:72-79
+  template<typename T>
+  void addData(const llvm::iterator_range<T> &x) {
+    addData(std::distance(x.begin(), x.end()));
+  }
+  template<typename T>
+  void addData(const llvm::ArrayRef<T> &x) {
+    addData(x.size());
----------------
rtrieu wrote:
> Should these also be storing the hashes of the elements in addition to storing the size?
Ok. So this is called from 

- FunctionProtoTypes { addData(S->exceptions()); }
- DecompositionDecl { addData(S->bindings()); }
- ObjCObjectType      { addData(S->getTypeArgsAsWritten()); }

All of these occurrences are vistied by the RecursiveASTVisitor. However, we have no Idea how many children are visited by these for loops. Therefore, we additionally call addData(ArrayRef<...> x)  to indicate give the actual user of StmtCollector* the possibility to add these lengths to the hash. This is also the reason, why we add only the std::distance to the length. I will add a comment to this.


================
Comment at: include/clang/AST/CHashVisitor.h:211
+
+  bool VisitTypeDecl(TypeDecl *D) {
+    // If we would hash the resulting type for a typedef, we
----------------
rtrieu wrote:
> I thought the idea was to have all the code in the *Collectors.td but here you have a bunch of Visit* functions for AST nodes.  This seems counter to your point of having them generated.
See Mail.


================
Comment at: include/clang/AST/CHashVisitor.h:212-213
+  bool VisitTypeDecl(TypeDecl *D) {
+    // If we would hash the resulting type for a typedef, we
+    // would get into an endless recursion.
+    if (!isa<TypedefNameDecl>(D) && !isa<RecordDecl>(D) && !isa<EnumDecl>(D)) {
----------------
rtrieu wrote:
> I don't see this example in the test.
It is now reflected in the tests.


================
Comment at: include/clang/AST/CHashVisitor.h:226
+    }
+    if (isa<VarDecl>(ValDecl)) {
+      /* We emulate TraverseDecl here for VarDecl, because we
----------------
rtrieu wrote:
> Can't you query the hash for the Decl instead of doing work here?
How would I do it. As the example shows, there is no hash for a, when we try to calculate the hash for a. However, I included a  unittest to cover this.


================
Comment at: include/clang/AST/CHashVisitor.h:253
+  bool VisitValueDecl(ValueDecl *D) {
+    /* Field Declarations can induce recursions */
+    if (isa<FieldDecl>(D)) {
----------------
rtrieu wrote:
> Is this possible recursion included in your test?
It  now is.


================
Comment at: include/clang/AST/CHashVisitor.h:255
+    if (isa<FieldDecl>(D)) {
+      addData(std::string(D->getType().getAsString()));
+    } else {
----------------
rtrieu wrote:
> Shouldn't this be handled in the VisitFieldDecl function?
I must avoid to call addData(D->getType()) in VisitValueDecl for FieldDeclarations. So there would also be a conditional.


================
Comment at: include/clang/AST/DeclDataCollectors.td:5-7
+      // Every Declaration gets a tag field in the hash stream. It is
+      // hashed to add additional randomness to the hash
+      addData(llvm::hash_value(S->getKind()));
----------------
rtrieu wrote:
> Why is this randomness needed?
It is there for my personal feeling of less hash collisions. In former versions of this patch there was a large random constant for every type of AST node to decrease the likelyness of collisions when we left out some information. Small numbers (like enum values) often occur in source code. A 32 Bit Pseudo-Random value is unlikely to be seen in real source code. However, if you dislike its, I can remove the llvm::hash_value here.


================
Comment at: include/clang/AST/DeclDataCollectors.td:9
+
+      // CrossRef
+      addData(S->hasAttrs());
----------------
rtrieu wrote:
> What do you mean by "CrossRef"?
CrossRef indicates that the following fields are not local to the node. However, we give them to the user of the DataCollector to give it a chance to include the number of children into its hash. This directly relates to the addData(ArrayRef<...>) question above.


Repository:
  rC Clang

https://reviews.llvm.org/D40731





More information about the cfe-commits mailing list