[PATCH] D134235: [clang-doc] Clean up *Info constructors.

Brett Wilson via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 21 10:55:56 PDT 2022


brettw marked an inline comment as done.
brettw added a comment.

First, for the mechanical part of the change, this code implement multiple constructors:

  Info() = default;
  Info(InfoType IT) : IT(IT) {}
  Info(InfoType IT, SymbolID USR) : USR(USR), IT(IT) {}
  Info(InfoType IT, SymbolID USR, StringRef Name)
      : USR(USR), IT(IT), Name(Name) {}
  Info(InfoType IT, SymbolID USR, StringRef Name, StringRef Path)

is not common practice. This is an extremely verbose and error-prone way of implementing default arguments. It seems obvious to me that the one constructor that implements the exact same behavior is much more clear. Most of this patch is doing this mechanical change for these objects.

There are a few cases in this patch where I removed some constructors that took some "other" subset of parameters.

  Reference() = default;
  Reference(llvm::StringRef Name) : Name(Name) {}
  Reference(llvm::StringRef Name, StringRef Path)
      : Name(Name), Path(Path), IsInGlobalNamespace(Path.empty()) {}
  Reference(SymbolID USR, StringRef Name, InfoType IT)
      : USR(USR), Name(Name), RefType(IT) {}
  Reference(SymbolID USR, StringRef Name, InfoType IT, StringRef Path)
      : USR(USR), Name(Name), RefType(IT), Path(Path),
        IsInGlobalNamespace(Path.empty()) {}

Working on this code, I found the number of constructors with different parameters to make things more difficult to follow. It's harder to figure out what exactly a line of code is doing when there are so many variants you have to match against. It's also more difficult to figure out how to create one of these objects when you have to read so many variants to see what to call. This mental overhead may not be obvious when just looking at a diff, especially the test code where you may see an extra empty SID parameter inserted here and there. This new one tries to have a single constructor with default arguments from the right, with the exception of the "TypeInfo" that creates a single named type with no SymbolID because that's by far the most common variant in tests.

With this Reference constructor in particular, all of these variants hide a bug. If I didn't call that bug out in the change description, would you have seen it? Having a standard constructor makes this much easier to follow. Changing several callers to manually specify an empty SymbolID() seems like a small price to pay (I think this is the only case where any calls get "more complicated."

I tried to standardize on a single way to construct these objects. As an example, the TypeInfo is just a wrapper around a Reference. Reference has many constructors, not all of which are a subset of the others. Does this mean TypeInfo should get these same 5 constructor variants? The author said no, they picked 3 of the variants and implemented TypeInfo constructors, and used different names for some of the parameters which makes it even harder to follow. Continuing down the chain, FieldTypeInfo derives from TypeInfo, and MemberTypeInfo derives from FieldTypeInfo. Does this mean they should get the same 5 variants, plus initializers for their own values? I would say no. If these derived classes needed to specify the Reference information, they should take a Reference as the first parameter (they don't all need to do this today).

There's a problem with the current object model. The FieldTypeInfo and MemberTypeInfo derive from TypeInfo. But I think that this is wrong. The FieldTypeInfo isn't the type information for a field, it's really a "Field" and a MemberTypeInfo isn't the type info for a member, it's the "Member" itself. Conceptually, FieldTypeInfo should be a FieldInfo and have a struct `TypeInfo Type;` rather than using an "is a" relationship and deriving from a type. Given that, this object naming and constructor "should" be:

  MemberInfo(TypeInfo Type, StringRef Name, AccessSpecifier Access = Public)

This patch gets us closer to this, where we still have the weird "is a" derivation but the constructors match what I think they should be:

  MemberTypeInfo() = default;
  MemberTypeInfo(const TypeInfo &TI, StringRef Name, AccessSpecifier Access)

The parameters that define the implementation details of how the type is represented don't belong on this class: if the caller wants to create a type, they should create it using one of the appropriate TypeInfo constructors rather than each variant being duplicated here.



================
Comment at: clang-tools-extra/clang-doc/Representation.h:165
-      : Type(Type, Field, IT) {}
-  TypeInfo(SymbolID Type, StringRef Field, InfoType IT, StringRef Path)
-      : Type(Type, Field, IT, Path) {}
----------------
haowei wrote:
> The clean up here prevents setting a customized SymbolID for TypeInfo, which becomes a problem in the HTMLGeneratorTest, in which TypeInfo is constructed with an constant EmptySID. While the outcome is the same, it is not semantically equivalent.  
It does not prevent setting a custom SymbolID. The full constructor is the one above that takes a Reference. That is the one that's normally used in the production code; I added it in my previous patch. This patch is a followup from that one because I wanted to remove some now-redundant constructors but thought that I would separate it out to make it easier to review. The minority of cases in HTMLGeneratorTest that needed to pass more than the simple name/path have been updated to take a Reference.

The only code that changed in a way that you described is on line 373. But I'm not sure what you're arguing about "semantically equivalent". In this case, the caller wanted a "void" TypeInfo but tryped a bunch of unnecessary stuff. I think it's likely that the author of this code was confused about the constructor situation for TypeInfo and used a more complex variant than was required. It illustrates exactly why this cleanup is useful.


================
Comment at: clang-tools-extra/clang-doc/Representation.h:219
 
-  int LineNumber;               // Line number of this Location.
+  int LineNumber = 0;           // Line number of this Location.
   SmallString<32> Filename;     // File for this Location.
----------------
haowei wrote:
> This is already set in the constructor. Does it still need to be set explicitly here? 
It does not, but I have learned to always supply default initializers for POD types when there's a trivial initial value. It's easy for somebody to later come by and add a constructor in such a way that the value is not uninitialized (you seem to be arguing above that you would prefer some of the old constructors, and making those additions could easily cause this problem). This has no runtime overhead and provides some insurance against hard-to-diagnose uninitialized data bugs in the future.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134235/new/

https://reviews.llvm.org/D134235



More information about the cfe-commits mailing list