[llvm-dev] RFC: Strong typedef for LLVM

Thu Aug 1 08:57:40 PDT 2019

Lately I've been using some utilities to increase the number of logic
errors caught at compile time.  I thought they might be useful to the
LLVM project.  I'd appreciate feedback on the below proposal.  Would the
community find these useful?

                          -David

RFC: Strong typedef utilities for LLVM
--------------------------------------

Abstract
--------
This proposal describes a set of utility classes for increasing type
safety within the LLVM project codebase.  It introduces a general
"strong typedef" template along with more specialized templates for
Boolean values, accumulators and counters.

The interface and implementation are inspired by Jonathan Mueller's
post on strong typedef [1].

The Problem
-----------
As noted on a side-thread of the variable naming convention change,
LLVM has a number of places where function parameter semantics are
unclear at the point of use [2].  To compensate, much code exists
in this style:

  foo(arg1, /* SomeFlagName = */true);

Functions with, for example, two boolean parameters are often called
as:

  foo(arg1, /* Flag1Name = */true, /* Flag2Name =*/false);

Unfortunately, it is easy to mix this up unintentionally:

  foo(arg1, /* Flag1Name = */false, /* Flag2Name =*/true);

The above may or may not be a bug.  The intent is unclear from the
context.

Things are even worse without comments:

  foo(arg1, true, false);  // What does this mean?

  bool SomeFlag = true;
  bool AnotherFlag = false;

  foo(arg1, AnotherFlag, SomeFlag);  // Is this correct?

  bool DoThing = bar();
  bool EnableFeature = baz();

  foo(arg1, EnableFeature, DoThing);  // Still not clear.

  bar(size, alignment);  // Did I get that in the right order?
                         // Is size in bits or bytes?

A Possible Solution
-------------------
It would be nice if the Flag1Name and Flag2Name comments had semantic
meaning, such that they could be checked for correctness at compile
time.  It would be even better if such semantic information were
*required* at the point of use.  One could construct special classes
to make it so:

  class Flag1Value {
    bool Value;

  public:
    Flag1Value() : Value(false) {}
    explicit Flag1Value(bool V) : Value(V) {}

    explicit operator bool() const { return Value; }
  };

  class Flag2Value {
    bool Value;

  public:
    Flag2Value() : Value(false) {}
    explicit Flag2Value(bool V) : Value(V) {}

    explicit operator bool() const { return Value; }
  };

  void foo(int arg, Flag1Value F1, Flag2Value F2);

  foo(arg1, Flag1Value(true), Flag2Value(false));
  foo(arg1, Flag2Value(true), Flag1Value(false));  // Won't compile.
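
As a rough check of the "won't compile" claim, the same properties can
be stated with static_assert.  This is only a sketch: the two classes
are repeated so the snippet stands alone, and the body of foo is made
up purely so that the call can be tested.

```cpp
#include <cassert>
#include <type_traits>

class Flag1Value {
  bool Value;

public:
  Flag1Value() : Value(false) {}
  explicit Flag1Value(bool V) : Value(V) {}

  explicit operator bool() const { return Value; }
};

class Flag2Value {
  bool Value;

public:
  Flag2Value() : Value(false) {}
  explicit Flag2Value(bool V) : Value(V) {}

  explicit operator bool() const { return Value; }
};

// A bare bool must be wrapped by name; it never converts implicitly.
static_assert(!std::is_convertible<bool, Flag1Value>::value,
              "bool must not convert implicitly to Flag1Value");
static_assert(std::is_constructible<Flag1Value, bool>::value,
              "explicit construction from bool is allowed");

// The two flag types are unrelated, so swapped arguments are rejected.
static_assert(!std::is_convertible<Flag2Value, Flag1Value>::value,
              "Flag2Value must not convert to Flag1Value");

// Hypothetical body, present only so the result can be checked.
inline int foo(int Arg, Flag1Value F1, Flag2Value F2) {
  return Arg + (F1 ? 10 : 0) + (F2 ? 100 : 0);
}

inline void example() {
  assert(foo(1, Flag1Value(true), Flag2Value(false)) == 11);
  // foo(1, Flag2Value(true), Flag1Value(false));  // Would not compile.
  // foo(1, true, false);                          // Would not compile.
}
```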

Of course, defining individual classes for every kind of flag does not
scale.  However, a template class can make it workable:

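  #include <type_traits>  // std::is_nothrow_move_constructible
  #include <utility>      // std::move, std::swap

  template<typename Tag, typename T>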
  class StrongTypedef {
  public:
    using BaseType = T;

  private:
    BaseType Value;

  public:
    constexpr StrongTypedef() : Value() {}
    constexpr StrongTypedef(const StrongTypedef<Tag, T> &) = default;
    explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}
    explicit StrongTypedef(BaseType &&V)
        noexcept(std::is_nothrow_move_constructible<BaseType>::value) :
        Value(std::move(V)) {}

  public:
    explicit operator BaseType&() noexcept {
      return Value;
    }

    explicit operator const BaseType&() const noexcept {
      return Value;
    }

    friend void swap(StrongTypedef &A, StrongTypedef &B) noexcept {
      using std::swap;
      swap(static_cast<BaseType&>(A), static_cast<BaseType&>(B));
    }
  };

  class Flag1Value : public StrongTypedef<Flag1Value, bool> {
  public:
    using StrongTypedef::StrongTypedef;
  };

  class Flag2Value : public StrongTypedef<Flag2Value, bool> {
  public:
    using StrongTypedef::StrongTypedef;
  };

Note that Tag does not have to be a derived class and StrongTypedef
does not have to be used in a CRTP manner, though it is often
convenient to do so.
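
For instance, a non-CRTP use might look like the sketch below, where
empty tag structs and alias declarations take the place of derived
classes.  The names VerboseTag, DryRunTag, Verbose, DryRun and run are
hypothetical, and StrongTypedef is reduced to the parts the sketch
needs.

```cpp
#include <cassert>
#include <type_traits>

// Reduced copy of StrongTypedef (no move constructor, no swap).
template <typename Tag, typename T>
class StrongTypedef {
public:
  using BaseType = T;

private:
  BaseType Value;

public:
  constexpr StrongTypedef() : Value() {}
  explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}

  explicit operator BaseType &() noexcept { return Value; }
  explicit operator const BaseType &() const noexcept { return Value; }
};

// Hypothetical empty tags; they only make the instantiations distinct.
struct VerboseTag {};
struct DryRunTag {};

using Verbose = StrongTypedef<VerboseTag, bool>;
using DryRun = StrongTypedef<DryRunTag, bool>;

static_assert(!std::is_same<Verbose, DryRun>::value,
              "different tags give different types");
static_assert(!std::is_convertible<Verbose, DryRun>::value,
              "one flag type does not convert to the other");

// Hypothetical function; encodes the flags so the call can be checked.
inline int run(Verbose V, DryRun D) {
  return (static_cast<const bool &>(V) ? 1 : 0) +
         (static_cast<const bool &>(D) ? 2 : 0);
}

inline void example() {
  assert(run(Verbose(true), DryRun(false)) == 1);
  // run(DryRun(false), Verbose(true));  // Would not compile.
}
```

The alias form is shorter, but the type name that shows up in
diagnostics is the full StrongTypedef<...> spelling rather than a
class name.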

The StrongTypedef template doesn't provide any operations outside of
explicit conversion to/from the base type.  Mixin classes can imbue it
with new interfaces:

  // Mixin to add equality and inequality comparison.
  template<typename ST>
  struct HasEqualityCompare {
    friend bool operator==(const ST &LHS, const ST &RHS) {
      using BaseType = typename ST::BaseType;

      return (static_cast<const BaseType &>(LHS) ==
              static_cast<const BaseType &>(RHS));
    }

    friend bool operator!=(const ST &LHS, const ST &RHS) {
      return !(LHS == RHS);
    }
  };

  class Flag1Value : public StrongTypedef<Flag1Value, bool>,
                     public HasEqualityCompare<Flag1Value> {
  public:
    using StrongTypedef::StrongTypedef;
  };

  Flag1Value True(true);
  Flag1Value False(false);

  assert(True != False);
  assert(True == True);

StrongTypedef can be used to create rudimentary units types:

  class Bits : public StrongTypedef<Bits, int64_t> {
  public:
    using StrongTypedef::StrongTypedef;
  };

  class Bytes : public StrongTypedef<Bytes, int64_t> {
  public:
    using StrongTypedef::StrongTypedef;
  };

  Bits size(32);
  Bytes alignment(4);

  void bar(Bytes size, Bytes alignment);

  bar(size, alignment);  // Won't compile: size is Bits, not Bytes.
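
Moving between units then has to be spelled out.  One way to do that is
a named conversion function, sketched below.  The helper toBits is
hypothetical and assumes 8-bit bytes; StrongTypedef is again reduced to
the parts the sketch needs.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Reduced copy of StrongTypedef (no move constructor, no swap).
template <typename Tag, typename T>
class StrongTypedef {
public:
  using BaseType = T;

private:
  BaseType Value;

public:
  constexpr StrongTypedef() : Value() {}
  explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}

  explicit operator BaseType &() noexcept { return Value; }
  explicit operator const BaseType &() const noexcept { return Value; }
};

class Bits : public StrongTypedef<Bits, std::int64_t> {
public:
  using StrongTypedef::StrongTypedef;
};

class Bytes : public StrongTypedef<Bytes, std::int64_t> {
public:
  using StrongTypedef::StrongTypedef;
};

static_assert(!std::is_convertible<Bits, Bytes>::value,
              "Bits must not silently become Bytes");

// Hypothetical helper: the only sanctioned way from Bytes to Bits here.
// Assumes 8 bits per byte.
inline Bits toBits(Bytes B) {
  return Bits(static_cast<const std::int64_t &>(B) * 8);
}

inline void example() {
  Bytes Alignment(4);
  Bits AlignmentInBits = toBits(Alignment);
  assert(static_cast<const std::int64_t &>(AlignmentInBits) == 32);
}
```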

Boolean parameters are so common that it's convenient to have a
special helper for them:

  template<typename Tag>
  class NamedBoolean : public StrongTypedef<Tag, bool>,
                       public HasEqualityCompare<NamedBoolean<Tag>>,
                       public HasLogicalConjunction<NamedBoolean<Tag>>,
                       public HasLogicalDisjunction<NamedBoolean<Tag>> {
  public:
    using StrongTypedef<Tag, bool>::StrongTypedef;
  };

  class Flag1Value : public NamedBoolean<Flag1Value> {
  public:
    using NamedBoolean::NamedBoolean;
  };
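
HasLogicalConjunction and HasLogicalDisjunction are not spelled out in
this proposal.  A minimal sketch, assuming they follow the same pattern
as HasEqualityCompare and return the strong type rather than a plain
bool, might look like this.  The helpers bothSet and eitherSet are
hypothetical.

```cpp
#include <cassert>

// Reduced copy of StrongTypedef (no move constructor, no swap).
template <typename Tag, typename T>
class StrongTypedef {
public:
  using BaseType = T;

private:
  BaseType Value;

public:
  constexpr StrongTypedef() : Value() {}
  explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}

  explicit operator BaseType &() noexcept { return Value; }
  explicit operator const BaseType &() const noexcept { return Value; }
};

// Assumed shape of the mixin.  Unlike the built-in operator, an
// overloaded && always evaluates both operands.
template <typename ST>
struct HasLogicalConjunction {
  friend ST operator&&(const ST &LHS, const ST &RHS) {
    using BaseType = typename ST::BaseType;
    return ST(static_cast<const BaseType &>(LHS) &&
              static_cast<const BaseType &>(RHS));
  }
};

// Assumed shape of the mixin; the same caveat applies to ||.
template <typename ST>
struct HasLogicalDisjunction {
  friend ST operator||(const ST &LHS, const ST &RHS) {
    using BaseType = typename ST::BaseType;
    return ST(static_cast<const BaseType &>(LHS) ||
              static_cast<const BaseType &>(RHS));
  }
};

template <typename Tag>
class NamedBoolean : public StrongTypedef<Tag, bool>,
                     public HasLogicalConjunction<NamedBoolean<Tag>>,
                     public HasLogicalDisjunction<NamedBoolean<Tag>> {
public:
  using StrongTypedef<Tag, bool>::StrongTypedef;
};

class Flag1Value : public NamedBoolean<Flag1Value> {
public:
  using NamedBoolean::NamedBoolean;
};

// Hypothetical helpers.  The operators yield NamedBoolean<Flag1Value>,
// which is unwrapped explicitly at the end.
inline bool bothSet(Flag1Value A, Flag1Value B) {
  auto R = A && B;
  return static_cast<const bool &>(R);
}

inline bool eitherSet(Flag1Value A, Flag1Value B) {
  auto R = A || B;
  return static_cast<const bool &>(R);
}

inline void example() {
  Flag1Value True(true);
  Flag1Value False(false);
  assert(!bothSet(True, False));
  assert(eitherSet(True, False));
}
```

Because the mixins are parameterized on NamedBoolean<Tag>, the result of
&& and || in this sketch is the NamedBoolean base, not the derived
Flag1Value.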

It can also be useful to have strongly typed counters and
accumulators:

  template<typename Tag, typename Int = int64_t>
  class NamedCounter : public StrongTypedef<Tag, Int>,
                       public HasEqualityCompare<NamedCounter<Tag, Int>>,
                       public HasIncrement<NamedCounter<Tag, Int>> {  // ++
  public:
    using StrongTypedef<Tag, Int>::StrongTypedef;
  };

  template<typename Tag, typename Int = int64_t>
  class NamedAccumulator : public StrongTypedef<Tag, Int>,
                           public HasEqualityCompare<NamedAccumulator<Tag, Int>>,
                           public HasAccumulate<NamedAccumulator<Tag, Int>> {  // +=
  public:
    using StrongTypedef<Tag, Int>::StrongTypedef;
  };
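
HasIncrement and HasAccumulate are likewise not defined in this
proposal.  The sketch below assumes pre-increment only and a += whose
right-hand side is the same strong type; both are guesses at a
reasonable interface.  InstCount, ByteTotal and the two helper
functions are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Reduced copy of StrongTypedef (no move constructor, no swap).
template <typename Tag, typename T>
class StrongTypedef {
public:
  using BaseType = T;

private:
  BaseType Value;

public:
  constexpr StrongTypedef() : Value() {}
  explicit constexpr StrongTypedef(const BaseType &V) : Value(V) {}

  explicit operator BaseType &() noexcept { return Value; }
  explicit operator const BaseType &() const noexcept { return Value; }
};

// Assumed shape of the mixin: pre-increment only.
template <typename ST>
struct HasIncrement {
  friend ST &operator++(ST &V) {
    using BaseType = typename ST::BaseType;
    ++static_cast<BaseType &>(V);
    return V;
  }
};

// Assumed shape of the mixin: both operands are the same strong type.
template <typename ST>
struct HasAccumulate {
  friend ST &operator+=(ST &LHS, const ST &RHS) {
    using BaseType = typename ST::BaseType;
    static_cast<BaseType &>(LHS) += static_cast<const BaseType &>(RHS);
    return LHS;
  }
};

template <typename Tag, typename Int = std::int64_t>
class NamedCounter : public StrongTypedef<Tag, Int>,
                     public HasIncrement<NamedCounter<Tag, Int>> {
public:
  using StrongTypedef<Tag, Int>::StrongTypedef;
};

template <typename Tag, typename Int = std::int64_t>
class NamedAccumulator : public StrongTypedef<Tag, Int>,
                         public HasAccumulate<NamedAccumulator<Tag, Int>> {
public:
  using StrongTypedef<Tag, Int>::StrongTypedef;
};

class InstCount : public NamedCounter<InstCount> {
public:
  using NamedCounter::NamedCounter;
};

class ByteTotal : public NamedAccumulator<ByteTotal> {
public:
  using NamedAccumulator::NamedAccumulator;
};

// Hypothetical helper: counts up twice from the default value of zero.
inline std::int64_t countTwice() {
  InstCount C;
  ++C;
  ++C;
  return static_cast<const std::int64_t &>(C);
}

// Hypothetical helper: adds one ByteTotal to another.
inline std::int64_t addTotals(std::int64_t A, std::int64_t B) {
  ByteTotal Total(A);
  Total += ByteTotal(B);
  // Total += B;  // Would not compile: B is not a ByteTotal.
  return static_cast<const std::int64_t &>(Total);
}
```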

Proposal
--------
This RFC proposes nothing more than the introduction of these templates
under include/llvm/ADT, making them available to developers who want
extra safety in their code.  It specifically does not propose rewriting
existing uses of flags, counters, accumulators or any other values to
use these utilities, nor does it mandate their use in new code.  Such
changes may be desirable but should happen incrementally over time.

References
----------
[1] https://foonathan.net/blog/2016/10/19/strong-typedefs.html
[2] http://llvm.1065342.n5.nabble.com/llvm-dev-RFC-changing-variable-naming-rules-in-LLVM-code