[PATCH] Devirtualize llvm::Value and all subclasses

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Jun 19 15:53:55 PDT 2015


> On 2015-Jun-16, at 16:46, Pete Cooper <peter_cooper at apple.com> wrote:
> 
> Hi Nick
> 
> As you were the last person to attempt this on PR889, I thought i’d see how you like this series of patches.
> 
> I’ve tried to devirtualize a small number of virtual methods at a time, hence the bunch of patches.  Unfortunately the last one is pretty huge as it handles all the places we used to call delete on Value’s and now need to call a destroy method to dispatch to the correct subclass.  I’m not sure there’s a great way to split the last one up.
> 
> David, i’ve CCed you specifically to ask about how safe some of the changes here are.  In particular, the way i’ve changed things is:
> - If a subclass is final (and i’ve added final where I can), then I give it its own delete method which just does ‘using User::operator delete’.  Is this reasonable?  Do i need to handle any other kinds of delete like User does?
> - If a subclass is not able to be final, then i’ve added a destroy() method and switch on the value kind to then call ‘delete cast<SubClass>(this)’
> 
> Thanks to Duncan for the tip on std::default_delete, i’ve made use of this on various places which had a std::unique_ptr of say Instruction or Value.  The default deleter for those calls destroy() instead of delete.  Deleter’s aren’t needed for final subclasses because I was able to give them a delete method instead.
> 
> I also extended helpers like DeleteContainerPointers to take a std::default_delete so that we can forward to the destroy methods where needed.
> 
> I measured memory usage on the verify_uselist_order test case Duncan has been using.  This contains debug info and is measured with 'llc ~/verify-uselistorder.lto.opt.bc -o /dev/null --filetype=obj -disable-verify’.  Prior to this change we consumed 814MB, and 800MB after.  So almost 2% savings.  This is in line with what I was expecting as the savings is 8 bytes per Value, but due to malloc alignment on Mac OS, we only save that on an average of half of values.
> 
> Finally, this is the order I did all the work locally.  If any of this makes sense to be squashed or split, please let me know.  In the first patch for example I create Constant.def, but I later change that to Value.def.  This one should obviously be fixed to be Value.def from the start.
> 
> Cheers,
> Pete
> 
> <0001-Create-Constant.def-helper.patch>

> From 34e09566150911ea897221d24e04be9236b42b3c Mon Sep 17 00:00:00 2001
> From: Peter Cooper <peter_cooper at apple.com>
> Date: Mon, 15 Jun 2015 12:55:15 -0700
> Subject: [PATCH 01/10] Create Constant.def helper
> 
> ---
>  include/llvm/IR/Constant.def | 48 ++++++++++++++++++++++++++++++++++++++++++++
>  include/llvm/IR/Value.h      | 22 +++++---------------
>  2 files changed, 53 insertions(+), 17 deletions(-)
>  create mode 100644 include/llvm/IR/Constant.def
> 
> diff --git a/include/llvm/IR/Constant.def b/include/llvm/IR/Constant.def
> new file mode 100644
> index 0000000..336d9df
> --- /dev/null
> +++ b/include/llvm/IR/Constant.def
> @@ -0,0 +1,48 @@
> +//===-------- llvm/Constant.def - File that describes Constants -*- C++ -*-===//

Wrong file name here.

> +// 
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +// 
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains descriptions of the various LLVM constants.  This is
> +// used as a central place for enumerating the different constants.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +// NOTE: NO INCLUDE GUARD DESIRED!
> +

Please add an idiot guard (with #error checks) against defining the
wrong macro.  I.e.:

    #define HANDEL_CONSTANT(NAME) doSomethingWith ## NAME();
    #include "llvm/IR/Constant.def"

should give an error.  (You can copy/paste/hack from Metadata.def.)

> +// Provide definitions of macros so that users of this file do not have to 
> +// define everything to use it...
> +//
> +#ifndef HANDLE_CONSTANT
> +#define HANDLE_CONSTANT(a)

A good parameter name would double as documentation for how to write the
macros.

> +#endif
> +
> +#ifndef HANDLE_CONSTANT_MARKER
> +#define HANDLE_CONSTANT_MARKER(a, b)

Same here.

> +#endif
> +
> +HANDLE_CONSTANT  ( Function )

I don't think it's worth maintaining special spacing (and since a space
between a macro and the `(` can change how it's evaluated, it's strange
to have it when you don't care about the difference).  I'd just
clang-format this and be done with it.

> +HANDLE_CONSTANT  ( GlobalAlias )
> +HANDLE_CONSTANT  ( GlobalVariable )
> +HANDLE_CONSTANT  ( UndefValue )
> +HANDLE_CONSTANT  ( BlockAddress )
> +HANDLE_CONSTANT  ( ConstantExpr )
> +HANDLE_CONSTANT  ( ConstantAggregateZero )
> +HANDLE_CONSTANT  ( ConstantDataArray )
> +HANDLE_CONSTANT  ( ConstantDataVector )
> +HANDLE_CONSTANT  ( ConstantInt )
> +HANDLE_CONSTANT  ( ConstantFP )
> +HANDLE_CONSTANT  ( ConstantArray )
> +HANDLE_CONSTANT  ( ConstantStruct )
> +HANDLE_CONSTANT  ( ConstantVector )
> +HANDLE_CONSTANT  ( ConstantPointerNull )
> +
> +HANDLE_CONSTANT_MARKER( ConstantFirstVal, Function )
> +HANDLE_CONSTANT_MARKER( ConstantLastVal, ConstantPointerNull )

Same here.

> +
> +#undef HANDLE_CONSTANT
> +#undef HANDLE_CONSTANT_MARKER
> diff --git a/include/llvm/IR/Value.h b/include/llvm/IR/Value.h
> index b175e0e..fc266f9 100644
> --- a/include/llvm/IR/Value.h
> +++ b/include/llvm/IR/Value.h
> @@ -335,21 +335,9 @@ public:
>    enum ValueTy {
>      ArgumentVal,              // This is an instance of Argument
>      BasicBlockVal,            // This is an instance of BasicBlock

Slightly strange not to include these two as well and make it
`Value.def`.  Any reason why not?

> -    FunctionVal,              // This is an instance of Function
> -    GlobalAliasVal,           // This is an instance of GlobalAlias
> -    GlobalVariableVal,        // This is an instance of GlobalVariable
> -    UndefValueVal,            // This is an instance of UndefValue
> -    BlockAddressVal,          // This is an instance of BlockAddress
> -    ConstantExprVal,          // This is an instance of ConstantExpr
> -    ConstantAggregateZeroVal, // This is an instance of ConstantAggregateZero
> -    ConstantDataArrayVal,     // This is an instance of ConstantDataArray
> -    ConstantDataVectorVal,    // This is an instance of ConstantDataVector
> -    ConstantIntVal,           // This is an instance of ConstantInt
> -    ConstantFPVal,            // This is an instance of ConstantFP
> -    ConstantArrayVal,         // This is an instance of ConstantArray
> -    ConstantStructVal,        // This is an instance of ConstantStruct
> -    ConstantVectorVal,        // This is an instance of ConstantVector
> -    ConstantPointerNullVal,   // This is an instance of ConstantPointerNull
> +
> +#define HANDLE_CONSTANT(Name) Name##Val,
> +#include "llvm/IR/Constant.def"
>      MetadataAsValueVal,       // This is an instance of MetadataAsValue
>      InlineAsmVal,             // This is an instance of InlineAsm
>      InstructionVal,           // This is an instance of Instruction

Same for these three.

> @@ -357,8 +345,8 @@ public:
>      // don't add new values here!
>  
>      // Markers:
> -    ConstantFirstVal = FunctionVal,
> -    ConstantLastVal  = ConstantPointerNullVal
> +#define HANDLE_CONSTANT_MARKER(Marker, Constant) Marker = Constant##Val,
> +#include "llvm/IR/Constant.def"
>    };
>  
>    /// \brief Return an ID for the concrete type of this object.
> -- 
> 2.3.4 (Apple Git-56)
> 
> 


> <0002-Devirt-Constant-destroyConstant.patch>

> From 86f9c1411a24efb2c6c805af0612417fe363236b Mon Sep 17 00:00:00 2001
> From: Peter Cooper <peter_cooper at apple.com>
> Date: Mon, 15 Jun 2015 13:04:29 -0700
> Subject: [PATCH 02/10] Devirt Constant::destroyConstant
> 
> ---
>  include/llvm/IR/Constant.h    |  2 +-
>  include/llvm/IR/Constants.h   | 18 +++++++++---------
>  include/llvm/IR/GlobalValue.h |  2 +-
>  lib/IR/Constants.cpp          | 11 +++++++++++
>  4 files changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/include/llvm/IR/Constant.h b/include/llvm/IR/Constant.h
> index 75499e0..b35d16c 100644
> --- a/include/llvm/IR/Constant.h
> +++ b/include/llvm/IR/Constant.h
> @@ -133,7 +133,7 @@ public:
>    /// available cached constants.  Implementations should call
>    /// destroyConstantImpl as the last thing they do, to destroy all users and
>    /// delete this.
> -  virtual void destroyConstant() { llvm_unreachable("Not reached!"); }
> +  void destroyConstant();
>  
>    //// Methods for support type inquiry through isa, cast, and dyn_cast:
>    static inline bool classof(const Value *V) {
> diff --git a/include/llvm/IR/GlobalValue.h b/include/llvm/IR/GlobalValue.h
> index 21471c7..2101ca4 100644
> --- a/include/llvm/IR/GlobalValue.h
> +++ b/include/llvm/IR/GlobalValue.h
> @@ -335,7 +335,7 @@ public:
>  /// @}
>  
>    /// Override from Constant class.
> -  void destroyConstant() override;
> +  void destroyConstant();
>  
>    /// Return true if the primary definition of this global value is outside of
>    /// the current translation unit.
> diff --git a/lib/IR/Constants.cpp b/lib/IR/Constants.cpp
> index 76c55b6..ae1ca1b 100644
> --- a/lib/IR/Constants.cpp
> +++ b/lib/IR/Constants.cpp
> @@ -1432,6 +1432,17 @@ const APInt &Constant::getUniqueInteger() const {
>    return cast<ConstantInt>(C)->getValue();
>  }
>  
> +void Constant::destroyConstant() {
> +  switch (getValueID()) {
> +    default:
> +      llvm_unreachable("Not a constant!");
> +#define HANDLE_CONSTANT(Name) \
> +    case Value::Name##Val: \

clang-format?

> +      return cast<Name>(this)->destroyConstant();

I think this is a dangerous pattern.  If someone forgets to implement
this in a subclass, there won't be a compile error, and unlike the
`llvm_unreachable()` we had before we get an infinite recursion.

I've tended to choose a different name for the subclass method,
essentially using this noisier pattern:

    class Base {
    public:
      void foo();
    };

    class Derived : public Base{
      friend class Base; // For fooImpl().
      void fooImpl();

    public:
      // This avoids the indirection through Base::foo().
      void foo() { fooImpl(); }
    };

    void Base::foo() {
      // dispatch to subclass.
      switch (...) {
      default:
        llvm_unreachable(...);
      // Compile error if there's an unhandled case instead of
      // infinite recursion.
    #define HANDLE_DERIVED(NAME)                        \
      case NAME ## Kind:                                \
        return cast<NAME>(this)->fooImpl();
    }

However, I realize there's already something called
`Constant::destroyConstantImpl()`, which has a `delete this` inside it
(ironic, given that the context for this patch is removing the vtable
that the `delete` call relies on).

I haven't looked at all the patches in this series yet, but I feel like
there ought to be some way of clarifying `Constant` destruction
immediately.  My shot-from-the-hip is something like the following:

    class Constant {
    public:
      void destroyConstant();
    };

    class SomeConstant : public Constant {
      friend class Base; // For fooImpl().

      /// Destroy and delete the constant.
      void destroyConstantImpl();
      ~SomeConstant();

      // Don't provide destroyConstant().
    };

    void Constant::destroyConstant() {
      // Remove lingering references from the constant pool (move from
      // old `Constant::destroyConstantImpl()`).
      while (!use_empty()) {
        // ...
      }

      // Dispatch to subclass to cleanup and delete.
      switch (...) {
      default:
        llvm_unreachable(...);
      // Compile error if there's an unhandled case instead of
      // infinite recursion.
    #define HANDLE_CONSTANT(NAME)                       \
      case NAME ## Kind:                                \
        cast<NAME>(this)->destroyConstantImpl();        \
        break;
      }

      // When we drop virtual dispatch for the destructor, move the
      // delete call inside the switch statement above.
      delete this;
    }

    void SomeConstant::destroyConstantImpl() {
      assert(use_empty() && ...);
      getContext()->SomeConstantPool.erase(this);
    }

This inverts the destroyConstant/Impl relationship.

Maybe this leaves out some case, or doesn't quite fit with the end goal
(you've thought about this more than I have).  My main point is, with
static dispatch we can easily catch the "missing destroyConstant()
implementation" at compile-time, and we should.

> +#include "llvm/IR/Constant.def"
> +  }
> +}
> +
>  
>  //---- ConstantPointerNull::get() implementation.
>  //
> -- 
> 2.3.4 (Apple Git-56)
> 
> 

> <0003-Devirt-Constant-replaceUsesOfWithOnConstant.patch><0004-Devirtualize-Instruction-clone_impl.patch><0005-Devirtualize-getSuccessorV-getNumSuccessorsV-and-set.patch><0006-Don-t-cast-Constant-to-Value-as-we-will-soon-have-no.patch><0007-Devirt-GlobalValue-removeFromParent-and-eraseFromPar.patch><0008-Devirt-GlobalValue-copyAttributesFrom.patch><0009-Harden-the-types-of-copyGVAttributes.patch><0010-Initial-patch-to-remove-the-rest-of-virtual-which-me.patch>

(Haven't looked at these yet.)



More information about the llvm-commits mailing list