[RFC] Remove User::OperandList

Duncan P. N. Exon Smith dexonsmith at apple.com
Thu Jun 11 11:02:44 PDT 2015


> On 2015-Jun-10, at 15:44, Pete Cooper <peter_cooper at apple.com> wrote:
> 
> Just pushed 0001-0005 as r239489-r239493.
> 
> Cheers,
> Pete
> 
>> On Jun 10, 2015, at 1:23 PM, Pete Cooper <peter_cooper at apple.com> wrote:
>> 
>> Hi Duncan
>> 
>> Here’s the updated patches.  I didn’t make any changes outside of your feedback to 0001-0005 so they can probably be ignored.  I’m working on using clang-format now then will commit those ones.
>> 
>> I reorganised 0006-0009 to try make them ordered better.
>> 
>> 0006 and 0007 are now just refactoring num operands and get/set operand list.  I added the assert for out of range num operands to 0006.
>> 
>> 0008 was 0006 before.  It adds the new version of operator new, but this time i’ve cleaned it up more as the num uses was 0.
>> 0009 is now last as it was the real change in behavior for this whole series.
>> 
>> Cheers,
>> Pete
>> 
>> <0001-Move-the-special-Phi-logic-for-hung-off-uses-in-to-U.patch>
>> <0002-Make-User-track-whether-a-class-has-hung-off-uses-an.patch>
>> <0003-Delete-User-dropHungOffUses-and-move-it-in-to-User-w.patch>
>> <0004-Add-User-growHungoffUses-and-use-it-to-grow-the-hung.patch>
>> <0005-Stop-returning-a-Use-from-allocHungOffUses.patch>
>> <0006-Replace-all-accesses-to-User-OperandList-with-getter.patch>

LGTM, plus:

> diff --git a/include/llvm/IR/User.h b/include/llvm/IR/User.h
> index 8ec11e5..d5d0187 100644
> --- a/include/llvm/IR/User.h
> +++ b/include/llvm/IR/User.h
> @@ -125,10 +135,14 @@ public:
>    typedef iterator_range<op_iterator> op_range;
>    typedef iterator_range<const_op_iterator> const_op_range;
>  
> -  inline op_iterator       op_begin()       { return OperandList; }
> -  inline const_op_iterator op_begin() const { return OperandList; }
> -  inline op_iterator       op_end()         { return OperandList+NumOperands; }
> -  inline const_op_iterator op_end()   const { return OperandList+NumOperands; }
> +  inline op_iterator       op_begin()       { return getOperandList(); }
> +  inline const_op_iterator op_begin() const { return getOperandList(); }
> +  inline op_iterator       op_end()         {
> +    return getOperandList() + NumOperands;
> +  }
> +  inline const_op_iterator op_end()   const {
> +    return getOperandList() + NumOperands;
> +  }

All these `inline` keywords are redundant and noisy.  Maybe you can
clean them up while you're in there?  (Maybe as a follow-up or prep
commit?)

>> <0007-Rename-NumOperands-to-make-it-clear-its-managed-by-t.patch>

> From 30380812e02221e7693530ebf1a01d527fdcccc0 Mon Sep 17 00:00:00 2001
> From: Peter Cooper <peter_cooper at apple.com>
> Date: Tue, 26 May 2015 14:16:41 -0700
> Subject: [PATCH 7/9] Rename NumOperands to make it clear its managed by the
>  User. NFC.
> 
> This is to try make it very clear that subclasses shouldn't be changing
> the value directly.  Now that OperandList for normal instructions is computed
> using the NumOperands, its critical that the NumOperands is accurate or we
> could compute the wrong offset to the first operand.
> 
> I looked over all places which update NumOperands and they are all safe.
> Hung off use User's don't use NumOperands to compute the OperandList so they
> are safe to continue to manipulate it.  The only other User which changed it
> was GlobalVariable which has an optional init list but always allocated space
> for a single Use.  It was correctly setting NumOperands to 1 before setting an
> initializer, and setting it to 0 after clearing the init list, so the order was safe.
> 
> Added some comments to that code to make sure that this isn't changed in future
> without being aware of this constraint.
> ---
>  include/llvm/IR/GlobalVariable.h |  3 ++-
>  include/llvm/IR/Instructions.h   |  8 ++++----
>  include/llvm/IR/User.h           | 38 ++++++++++++++++++++++++++++----------
>  include/llvm/IR/Value.h          |  7 ++++++-
>  lib/IR/Globals.cpp               | 10 ++++++++--
>  lib/IR/Instructions.cpp          | 39 ++++++++++++++++++++-------------------
>  lib/IR/User.cpp                  |  5 +++--
>  lib/IR/Value.cpp                 |  2 +-
>  8 files changed, 72 insertions(+), 40 deletions(-)
> 
> diff --git a/include/llvm/IR/User.h b/include/llvm/IR/User.h
> index d5d0187..593b46c 100644
> --- a/include/llvm/IR/User.h
> +++ b/include/llvm/IR/User.h
> @@ -53,7 +53,8 @@ protected:
>    User(Type *ty, unsigned vty, Use *OpList, unsigned NumOps)
>        : Value(ty, vty) {
>      setOperandList(OpList);
> -    NumOperands = NumOps;
> +    assert(NumOps < (1 << User::NumUserOperandsBits) && "Too many operands");

This should probably be `1u << NumUserOperandBits`:

  - Makes it more obvious that this isn't undefined behaviour (1 << 31 is
    undefined).
  - I don't think we need the `User::` scope.

> +    NumUserOperands = NumOps;
>    }
>  
>    /// \brief Allocate the array of Uses, followed by a pointer
> @@ -68,11 +69,12 @@ protected:
>  public:
>    ~User() override {
>      // drop the hung off uses.
> -    Use::zap(getOperandList(), getOperandList() + NumOperands, HasHungOffUses);
> +    Use::zap(getOperandList(), getOperandList() + NumUserOperands,
> +             HasHungOffUses);
>      if (HasHungOffUses) {
>        setOperandList(nullptr);
>        // Reset NumOperands so User::operator delete() does the right thing.
> -      NumOperands = 0;
> +      NumUserOperands = 0;
>      }
>    }
>    /// \brief Free memory allocated for User and Use objects.
> @@ -106,26 +108,42 @@ public:
>      return LegacyOperandList;
>    }
>    Value *getOperand(unsigned i) const {
> -    assert(i < NumOperands && "getOperand() out of range!");
> +    assert(i < NumUserOperands && "getOperand() out of range!");
>      return getOperandList()[i];
>    }
>    void setOperand(unsigned i, Value *Val) {
> -    assert(i < NumOperands && "setOperand() out of range!");
> +    assert(i < NumUserOperands && "setOperand() out of range!");
>      assert((!isa<Constant>((const Value*)this) ||
>              isa<GlobalValue>((const Value*)this)) &&
>             "Cannot mutate a constant with setOperand!");
>      getOperandList()[i] = Val;
>    }
>    const Use &getOperandUse(unsigned i) const {
> -    assert(i < NumOperands && "getOperandUse() out of range!");
> +    assert(i < NumUserOperands && "getOperandUse() out of range!");
>      return getOperandList()[i];
>    }
>    Use &getOperandUse(unsigned i) {
> -    assert(i < NumOperands && "getOperandUse() out of range!");
> +    assert(i < NumUserOperands && "getOperandUse() out of range!");
>      return getOperandList()[i];
>    }
>  
> -  unsigned getNumOperands() const { return NumOperands; }
> +  unsigned getNumOperands() const { return NumUserOperands; }
> +
> +  /// \brief HACK: Set the number of operands on a GlobalVariable.  This class
> +  /// always allocates space for a single operand, but doesn't always use it.

Skip the `\brief` keyword, since we're using auto-brief these days.
Also, add a blank doxygen line between the brief comment and the longer
description.

HACK isn't really the right way to call out problems.  I think something
like this is more clear:

    /// Set the number of operands on a GlobalVariable.
    ///
    /// GlobalVariable always allocates space for a single operands, but
    /// doesn't always use it.
    ///
    /// FIXME: <text describing the problem or how to fix it>

On that subject, do you have a plan for how to fix this?  If not, is it
really a problem?

> +  void setGlobalVariableNumOperands(unsigned NumOps) {
> +    assert(NumOps <= 1 && "GlobalVariable can only have 0 or 1 operands");
> +    NumUserOperands = NumOps;
> +  }
> +
> +  /// \brief Subclasses with hung off uses need to manage the operand count
> +  /// themselves.  In these instances, the operand count isn't used to find the
> +  /// OperandList, so there's no issue in having the operand count change.
> +  void setNumHungOffUseOperands(unsigned NumOps) {
> +    assert(HasHungOffUses && "Must have hung off uses to use this method");
> +    assert(NumOps < (1 << NumUserOperandsBits) && "Too many operands");
> +    NumUserOperands = NumOps;
> +  }
>  
>    // ---------------------------------------------------------------------------
>    // Operand Iterator interface...
> @@ -138,10 +156,10 @@ public:
>    inline op_iterator       op_begin()       { return getOperandList(); }
>    inline const_op_iterator op_begin() const { return getOperandList(); }
>    inline op_iterator       op_end()         {
> -    return getOperandList() + NumOperands;
> +    return getOperandList() + NumUserOperands;
>    }
>    inline const_op_iterator op_end()   const {
> -    return getOperandList() + NumOperands;
> +    return getOperandList() + NumUserOperands;
>    }
>    inline op_range operands() {
>      return op_range(op_begin(), op_end());
> diff --git a/include/llvm/IR/Value.h b/include/llvm/IR/Value.h
> index c9c1301..79e9b306 100644
> --- a/include/llvm/IR/Value.h
> +++ b/include/llvm/IR/Value.h
> @@ -101,7 +101,12 @@ protected:
>    /// This is stored here to save space in User on 64-bit hosts.  Since most
>    /// instances of Value have operands, 32-bit hosts aren't significantly
>    /// affected.
> -  unsigned NumOperands : 29;
> +  ///
> +  /// Note, this should *NOT* be used directly by any class other than User.
> +  /// User uses this value to find the Use list.
> +  static const unsigned NumUserOperandsBits = 29;
> +  unsigned NumUserOperands : 29;

Can you use `NumUserOperandsBits` here?

Also, IIRC, C++03 would have required:

    const unsigned User::NumUserOperands;

to prevent linker errors in -O0 builds (at least on some compilers).

Did C++11 drop this requirement?

> +

Extra newline?

>  
>    bool IsUsedByMD : 1;
>    bool HasName : 1;


>> <0008-Added-a-version-of-User-new-for-hung-off-uses.patch>

LGTM with nitpicks (although I assume it's blocked on 0007).

> From 59dd6726a660eb636b7f6cc4fe5e811148a17ca6 Mon Sep 17 00:00:00 2001
> From: Peter Cooper <peter_cooper at apple.com>
> Date: Tue, 26 May 2015 11:11:52 -0700
> Subject: [PATCH 8/9] Added a version of User::new for hung off uses.
> 
> There are now 2 versions of User::new.  The first takes a size_t and is the current
> implementation for subclasses which need 0 or more Use's allocated for their operands.
> 
> The new version takes no extra arguments to say that this subclass needs 'hung off uses'.
> The HungOffUses bool is now set in this version of User::new and we can assert in
> allocHungOffUses that we are allowed to have hung off uses.
> This ensures we call the correct version of User::new for subclasses which need hung off uses.
> 
> A future commit will then allocate space for a single Use* which will be used
> in place of User::OperandList once that field has been removed.
> ---
>  include/llvm/IR/Instructions.h |  8 ++++----
>  include/llvm/IR/User.h         |  8 +++++++-
>  lib/IR/User.cpp                | 12 ++++++++++--
>  3 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/include/llvm/IR/User.h b/include/llvm/IR/User.h
> index 593b46c..32930a5 100644
> --- a/include/llvm/IR/User.h
> +++ b/include/llvm/IR/User.h
> @@ -34,7 +34,6 @@ struct OperandTraits;
>  
>  class User : public Value {
>    User(const User &) = delete;
> -  void *operator new(size_t) = delete;
>    template <unsigned>
>    friend struct HungoffOperandTraits;
>    virtual void anchor();
> @@ -48,6 +47,13 @@ protected:
>    Use *LegacyOperandList;
>  
>  protected:
> +  /// \brief Allocate a User with an operand pointer co-allocated.
> +  /// This is used for subclasses which need to allocate a variable number
> +  /// of operands, ie, 'hung off uses'.

Current style guidelines don't use `\brief`, since we're using
auto-brief.  Please remove it.

Also, a newline between the brief description and the longer one is
better, something like:

    /// Allocate a User with an operand pointer co-allocated.
    ///
    /// Allocate a User for subclasses which need to allocate a variable
    /// number of operands, ie, 'hung off uses'.

> +  void *operator new(size_t);

Please add a name for the parameter (I suggest `Size`).

> +
> +  /// \brief Allocate a User with the operands co-allocated.
> +  /// This is used for subclasses which have a fixed number of operands.
>    void *operator new(size_t s, unsigned Us);
>  
>    User(Type *ty, unsigned vty, Use *OpList, unsigned NumOps)
> diff --git a/lib/IR/User.cpp b/lib/IR/User.cpp
> index 7d43fa8..c755f77 100644
> --- a/lib/IR/User.cpp
> +++ b/lib/IR/User.cpp
> @@ -41,6 +41,7 @@ void User::replaceUsesOfWith(Value *From, Value *To) {
>  //===----------------------------------------------------------------------===//
>  
>  void User::allocHungoffUses(unsigned N, bool IsPhi) {
> +  assert(HasHungOffUses && "alloc must have hung off uses");
>    // Allocate the array of Uses, followed by a pointer (with bottom bit set) to
>    // the User.
>    size_t size = N * sizeof(Use) + sizeof(Use::UserRef);
> @@ -50,8 +51,6 @@ void User::allocHungoffUses(unsigned N, bool IsPhi) {
>    Use *End = Begin + N;
>    (void) new(End) Use::UserRef(const_cast<User*>(this), 1);
>    setOperandList(Use::initTags(Begin, End));
> -  // Tag this operand list as being a hung off.
> -  HasHungOffUses = true;
>  }
>  
>  void User::growHungoffUses(unsigned NewNumUses, bool IsPhi) {
> @@ -98,6 +97,15 @@ void *User::operator new(size_t s, unsigned Us) {
>    return Obj;
>  }
>  
> +void *User::operator new(size_t s) {

Make this parameter name match what you add in the decl.  Also, follow
the style guidelines for the name.  (I know this is copy/paste, but you
might as well make the new code better.)

>> <0009-Move-OperandList-to-be-allocated-prior-to-User-for-h.patch>

> From 953bd2b5d1d36da12ed7ed64705932f434b178ab Mon Sep 17 00:00:00 2001
> From: Peter Cooper <peter_cooper at apple.com>
> Date: Tue, 26 May 2015 12:55:05 -0700
> Subject: [PATCH 9/9] Move OperandList to be allocated prior to User for hung
>  off subclasses.
> 
> For hung off uses, we need a Use* to tell use where the operands are.
> This was User::OperandList but we want to remove that to save space
> of all subclasses which aren't making use of 'hung off uses'.
> 
> Hung off uses now allocate their own 'OperandList' Use* in the
> User::new which they call.
> 
> getOperandList() now uses the hung off uses bit to work out where the
> Use* for the OperandList lives.  If a User has hung off uses, then this
> bit tells them to go back a single Use* from the User* and use that
> value as the OperandList.
> 
> If a User has no hung off uses, then we get the first operand by
> subtracting (NumOperands * sizeof(Use)) from the User this pointer.
> 
> This saves a pointer from User and all subclasses.  Given the average
> size of a subclass of User is 112 or 128 bytes, this saves around 7% of space
> With malloc tending to align to 16-bytes the real saving is typically more like 3.5%.
> 
> On 'opt -O2 verify-uselistorder.lto.bc', peak memory usage prior to this change
> is 149MB and after is 143MB so the savings are around 2.5% of peak.
> 
> Looking at some passes which allocate many Instructions and Values, parseIR drops
> from 54.25MB to 52.21MB while the Inliner calls to Instruction::clone() drops
> from 28.20MB to 27.05MB.
> ---
>  include/llvm/IR/User.h | 36 +++++++++++++++++++-----------------
>  lib/IR/User.cpp        | 27 ++++++++++++++++++---------
>  2 files changed, 37 insertions(+), 26 deletions(-)
> 
> diff --git a/include/llvm/IR/User.h b/include/llvm/IR/User.h
> index 32930a5..706ae91 100644
> --- a/include/llvm/IR/User.h
> +++ b/include/llvm/IR/User.h
> @@ -77,11 +68,6 @@ public:
>      // drop the hung off uses.
>      Use::zap(getOperandList(), getOperandList() + NumUserOperands,
>               HasHungOffUses);
> -    if (HasHungOffUses) {
> -      setOperandList(nullptr);
> -      // Reset NumOperands so User::operator delete() does the right thing.
> -      NumUserOperands = 0;
> -    }

Can/should we move this logic to `User::operator delete`?  Maybe in a
follow-up?  More below.

>    }
>    /// \brief Free memory allocated for User and Use objects.
>    void operator delete(void *Usr);
> @@ -107,11 +93,27 @@ protected:
>    }
>  private:
>    void setOperandList(Use *NewList) {
> -    LegacyOperandList = NewList;
> +    assert(HasHungOffUses &&
> +           "Setting operand list only required for hung off uses");
> +    Use **Storage = reinterpret_cast<Use**>(this) - 1;

clang-format?

> +    *Storage = NewList;

I'm not sure `Storage` adds value here.  Why not:

    *(reinterpret_cast<Use**>(this) - 1) = NewList;

?

Even better, a separate function would make this clear:

    Use *&getHungOffOperands() { return reinterpret_cast<Use **>(this) - 1; }
    void setOperandList(Use *NewList) {
      assert(...);
      getHungOffOperands() = NewList;
    }

>    }
>  public:
> -  Use *getOperandList() const {
> -    return LegacyOperandList;
> +  const Use *getOperandList() const {
> +    if (HasHungOffUses) {
> +      Use * const*Storage = reinterpret_cast<Use* const*>(this) - 1;

Clang format?

> +      return *Storage;

I don't think `Storage` adds value here either.  I'd just:

    return *const_cast<User *>(this)->getHungOffOperands();

or add two versions of `getHungOffOperands()` and skip the `const_cast`.

> +    }
> +    const Use *Storage = reinterpret_cast<const Use*>(this) - NumUserOperands;
> +    return Storage;

Similarly, adding a helper function here would simplify, and you don't
need the `Storage` variable.

> +  }
> +  Use *getOperandList() {
> +    if (HasHungOffUses) {
> +      Use * const*Storage = reinterpret_cast<Use* const*>(this) - 1;
> +      return *Storage;
> +    }
> +    Use *Storage = reinterpret_cast<Use*>(this) - NumUserOperands;
> +    return Storage;

Actually, move the above comments down to here.  You can use this to
implement the `const` version without duplicating logic:

    const Use *getOperandList() const {
      return const_cast<User *>(this)->getOperandList();
    }

Hmm... for that matter, why are you adding both?  Previously we had

    Use *getOperandList() const;

You're change this to:

    Use *getOperandList();
    const Use *getOperandList() const;

Is the change in const-ness important for your main change?  If not, I'd
just leave this alone (or possibly do it as a follow-up, if there's a
compelling benefit).

>    }
>    Value *getOperand(unsigned i) const {
>      assert(i < NumUserOperands && "getOperand() out of range!");
> diff --git a/lib/IR/User.cpp b/lib/IR/User.cpp
> index c755f77..cd0c5b6 100644
> --- a/lib/IR/User.cpp
> +++ b/lib/IR/User.cpp
> @@ -90,19 +90,21 @@ void *User::operator new(size_t s, unsigned Us) {
>    Use *Start = static_cast<Use*>(Storage);
>    Use *End = Start + Us;
>    User *Obj = reinterpret_cast<User*>(End);
> -  Obj->setOperandList(Start);
> -  Obj->HasHungOffUses = false;
>    Obj->NumUserOperands = Us;
> +  Obj->HasHungOffUses = false;
>    Use::initTags(Start, End);
>    return Obj;
>  }
>  
>  void *User::operator new(size_t s) {
> -  void *Storage = ::operator new(s);
> -  User *Obj = reinterpret_cast<User*>(Storage);
> -  Obj->setOperandList(nullptr);
> +  // Allocate space for a single Use*
> +  void *Storage = ::operator new(s + sizeof(Use*));
> +  Use **Start = static_cast<Use**>(Storage);

clang-format?

> +  Use **End = Start + 1;
> +  User *Obj = reinterpret_cast<User*>(End);

The whole `Start`/`End` naming scheme was useful above where we had an
array, but not here.  Something like this is clearer IMO:

    void *Storage = ...;
    auto **HungOffOperandList = static_cast<Use **>(Storage);
    User *Obj = reinterpret_cast<User *>(HungOffOperandList + 1);

Or `HungOffOperands`?  Anyway, kill the `Start`/`End` naming.

>    Obj->HasHungOffUses = true;
>    Obj->NumUserOperands = 0;
> +  Obj->setOperandList(nullptr);

This seems awkward, calling functions in `User` before the constructor
has even been called; it smells like undefined behaviour (even if it
isn't).

I'd be more comfortable being more direct, saying:

    *HungOffOperandList = nullptr;

Also, `User::User()` should assert that (in the hung off case) the
operand list has been correctly initialized to `nullptr`.

>    return Obj;
>  }
>  
> @@ -111,11 +113,18 @@ void *User::operator new(size_t s) {
>  //===----------------------------------------------------------------------===//
>  
>  void User::operator delete(void *Usr) {
> +  // Hung off uses use a single Use* before the User, while other subclasses
> +  // use a Use[] allocated prior to the user.
>    User *Start = static_cast<User*>(Usr);

`Start` is a bad name.  For the hung-off case, it's meaningless, and for
the normal case, this is actually the "end" of a range.

To match the logic in `operator new` I'd just call this `Obj`, which
despite being somewhat meaningless at least isn't distracting.  (Yeah, I
know you didn't pick this name, but you've changed all the other code so
you might as well clean this up too.)

> -  Use *Storage = static_cast<Use*>(Usr) - Start->NumUserOperands;
> -  // If there were hung-off uses, they will have been freed already and
> -  // NumOperands reset to 0, so here we just free the User itself.
> -  ::operator delete(Storage);
> +  if (Start->HasHungOffUses) {
> +    Use **Storage = static_cast<Use**>(Usr) - 1;

clang-format?

> +    // If there were hung-off uses, they will have been freed already and
> +    // NumOperands reset to 0, so here we just free the User itself.

Should any comments be changed in a follow-up?

Can you add a `assert(!*Storage)`?

I also wonder whether it would be a good idea in follow-up to move the
hung-off uses deletion to *this* method, changing this to:

    Use **Storage = ...;
    // Free hung off uses, if any.  
    Use::zap(*Storage, *Storage + Start->NumUserOperands,
             /* Delete */ true);
    ::operator delete(Storage);

It makes way more sense to me here: `operator new` is initialized the field,
so `operator delete` should tear it down.

> +    ::operator delete(Storage);
> +  } else {
> +    Use *Storage = static_cast<Use*>(Usr) - Start->NumUserOperands;
> +    ::operator delete(Storage);

Final nitpick: I'd skip the `Storage` variable.

> +  }
>  }
>  
>  //===----------------------------------------------------------------------===//
> -- 
> 2.3.4 (Apple Git-56)






More information about the llvm-commits mailing list