[PATCH] Use Rvalue refs in APInt

Mon Jun 6 15:59:11 PDT 2016

On Sun, Jun 5, 2016 at 11:55 AM, Pete Cooper <peter_cooper at apple.com> wrote:

>
> On Jun 4, 2016, at 9:10 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Looks like you just have a variable rename in the unary minus
> implementation - commit separately or drop that change perhaps?
>
> Oh yeah.  Good catch.  Will do that separately.
>
>
> I don't think you need versions that take two rvalue refs (+(&&, &&)), is
> there? (until +=/-= get smart and have an overload that takes an rvalue ref
> parameter and uses it to steal the right hand side buffer if it's bigger
> than the left hand side buffer or something like that?)
>
> I had a compile error somewhere in the LLVM codebase without this
> version.  I can’t remember where it is, but a small test (attached to the
> end of the email if you want to hack on it) which triggers it is:
>
> rvalue.cpp:66:22: error: use of overloaded operator '+' is ambiguous (with
> operand types 'APInt' and 'APInt')
>   APInt d2 = (a * b) + (a * b);
>
>
> & can you pass by value instead of by rvalue ref - does that work/choose
> the right overloads?
>
> Doesn’t seem to.  Using the above as an example, if I remove the && from
> both arguments then I get:
>
> rvalue.cpp:72:22: error: use of overloaded operator '+' is ambiguous (with
> operand types 'APInt' and 'APInt')
>   APInt d2 = (a * b) + (a * b);
>              ~~~~~~~ ^ ~~~~~~~
> rvalue.cpp:35:14: note: candidate function
> inline APInt operator+(APInt a, APInt b) {
>              ^
> rvalue.cpp:41:14: note: candidate function
> inline APInt operator+(APInt &&a, const APInt &b) {
>              ^
> rvalue.cpp:47:14: note: candidate function
> inline APInt operator+(const APInt &a, APInt &&b) {
>              ^
> rvalue.cpp:53:14: note: candidate function
> inline APInt operator+(const APInt &a, const APInt &b) {
>
> Note, removing the && from all the variants doesn’t improve the situation.
>

Attached an example that I think works - but I haven't tested. There may be
some accidental infinite recursion in there - the version of the patch you
have didn't seem to pass all the tests anyway.

(also noticed you ended up with both member and non-member version of unary
operator-, my patch drops the member one (you could probably move operator~
out as a non-member too, but that's pretty orthogonal))


>
> inline APInt operator-(APInt RHS, const APInt &LHS) {
>   RHS += LHS;
>   return RHS; // shouldn't need std::move here because you're returning a
> local
> }
>
> I wondered about this too.  I turned on -Wpessimizing-move to see if what
> I was doing was wrong but it didn’t fire.  Interestingly, with this method:
>
> inline APInt operator+(APInt &&a, const APInt &b) {
>   printf("APInt::+(&&, &)\n");
>   a += b;
>   return a;
>
>
This one shouldn't produce a move (& you should add std::move explicitly)
because 'a' is not a local here, it's a reference. When it's passed by
value there's no need for the std::move:

blaikie at blaikie-linux:/tmp/dbginfo$ cat -n test.cpp
     1  struct foo {
     2    foo(foo&&) = default;
     3  };
     4
     5  foo f(foo g) {
     6    return g;
     7  }
     8  foo f2(foo &&g) {
     9    return g;
    10  }
blaikie at blaikie-linux:/tmp/dbginfo$ clang++-tot -std=c++11 test.cpp
-fsyntax-only
test.cpp:9:10: error: call to implicitly-deleted copy constructor of 'foo'
  return g;
         ^
test.cpp:2:3: note: copy constructor is implicitly deleted because 'foo'
has a user-declared move constructor
  foo(foo&&) = default;
  ^
1 error generated.


> }
>
> and with/without the std::move on the return.  The above version will
> call APInt::APInt(&) but the std::move version will call APInt::APInt(&&).
> I used printfs to verify this.  So looks like there is a difference here,
> even though I totally agree with you that we’re returning a local so it
> shouldn’t need the std::move.  I’m not sure if this is a bug, or just
> subtlety in rvalue semantics.  Would love to know the answer though.
>
>
> Then you shouldn't need the op-(const APInt&,const APInt&) version, for
> example.
>
> Not sure if its a result of the other &&’s ending up being required, but
> i’ve tested without a (const APInt&,const APInt&) version and I get
> ambiguous overload errors.  Seems like i’m going to need it.
>
>
> Tests?
>
> I was wondering about this.  I can certainly test all the variants to make
> sure I get the correct numerical results from APInt and I’ll add what tests
> are needed for that.  I wouldn’t be able to test whether we get a certain
> number of malloc’s, unless its ok to implement my own malloc/free the APInt
> unit test?
>

Yeah, I'd certainly at least test that we get all the right answers
(potentially using a data expanded test to exercise all the operations with
the same values for different combinations of lvalues, rvalues, and uints).

As for testing the avoidance of allocation... hrm... I mean it's
essentially a non-observable performance thing, and our tests don't really
test performance, so perhaps that's fine. In theory you could test that
moving happened by caching the result of "getRawData" and check that the
pointer value is the same? Not sure if that's a good test.


>
> Thanks for all the comments so far.  Will try get an updated patch
> tomorrow.
>
> Cheers,
> Pete
>
>
>
>
> On Fri, Jun 3, 2016 at 10:42 AM, Pete Cooper <peter_cooper at apple.com>
> wrote:
>
>> Hi David, Sanjoy
>>
>> Here’s an updated patch which provides all of the combinations of
>> operator[+-] I can think to add.
>>
>> All of the new ones are outside the class definition so that we can
>> reduce duplication and have them call each other.
>>
>> The one thing I noticed while doing this work was that the already
>> existing operator+= and -= methods really did exactly what I wanted.  So
>> i’ve implemented + and - in terms of += and -=.
>>
>> Is that ok, or is it frowned upon?  I can imagine some people would
>> prefer that += calls + and not the other way round.  But it is very
>> convenient as you can see with this patch.
>>
>> Comments very welcome.
>>
>> BTW, this reduces total allocations by about 400k from 19.1M to 18.7M.
>>
>> Cheers,
>> Pete
>>
>>
>>
>> On Jun 2, 2016, at 3:24 PM, Pete Cooper via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>>
>> On Jun 2, 2016, at 2:28 PM, Sanjoy Das <sanjoy at playingwithpointers.com>
>> wrote:
>>
>> On Wed, Jun 1, 2016 at 9:43 AM, Pete Cooper <peter_cooper at apple.com>
>> wrote:
>>
>> Another interesting data point is the compile time.  On my test case,
>> SCEV::getRange is 8.9% of compile time which is a lot.  But of that, 6.3%
>> is just in ConstantRange::multiply.  This method is heavy APInt code, and
>> especially malloc traffic.
>>
>>
>> Yeah, that is definitely too high! Just to check: I assume you mean
>> 8.9% of opt -O2 or something similar?
>>
>> Yep, thats right.  ‘opt -O2 verify-uselistorder.bc -o opt.bc’.  The
>> verify-uselistorder is the pre optimized, but post linked, bitcode when
>> LTOing that tool.
>>
>> BTW, I just looked at the latest numbers and the commits i’ve made so far
>> save 3% of compile time on this use case.  So the 8.9% is more like 5.9%
>> now.  And still a little more to come.
>>
>>
>> Many of the speedup’s i’ve been finding involve doing less work (r271020
>> which avoids the latter half of ConstantRange::multiply and saves 3M
>> allocations), and fixing cases of unnecessary APInt allocations (r270959).
>> This patch is along the same lines as the latter where we have malloc
>> traffic we can avoid.
>>
>>
>> Making too many fixes on the APInt algorithms to avoid allocations
>> seems like we're solving the issue at the wrong layer.  I think fixing
>> infrastructural issues so that we _can_ be a little sloppy (within
>> reason) in extending integers without thinking too much about malloc
>> traffic is the right path.
>>
>> I completely agree.  There are certainly limits to how far to push this.
>> For example, this code in ConstantRange::multiply:
>>
>>   auto L = {this_min * Other_min, this_min * Other_max,
>>             this_max * Other_min, this_max * Other_max};
>>
>>
>> Once I have the Rvalue ref version of the APInt methods (a change which I
>> think is reasonable), the above could be changed to:
>>
>>   auto L = {this_min * Other_min, std::move(this_min) * Other_max,
>>             this_max * std::move(Other_min), std::move(this_max) *
>> Other_max};
>>
>> This would avoid 3 allocations out of 4 because we will then use the
>> Rvalue APInt methods.  However, I think this might
>> be a little too much hacking.  So yeah, I totally agree with you, and
>> hopefully we can solve cases like this one in a more
>> reasonable way than gratuitous use of std::move() or other APInt hackery
>> :)
>>
>>
>> But you're doing the work, so you get to decide the path forward. :)
>>
>> Sounds good to me :)
>>
>>
>>
>> ConstantRange stats (bit width and count of hits in
>> ConstantRange::ConstantRange)
>>
>>
>> This is great!  Is this a bootstrap of clang or something?
>>
>> Actually same use case as before.  ‘opt -O2 verify-uselistorder’.  Its a
>> nice small bit code which takes about 20s to optimize.
>>
>>
>> Btw, there are couple of bitwidths here that I find interesting, e.g.
>> I'd not have expected this many i70 ConstantRange allocations.
>>
>> Yeah, some of these are a bit surprising.  2^n and (2^n)+1 both seem
>> likely due to the IR itself and SCEV, but anything else is a little odd.  I
>> may take a look at the 258 bit case just because there are so many of them.
>>
>> Pete
>>
>>
>> -- Sanjoy
>>
>> 1: 30850028
>> 2: 7238
>> 3: 5733
>> 4: 92
>> 5: 817
>> 6: 294
>> 7: 192
>> 8: 363498
>> 9: 896
>> 11: 330
>> 12: 378
>> 13: 385
>> 14: 125
>> 16: 30256
>> 18: 272
>> 20: 98
>> 24: 10
>> 25: 62
>> 26: 13
>> 27: 181
>> 28: 8
>> 31: 98
>> 32: 2003134
>> 33: 132
>> 34: 128
>> 36: 76
>> 38: 2130
>> 41: 3
>> 57: 262
>> 58: 244
>> 59: 342
>> 60: 2418
>> 61: 1211
>> 62: 190
>> 63: 226
>> 64: 5118228
>> 65: 128400
>> 66: 4236
>> 67: 14826
>> 68: 15408
>> 69: 13417
>> 70: 7959
>> 71: 347
>> 96: 88
>> 128: 364826
>> 129: 379580
>> 130: 19092
>> 256: 4734
>> 257: 19132
>> 258: 71826
>> 514: 4650
>>
>>
>>
>>
>> --
>> Sanjoy Das
>> http://playingwithpointers.com
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>
>>
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160606/37f56308/attachment.html>
-------------- next part --------------

diff --git include/llvm/ADT/APInt.h include/llvm/ADT/APInt.h
index cdb72b9..426003d 100644
--- include/llvm/ADT/APInt.h
+++ include/llvm/ADT/APInt.h
@@ -40,6 +40,10 @@ const unsigned int host_char_bit = 8;
 const unsigned int integerPartWidth =
     host_char_bit * static_cast<unsigned int>(sizeof(integerPart));
 
+class APInt;
+
+inline APInt operator-(APInt);
+
 //===----------------------------------------------------------------------===//
 //                              APInt Class
 //===----------------------------------------------------------------------===//
@@ -620,18 +624,6 @@ public:
     return Result;
   }
 
-  /// \brief Unary negation operator
-  ///
-  /// Negates *this using two's complement logic.
-  ///
-  /// \returns An APInt value representing the negation of *this.
-  APInt operator-() const {
-    APInt Result(*this);
-    Result.flipAllBits();
-    ++Result;
-    return Result;
-  }
-
   /// \brief Logical negation operator.
   ///
   /// Performs logical negation operation on this APInt.
@@ -750,6 +742,7 @@ public:
   ///
   /// \returns *this
   APInt &operator+=(const APInt &RHS);
+  APInt &operator+=(uint64_t RHS);
 
   /// \brief Subtraction assignment operator.
   ///
@@ -757,6 +750,7 @@ public:
   ///
   /// \returns *this
   APInt &operator-=(const APInt &RHS);
+  APInt &operator-=(uint64_t RHS);
 
   /// \brief Left-shift assignment function.
   ///
@@ -836,18 +830,6 @@ public:
   /// Multiplies this APInt by RHS and returns the result.
   APInt operator*(const APInt &RHS) const;
 
-  /// \brief Addition operator.
-  ///
-  /// Adds RHS to this APInt and returns the result.
-  APInt operator+(const APInt &RHS) const;
-  APInt operator+(uint64_t RHS) const;
-
-  /// \brief Subtraction operator.
-  ///
-  /// Subtracts RHS from this APInt and returns the result.
-  APInt operator-(const APInt &RHS) const;
-  APInt operator-(uint64_t RHS) const;
-
   /// \brief Left logical shift operator.
   ///
   /// Shifts this APInt left by \p Bits and returns the result.
@@ -1532,7 +1514,9 @@ public:
 
   /// \returns the ceil log base 2 of this APInt.
   unsigned ceilLogBase2() const {
-    return BitWidth - (*this - 1).countLeadingZeros();
+    APInt temp(*this);
+    --temp;
+    return BitWidth - temp.countLeadingZeros();
   }
 
   /// \returns the nearest log base 2 of this APInt. Ties round up.
@@ -1750,6 +1734,43 @@ inline raw_ostream &operator<<(raw_ostream &OS, const APInt &I) {
   return OS;
 }
 
+inline APInt operator-(APInt v) {
+  v.flipAllBits();
+  ++v;
+  return v;
+}
+
+inline APInt operator+(APInt a, const APInt &b) {
+  a += b;
+  return a;
+}
+
+inline APInt operator+(const APInt &a, APInt &&b) {
+  b += a;
+  return std::move(b);
+}
+
+inline APInt operator+(APInt a, uint64_t RHS) {
+  a += RHS;
+  return a;
+}
+
+inline APInt operator-(APInt a, const APInt &b) {
+  a -= b;
+  return a;
+}
+
+inline APInt operator-(const APInt &a, APInt &&b) {
+  b -= a;
+  return std::move(b);
+}
+
+inline APInt operator-(APInt a, uint64_t RHS) {
+  a -= RHS;
+  return a;
+}
+
+
 namespace APIntOps {
 
 /// \brief Determine the smaller of two APInts considered to be signed.
diff --git lib/Support/APInt.cpp lib/Support/APInt.cpp
index 8aec547..5857888 100644
--- lib/Support/APInt.cpp
+++ lib/Support/APInt.cpp
@@ -260,6 +260,14 @@ APInt& APInt::operator+=(const APInt& RHS) {
   return clearUnusedBits();
 }
 
+APInt& APInt::operator+=(uint64_t RHS) {
+  if (isSingleWord())
+    VAL += RHS;
+  else
+    add_1(pVal, pVal, getNumWords(), RHS);
+  return clearUnusedBits();
+}
+
 /// Subtracts the integer array y from the integer array x
 /// @returns returns the borrow out.
 /// @brief Generalized subtraction of 64-bit integer arrays.
@@ -286,6 +294,14 @@ APInt& APInt::operator-=(const APInt& RHS) {
   return clearUnusedBits();
 }
 
+APInt& APInt::operator-=(uint64_t RHS) {
+  if (isSingleWord())
+    VAL -= RHS;
+  else
+    sub_1(pVal, getNumWords(), RHS);
+  return clearUnusedBits();
+}
+
 /// Multiplies an integer array, x, by a uint64_t integer and places the result
 /// into dest.
 /// @returns the carry out of the multiplication.
@@ -470,44 +486,6 @@ APInt APInt::operator*(const APInt& RHS) const {
   return Result;
 }
 
-APInt APInt::operator+(const APInt& RHS) const {
-  assert(BitWidth == RHS.BitWidth && "Bit widths must be the same");
-  if (isSingleWord())
-    return APInt(BitWidth, VAL + RHS.VAL);
-  APInt Result(BitWidth, 0);
-  add(Result.pVal, this->pVal, RHS.pVal, getNumWords());
-  Result.clearUnusedBits();
-  return Result;
-}
-
-APInt APInt::operator+(uint64_t RHS) const {
-  if (isSingleWord())
-    return APInt(BitWidth, VAL + RHS);
-  APInt Result(*this);
-  add_1(Result.pVal, Result.pVal, getNumWords(), RHS);
-  Result.clearUnusedBits();
-  return Result;
-}
-
-APInt APInt::operator-(const APInt& RHS) const {
-  assert(BitWidth == RHS.BitWidth && "Bit widths must be the same");
-  if (isSingleWord())
-    return APInt(BitWidth, VAL - RHS.VAL);
-  APInt Result(BitWidth, 0);
-  sub(Result.pVal, this->pVal, RHS.pVal, getNumWords());
-  Result.clearUnusedBits();
-  return Result;
-}
-
-APInt APInt::operator-(uint64_t RHS) const {
-  if (isSingleWord())
-    return APInt(BitWidth, VAL - RHS);
-  APInt Result(*this);
-  sub_1(Result.pVal, getNumWords(), RHS);
-  Result.clearUnusedBits();
-  return Result;
-}
-
 bool APInt::EqualSlowCase(const APInt& RHS) const {
   return std::equal(pVal, pVal + getNumWords(), RHS.pVal);
 }