[PATCH] Use Rvalue refs in APInt

Pete Cooper via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 1 09:43:09 PDT 2016


Hi Sanjoy
> On May 31, 2016, at 11:25 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> 
> Hi Pete,
> 
> Thanks for bringing this up, I think this is a relevant topic.  I've
> specifically responded to this bit (I'll take a look at the patch
> tomorrow):
> 
>> Yeah, I’ve been wondering about a BumpPtrAllocator, or even just a
>> greater amount of inline storage.  An APInt is effectively a
>> SmallVector<uint64_t, 1> in some sense but could have 2 or even more
>> elements inline if it made sense.
>> 
>> Given that the majority of long-lived APInts are uniqued inside
>> ConstantInt, we may not even see much of a peak memory increase by
>> doing this.  But it’s just a theory right now, and may be complicated
>> to even try it.
> 
> My guess is that a lot of the malloc traffic, at least from SCEV,
> comes from allocating 65 (== WordSize + 1) and 128 (== WordSize * 2)
> bit APInts, and, as you said, making APInts store two words inline
> instead of one will solve these cases.
> 
> Btw, have you looked at the memory usage / malloc traffic due to
> ConstantRanges?  
Yes, I have.  The list is quite large, so I’ve put it at the end of the email.  It’s a histogram of the bit widths of all constructed ConstantRanges against the number of hits for each bit width.  I’ll get some other stats for ConstantRange::multiply soon.  It’s an interesting piece of code because it always zexts the APInts, so a 64-bit range becomes 128 bits and is guaranteed to allocate.
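
To make the allocation argument concrete, here is a minimal sketch of the word-count arithmetic.  It is not LLVM code: `wordsNeeded` and `needsHeap` are hypothetical helpers, and the only assumption is that values are stored in 64-bit words with some number of words held inline.

```cpp
#include <cassert>
#include <cstdint>

// Bits per storage word; the sketch assumes 64-bit words (uint64_t).
constexpr unsigned kBitsPerWord = 64;

// Number of 64-bit words needed to hold a value of the given bit width.
constexpr unsigned wordsNeeded(unsigned bitWidth) {
  return (bitWidth + kBitsPerWord - 1) / kBitsPerWord;
}

// Hypothetical helper: true if a value of this width does not fit in
// `inlineWords` words of inline storage and would need a heap allocation.
constexpr bool needsHeap(unsigned bitWidth, unsigned inlineWords) {
  return wordsNeeded(bitWidth) > inlineWords;
}
```

With one inline word, a 64-bit value stays inline but its 128-bit zero-extension does not (`needsHeap(128, 1)` is true).  With two inline words, 65-bit and 128-bit values both fit, while 129 bits still spills to the heap.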

Another interesting data point is compile time.  On my test case, SCEV::getRange is 8.9% of compile time, which is a lot.  Of that 8.9%, 6.3% is just in ConstantRange::multiply.  This method is heavy on APInt code, and especially on malloc traffic.

Many of the speedups I’ve been finding involve doing less work (r271020, which avoids the latter half of ConstantRange::multiply and saves 3M allocations) and fixing cases of unnecessary APInt allocations (r270959).  This patch is along the same lines as the latter: it removes malloc traffic we can avoid.

> If that too looks like it will be worth optimizing,
> then we can try to kill two birds with one stone.  We can extract out
> a CRTP-like thing containing the "big integer" algorithms and
> (roughly) do:
> 
> // Use this in ConstantInt
> template<unsigned N>
> class APInt : public APIntFunctions<APInt<N>> {
>  unsigned BitWidth;
>  union {
>    uint64_t Val[N];
>    uint64_t *pVal;
>  };
> 
> public:
>  // Helpers to be used by APIntFunctions
> };
This is exactly what I was thinking.  Duncan and I discussed something very similar about a year ago when he was tuning debug info.  One of us (I can’t remember which now) came up with this too.
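
For illustration only, the logic/storage split could look roughly like the sketch below.  None of these names exist in LLVM: `BigIntFunctions` and `InlineInt` are hypothetical stand-ins for the `APIntFunctions` and `APInt<N>` in the quoted outline, and only one trivial algorithm (`isZero`) is shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical CRTP base: algorithms are written once against the
// words()/numWords() accessors that the derived storage class provides.
template <typename Derived>
struct BigIntFunctions {
  // True if every storage word is zero.
  bool isZero() const {
    const Derived &D = static_cast<const Derived &>(*this);
    for (unsigned I = 0, E = D.numWords(); I != E; ++I)
      if (D.words()[I] != 0)
        return false;
    return true;
  }
};

// Hypothetical storage class: up to N words live inline, wider values
// fall back to a heap-allocated array.
template <unsigned N>
class InlineInt : public BigIntFunctions<InlineInt<N>> {
  unsigned BitWidth;
  union {
    uint64_t Val[N]; // active when numWords() <= N
    uint64_t *pVal;  // active when numWords() > N
  };

public:
  explicit InlineInt(unsigned Width) : BitWidth(Width) {
    if (isInline()) {
      for (unsigned I = 0; I != N; ++I)
        Val[I] = 0;
    } else {
      pVal = new uint64_t[numWords()](); // value-initialized to zero
    }
  }
  InlineInt(const InlineInt &) = delete;
  InlineInt &operator=(const InlineInt &) = delete;
  ~InlineInt() {
    if (!isInline())
      delete[] pVal;
  }

  // Helpers used by BigIntFunctions.
  unsigned numWords() const { return (BitWidth + 63) / 64; }
  bool isInline() const { return numWords() <= N; }
  uint64_t *words() { return isInline() ? Val : pVal; }
  const uint64_t *words() const { return isInline() ? Val : pVal; }
};
```

In this sketch an `InlineInt<2>` of width 128 never touches the heap, while an `InlineInt<1>` of the same width does, and both share the same `isZero` implementation.  Copying is deleted only to keep the example short.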
> 
> 
> This is quite a bit of extra work over just having a SmallVector<>
> type design, but once we have this we can use this to optimize the
> memory usage of ConstantRange.  ConstantRange has two APInts that have
> to be of the same bitwidth, so once we have a separation between logic
> and storage for APInts, we can make a ConstantRange be:
> 
> class ConstantRange {
>  union {
>    uint64_t LeftVal;
>    uint64_t *pLeftVal;
>  };
>  union {
>    uint64_t RightVal;
>    uint64_t *pRightVal;
>  };
>  unsigned BitWidth;
> };
> 
> and have a ConstantRange specific APInt implementation that
> "dispatches" to the right storage as needed.
> 
> 
> 
> If ConstantRange does not look like it is worth optimizing, then a
> straightforward
It certainly looks like ConstantRange is worth optimizing, but I guess the open question for now is how much optimization can be done just on the APInts inside it, and whether that would be enough to avoid having to change ConstantRange itself.  The series of patches I have locally gets the total allocations down from 26M to 16M, but even after that there are still millions of allocations left in APInt.
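
As a rough way to see what the shared-bitwidth layout buys, the sketch below compares the two storage shapes.  These structs are hypothetical stand-ins rather than the real classes: `PairRange` models two bounds that each carry their own width, and `SharedWidthRange` mirrors the quoted ConstantRange outline.  The exact sizes depend on the target's alignment rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for one bound that carries its own bit width.
struct BoundWithWidth {
  union {
    uint64_t Val;   // used when BitWidth <= 64
    uint64_t *pVal; // used when BitWidth > 64
  };
  unsigned BitWidth;
};

// Stand-in for a range built from two independent bounds.
struct PairRange {
  BoundWithWidth Lower;
  BoundWithWidth Upper;
};

// Stand-in for the proposed layout: both bounds share one bit width.
struct SharedWidthRange {
  union {
    uint64_t LeftVal;
    uint64_t *pLeftVal;
  };
  union {
    uint64_t RightVal;
    uint64_t *pRightVal;
  };
  unsigned BitWidth;
};

// Bytes saved per range object by storing the width once
// (platform dependent, so no particular value is assumed here).
constexpr std::size_t bytesSaved() {
  return sizeof(PairRange) - sizeof(SharedWidthRange);
}
```

This only shrinks the object itself.  It does nothing for the heap allocations made when the bit width exceeds 64, which is where most of the malloc traffic discussed above comes from.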

Cheers,
Pete
> 
> template<unsigned N>
> class APInt : APIntBaseImpl {
>  unsigned BitWidth;
>  union {
>    uint64_t Val[N];
>    uint64_t *pVal;
>  };
> 
> public:
>  // Implement public interface, using APIntBaseImpl to do the heavy lifting.
> };
> 
> 
> looks reasonable.
> 
> -- Sanjoy

ConstantRange stats (bit width: number of hits in ConstantRange::ConstantRange)
1: 30850028
2: 7238
3: 5733
4: 92
5: 817
6: 294
7: 192
8: 363498
9: 896
11: 330
12: 378
13: 385
14: 125
16: 30256
18: 272
20: 98
24: 10
25: 62
26: 13
27: 181
28: 8
31: 98
32: 2003134
33: 132
34: 128
36: 76
38: 2130
41: 3
57: 262
58: 244
59: 342
60: 2418
61: 1211
62: 190
63: 226
64: 5118228
65: 128400
66: 4236
67: 14826
68: 15408
69: 13417
70: 7959
71: 347
96: 88
128: 364826
129: 379580
130: 19092
256: 4734
257: 19132
258: 71826
514: 4650

