[PATCH] D45756: [XRay][profiler] Part 1: XRay Allocator and Array Implementations
Keith via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 19 17:26:32 PDT 2018
kpw added inline comments.
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:52
+ // bidirectional.
+ struct BlockChain {
+ static_assert(kCacheLineSize % sizeof(void *) == 0,
----------------
dberris wrote:
> kpw wrote:
> > This name collides with the cryptographic immutable content-address data structure. I would try to avoid that. BlockListNode?
> Renamed to `BlockLink`.
BlockLink is a good name. Sorry to be a stickler. ;)
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:130
+ auto B = NewChain->Blocks[ChainOffset];
+ Chain->Next = NewChain;
+ Chain = NewChain;
----------------
dberris wrote:
> kpw wrote:
> > NullLink is static right? This seems messy, although I know you don't intend to install multiple allocators.
> Yeah, NullLink is used as a sentinel value, which simplifies a lot of the operations for determining whether it's the end. It could as easily have been a nullptr though.
Now that you're not keeping a forward pointer, having a static as the first link isn't problematic.
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:64-65
+ // individual blocks we've allocated. To ensure that instances of the
+ // BlockLink object are cache-line sized, we deduct three additional
+ // pointers worth representing the pointer to the previous link.
+ //
----------------
This comment is out of date since you've cleaned up the Next ptr and the bitmap.
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:71-72
+ //
+ // Using a 16-bit bitmap will allow us to cover a really big cache line that
+ // might store up to 16 pointers.
+ static constexpr auto BlockPtrCount =
----------------
Bitmap doesn't exist any more.
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:122
+ size_t ChainOffset = Counter % BlockLink::BlockPtrCount;
+ if (UNLIKELY(Tail == &NullLink)) {
+ // FIXME: Maybe pre-allocate some BlockLink instances in the same page
----------------
I don't think this is a special case now that you've made changes. You should only need two arms in your conditional.
In one case, ChainOffset == 0 and you have to allocate a new BlockLink.
In the other arm, you can return a Block from the existing BlockLink arena.
================
Comment at: compiler-rt/lib/xray/xray_allocator.h:134-138
+ // At this point we know that we've wrapped around, and the current chain no
+ // longer has free blocks. We then allocate a new chain link and make that
+ // the current tail.
+ DCHECK_EQ(ChainOffset, 0);
+ auto *NewLink = NewChainLink();
----------------
It looks like both branches of this conditional allocate a new BlockLink.
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:34-35
+/// number from packing as many T's as possible in a single cache line.
+template <class T, size_t N = (kCacheLineSize / sizeof(T)) +
+ (kCacheLineSize % sizeof(T) ? 1 : 0)>
+struct Array {
----------------
For the N default value, you can't pack an entire T if the remaining bytes are non-zero. I think it can be simplified to kCacheLineSize / sizeof(T)
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:46-47
+ static constexpr size_t Size = N;
+ Chunk *Prev = nullptr;
+ Chunk *Next = nullptr;
+ };
----------------
Did you consider the tradeoffs of these being external to Block versus invasive to the Block?
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:104
+
+ class Iterator {
+ Chunk *C = nullptr;
----------------
Can you call out what iterator category this can be used like even though it doesn't satisfy the complete concept.
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:109
+ public:
+ Iterator(Chunk *IC, size_t O) : C(IC), Offset(O) {}
+
----------------
O -> Offset otherwise it kind of looked like a default zero in the second parameter.
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:124
+ Iterator &operator--() {
+ DCHECK_NE(C, nullptr);
+ if (--Offset % N) {
----------------
Shouldn't you check against the sentinel chunk instead (here and above).
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:125-131
+ if (--Offset % N) {
+ return *this;
+ }
+
+ // At this point, we know that Offset % N == 0, so we must retreat the
+ // chunk pointer.
+ DCHECK_EQ(Offset % N, 0);
----------------
This doesn't seem right. There is still a value at offset 0 within the current chunk when going backward. What you want to check is whether Offset ends up being in the N - 1 congruence class after decrement.
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:138-140
+ if (++Offset % N == 0)
+ C = C->Next;
+ DCHECK_NE(Copy.Offset, this->Offset);
----------------
Can't you just invoke operator++() here?
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:146-148
+ if (--Offset % N == 0)
+ C = C->Prev;
+ DCHECK_NE(Copy.Offset, this->Offset);
----------------
Similarly reuse operator--() code?
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:153
+ friend bool operator==(const Iterator &L, const Iterator &R) {
+ return L.C == R.C && L.Offset == R.Offset;
+ }
----------------
This is fine., but will a compiler generate memcmp equivalent?
================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:171-172
+
+public:
+ explicit Array(AllocatorType &A) : Allocator(&A) {}
+ Array() : Array(GetGlobalAllocator()) {}
----------------
Haven't looked past here yet. Leaving a note so I know where to come back to.
https://reviews.llvm.org/D45756
More information about the llvm-commits
mailing list