[PATCH] D45756: [XRay][profiler] Part 1: XRay Allocator and Array Implementations

Keith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 19 17:26:32 PDT 2018


kpw added inline comments.


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:52
+  // bidirectional.
+  struct BlockChain {
+    static_assert(kCacheLineSize % sizeof(void *) == 0,
----------------
dberris wrote:
> kpw wrote:
> > This name collides with the cryptographic immutable content-address data structure. I would try to avoid that. BlockListNode?
> Renamed to `BlockLink`.
BlockLink is a good name. Sorry to be a stickler. ;)


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:130
+      auto B = NewChain->Blocks[ChainOffset];
+      Chain->Next = NewChain;
+      Chain = NewChain;
----------------
dberris wrote:
> kpw wrote:
> > NullLink is static right? This seems messy, although I know you don't intend to install multiple allocators.
> Yeah, NullLink is used as a sentinel value, which simplifies a lot of the operations for determining whether it's the end. It could as easily have been a nullptr though.
Now that you're not keeping a forward pointer, having a static as the first link isn't problematic.


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:64-65
+    // individual blocks we've allocated. To ensure that instances of the
+    // BlockLink object are cache-line sized, we deduct three additional
+    // pointers worth representing the pointer to the previous link.
+    //
----------------
This comment is out of date since you've cleaned up the Next ptr and the bitmap.


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:71-72
+    //
+    // Using a 16-bit bitmap will allow us to cover a really big cache line that
+    // might store up to 16 pointers.
+    static constexpr auto BlockPtrCount =
----------------
Bitmap doesn't exist any more.


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:122
+    size_t ChainOffset = Counter % BlockLink::BlockPtrCount;
+    if (UNLIKELY(Tail == &NullLink)) {
+      // FIXME: Maybe pre-allocate some BlockLink instances in the same page
----------------
I don't think this is a special case now that you've made changes. You should only need two arms in your conditional.

In one case, ChainOffset == 0 and you have to allocate a new BlockLink.
In the other arm, you can return a Block from the existing BlockLink arena.


================
Comment at: compiler-rt/lib/xray/xray_allocator.h:134-138
+    // At this point we know that we've wrapped around, and the current chain no
+    // longer has free blocks. We then allocate a new chain link and make that
+    // the current tail.
+    DCHECK_EQ(ChainOffset, 0);
+    auto *NewLink = NewChainLink();
----------------
It looks like both branches of this conditional allocate a new BlockLink.


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:34-35
+/// number from packing as many T's as possible in a single cache line.
+template <class T, size_t N = (kCacheLineSize / sizeof(T)) +
+                              (kCacheLineSize % sizeof(T) ? 1 : 0)>
+struct Array {
----------------
For the N default value, you can't pack an entire T if the remaining bytes are non-zero. I think it can be simplified to kCacheLineSize / sizeof(T)


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:46-47
+    static constexpr size_t Size = N;
+    Chunk *Prev = nullptr;
+    Chunk *Next = nullptr;
+  };
----------------
Did you consider the tradeoffs of these being external to Block versus invasive to the Block?


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:104
+
+  class Iterator {
+    Chunk *C = nullptr;
----------------
Can you call out what iterator category this can be used like even though it doesn't satisfy the complete concept.


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:109
+  public:
+    Iterator(Chunk *IC, size_t O) : C(IC), Offset(O) {}
+
----------------
O -> Offset otherwise it kind of looked like a default zero in the second parameter.


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:124
+    Iterator &operator--() {
+      DCHECK_NE(C, nullptr);
+      if (--Offset % N) {
----------------
Shouldn't you check against the sentinel chunk instead (here and above).


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:125-131
+      if (--Offset % N) {
+        return *this;
+      }
+
+      // At this point, we know that Offset % N == 0, so we must retreat the
+      // chunk pointer.
+      DCHECK_EQ(Offset % N, 0);
----------------
This doesn't seem right. There is still a value at offset 0 within the current chunk when going backward. What you want to check is whether Offset ends up being in the N - 1 congruence class after decrement.


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:138-140
+      if (++Offset % N == 0)
+        C = C->Next;
+      DCHECK_NE(Copy.Offset, this->Offset);
----------------
Can't you just invoke operator++() here?


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:146-148
+      if (--Offset % N == 0)
+        C = C->Prev;
+      DCHECK_NE(Copy.Offset, this->Offset);
----------------
Similarly reuse operator--() code?


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:153
+    friend bool operator==(const Iterator &L, const Iterator &R) {
+      return L.C == R.C && L.Offset == R.Offset;
+    }
----------------
This is fine., but will a compiler generate memcmp equivalent?


================
Comment at: compiler-rt/lib/xray/xray_segmented_array.h:171-172
+
+public:
+  explicit Array(AllocatorType &A) : Allocator(&A) {}
+  Array() : Array(GetGlobalAllocator()) {}
----------------
Haven't looked past here yet. Leaving a note so I know where to come back to.


https://reviews.llvm.org/D45756





More information about the llvm-commits mailing list