[llvm] [memprof] Add CallStackRadixTreeBuilder (PR #93784)
Teresa Johnson via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 6 13:30:18 PDT 2024
================
@@ -410,6 +410,153 @@ CallStackId hashCallStack(ArrayRef<FrameId> CS) {
return CSId;
}
+// Encode a call stack into RadixArray. Return the starting index within
+// RadixArray. For each call stack we encode, we emit two or three components
+// into RadixArray. If a given call stack doesn't have a common prefix relative
+// to the previous one, we emit:
+//
+// - the frames in the given call stack in the root-to-leaf order
+//
+// - the length of the given call stack
+//
+// If a given call stack has a non-empty common prefix relative to the previous
+// one, we emit:
+//
+// - the relative location of the common prefix, encoded as a negative number.
+//
+// - a portion of the given call stack that's beyond the common prefix
+//
+// - the length of the given call stack, including the length of the common
+// prefix.
+//
+// The resulting RadixArray requires a somewhat unintuitive backward traversal
+// to reconstruct a call stack -- read the call stack length and scan backward
+// while collecting frames in the leaf to root order. build, the caller of this
+// function, reverses RadixArray in place so that we can reconstruct a call
+// stack as if we were deserializing an array in a typical way -- the call stack
+// length followed by the frames in the leaf-to-root order except that we need
+// to handle pointers to parents along the way.
+//
+// To quickly determine the location of the common prefix within RadixArray,
+// Indexes caches the indexes of the previous call stack's frames within
+// RadixArray.
+LinearCallStackId CallStackRadixTreeBuilder::encodeCallStack(
+ const llvm::SmallVector<FrameId> *CallStack,
+ const llvm::SmallVector<FrameId> *Prev,
+ const llvm::DenseMap<FrameId, LinearFrameId> &MemProfFrameIndexes) {
+ // Compute the length of the common root prefix between Prev and CallStack.
+ uint32_t CommonLen = 0;
+ if (Prev) {
+ auto Pos = std::mismatch(Prev->rbegin(), Prev->rend(), CallStack->rbegin(),
+ CallStack->rend());
+ CommonLen = std::distance(CallStack->rbegin(), Pos.second);
+ }
+
+ // Drop the portion beyond CommonLen.
+ assert(CommonLen <= Indexes.size());
+ Indexes.resize(CommonLen);
+
+ // Append a pointer to the parent.
+ if (CommonLen) {
+ uint32_t CurrentIndex = RadixArray.size();
+ uint32_t ParentIndex = Indexes.back();
+ // The offset to the parent must be negative because we are pointing to an
+ // element we've already added to RadixArray.
+ assert(ParentIndex < CurrentIndex);
+ RadixArray.push_back(ParentIndex - CurrentIndex);
+ }
+
+ // Copy the part of the call stack beyond the common prefix to RadixArray.
+ assert(CommonLen <= CallStack->size());
+ for (FrameId F : llvm::drop_begin(llvm::reverse(*CallStack), CommonLen)) {
+ // Remember the index of F in RadixArray.
+ Indexes.push_back(RadixArray.size());
+ RadixArray.push_back(MemProfFrameIndexes.find(F)->second);
+ }
+ assert(CallStack->size() == Indexes.size());
+
+ // End with the call stack length.
+ RadixArray.push_back(CallStack->size());
+
+ // Return the index within RadixArray where we can start reconstructing a
+ // given call stack from.
+ return RadixArray.size() - 1;
+}
+
+void CallStackRadixTreeBuilder::build(
+ llvm::MapVector<CallStackId, llvm::SmallVector<FrameId>>
+ &&MemProfCallStackData,
+ const llvm::DenseMap<FrameId, LinearFrameId> &MemProfFrameIndexes) {
+ // Take the vector portion of MemProfCallStackData. The vector is exactly
+ // what we need to sort. Also, we no longer need its lookup capability.
+ llvm::SmallVector<CSIdPair, 0> CallStacks = MemProfCallStackData.takeVector();
+
+ // Sort the list of call stacks in the dictionary order to maximize the length
+ // of the common prefix between two adjacent call stacks.
+ llvm::sort(CallStacks, [&](const CSIdPair &L, const CSIdPair &R) {
+ // Call stacks are stored from leaf to root. Perform comparisons from the
+ // root.
+ return std::lexicographical_compare(
+ L.second.rbegin(), L.second.rend(), R.second.rbegin(), R.second.rend(),
+ [&](FrameId F1, FrameId F2) { return F1 < F2; });
+ });
+
+ // Reserve some reasonable amount of storage.
+ RadixArray.clear();
+ RadixArray.reserve(CallStacks.size() * 8);
+
+ // Indexes will grow as long as the longest call stack.
+ Indexes.clear();
+ Indexes.reserve(512);
+
+ // Compute the radix array. We encode one call stack at a time, computing the
+ // longest prefix that's shared with the previous call stack we encode. For
+ // each call stack we encode, we remember a mapping from CallStackId to its
+ // position within RadixArray.
+ //
+ // As an optimization, we encode from the last call stack in CallStacks to
+ // reduce the number of times we follow pointers to the parents. Consider the
+ // list of call stacks that has been sorted in the dictionary order:
+ //
+ // Call Stack 1: F1
+ // Call Stack 2: F1 -> F2
+ // Call Stack 3: F1 -> F2 -> F3
+ //
+ // If we traversed CallStacks in the forward order, we would end up with a
+ // radix tree like:
+ //
+ //   Call Stack 1:  F1
+ //                  |
+ //   Call Stack 2:  +---> F2
+ //                        |
+ //   Call Stack 3:        +---> F3
+ //
+ // Notice that each call stack jumps to the previous one. However, if we
+ // traverse CallStacks in the reverse order, then Call Stack 3 has the
+ // complete call stack encoded without any pointers. Call Stack 1 and 2 point
+ // to appropriate prefixes of Call Stack 3.
+ const llvm::SmallVector<FrameId> *Prev = nullptr;
+ for (const auto &[CSId, CallStack] : llvm::reverse(CallStacks)) {
+ LinearCallStackId Pos =
+ encodeCallStack(&CallStack, Prev, MemProfFrameIndexes);
+ CallStackPos.insert({CSId, Pos});
+ Prev = &CallStack;
+ }
+
+ if (RadixArray.size() >= 2) {
----------------
teresajohnson wrote:
What happens if RadixArray.size() < 2? If it is 0, is that a case where we could/should return early from this function? If it is 1, as far as I can tell the loops below end up as no-ops: the first shouldn't execute, and the second, I believe, will just set 0 to -0, if I am following how CallStackPos would be set up in that case.
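
To make the small-size cases concrete, here is a minimal standalone sketch of the encoding step quoted above. It is not the code in the PR: it assumes plain std::vector in place of the LLVM containers, uses the frame value itself as its linear index instead of looking it up in MemProfFrameIndexes, and the names RadixSketch and decodeBackward are hypothetical. The decoder is my reading of the "backward traversal" described in the comment (before any in-place reversal), not something taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

using Frame = uint32_t; // assumption: a frame doubles as its own linear index

// Hypothetical stand-in for the builder; mirrors the quoted encodeCallStack.
struct RadixSketch {
  std::vector<uint32_t> RadixArray;
  std::vector<uint32_t> Indexes; // positions of the previous stack's frames

  // CallStack and Prev are stored leaf-to-root, as in the patch.
  uint32_t encode(const std::vector<Frame> &CallStack,
                  const std::vector<Frame> *Prev) {
    // Length of the common root prefix shared with the previous call stack.
    uint32_t CommonLen = 0;
    if (Prev) {
      auto Pos = std::mismatch(Prev->rbegin(), Prev->rend(),
                               CallStack.rbegin(), CallStack.rend());
      CommonLen =
          static_cast<uint32_t>(std::distance(CallStack.rbegin(), Pos.second));
    }
    assert(CommonLen <= Indexes.size());
    Indexes.resize(CommonLen);

    // Pointer to the parent: a negative offset stored in an unsigned slot.
    if (CommonLen) {
      uint32_t CurrentIndex = static_cast<uint32_t>(RadixArray.size());
      uint32_t ParentIndex = Indexes.back();
      assert(ParentIndex < CurrentIndex);
      RadixArray.push_back(ParentIndex - CurrentIndex);
    }

    // Frames beyond the common prefix, in root-to-leaf order.
    for (auto It = CallStack.rbegin() + CommonLen; It != CallStack.rend();
         ++It) {
      Indexes.push_back(static_cast<uint32_t>(RadixArray.size()));
      RadixArray.push_back(*It);
    }

    // End with the full call stack length.
    RadixArray.push_back(static_cast<uint32_t>(CallStack.size()));
    return static_cast<uint32_t>(RadixArray.size() - 1);
  }
};

// Assumed decoder for the unreversed array: read the length at Pos, then walk
// backward, following negative offsets to the parent. Returns leaf-to-root.
std::vector<Frame> decodeBackward(const std::vector<uint32_t> &A,
                                  uint32_t Pos) {
  uint32_t Len = A[Pos];
  std::vector<Frame> Out;
  std::size_t I = Pos;
  while (Out.size() < Len) {
    --I;
    int32_t V = static_cast<int32_t>(A[I]);
    if (V < 0)
      I -= static_cast<std::size_t>(-V); // jump to the parent frame
    Out.push_back(A[I]);
  }
  return Out;
}
```

Under these assumptions, encoding the three example stacks from the comment in reverse order (F1 -> F2 -> F3, then F1 -> F2, then F1) fills RadixArray with 1, 2, 3, 3, -3, 2, -6, 1 and returns positions 3, 5 and 7. With no call stacks at all the array stays empty, and a size of exactly 1 arises only when the sole call stack is itself empty, since every non-empty stack contributes at least one frame plus its length.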
https://github.com/llvm/llvm-project/pull/93784