[all-commits] [llvm/llvm-project] 57ed62: [memprof] Speed up caller-callee pair extraction (...

Kazu Hirata via All-commits all-commits at lists.llvm.org
Fri Nov 15 15:33:44 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 57ed628fb397c6427f820fb217c8a58e67f8e10a
      https://github.com/llvm/llvm-project/commit/57ed628fb397c6427f820fb217c8a58e67f8e10a
  Author: Kazu Hirata <kazu at google.com>
  Date:   2024-11-15 (Fri, 15 Nov 2024)

  Changed paths:
    M llvm/include/llvm/ProfileData/MemProf.h
    M llvm/lib/ProfileData/InstrProfReader.cpp

  Log Message:
  -----------
  [memprof] Speed up caller-callee pair extraction (Part 2) (#116441)

This patch further speeds up the extraction of caller-callee pairs
from the profile.

Recall that we reconstruct a call stack by traversing the radix tree
from one of its leaf nodes toward a root.  The implication is that
when we decode many different call stacks, we end up visiting nodes
near the root(s) repeatedly.  That in turn adds many duplicates to our
data structure:

  DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> Calls;

only to be deduplicated later with sort+unique for each vector.

This patch makes the extraction process more efficient by keeping
track of indices of the radix tree array we've visited so far and
terminating traversal as soon as we encounter an element previously
visited.

Note that even with this improvement, we still add at least one
caller-callee pair to the data structure above for each call stack
because we do need to add a caller-callee pair for the leaf node with
the callee GUID being 0.
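
As a rough illustration of the early-termination idea, here is a minimal
sketch. It is not the code from this commit. It assumes a simplified tree
in which each node stores a function GUID and the index of its caller's
node, whereas the real radix tree array uses a different encoding. Node,
CallMap, and extractPairs are hypothetical names, std::unordered_map and
std::vector stand in for DenseMap and SmallVector, and the edge type is
reduced to the callee GUID alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical simplified node; the real radix tree array is laid out
// differently.
struct Node {
  uint64_t GUID; // GUID of the function this frame belongs to
  int Parent;    // index of the caller's node, or -1 at a root
};

// Caller GUID -> callee GUIDs. The real edge type also carries a source
// location, omitted here for brevity.
using CallMap = std::unordered_map<uint64_t, std::vector<uint64_t>>;

// Walks from Leaf toward a root, recording one caller-callee pair per
// node, and stops right after recording a pair at a node seen before.
// Returns the number of pairs appended.
size_t extractPairs(const std::vector<Node> &Tree, int Leaf,
                    std::vector<bool> &Visited, CallMap &Calls) {
  size_t Added = 0;
  uint64_t CalleeGUID = 0; // the leaf frame has no callee
  for (int I = Leaf; I != -1; I = Tree[I].Parent) {
    Calls[Tree[I].GUID].push_back(CalleeGUID);
    ++Added;
    CalleeGUID = Tree[I].GUID;
    // If this node was visited before, the pairs for all of its
    // ancestors were recorded during that earlier walk.
    if (Visited[I])
      break;
    Visited[I] = true;
  }
  return Added;
}
```

In this sketch the pair is recorded before the visited check, so a call
stack that shares its whole path with an earlier one still contributes
the leaf pair with callee GUID 0. Duplicates can therefore still occur,
and the later sort+unique pass remains necessary.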

Without this patch, it takes 4 seconds to extract caller-callee pairs
from a large MemProf profile.  This patch shortens that to
900ms.


