[llvm] [memprof] Use std::vector<Frame> instead of llvm::SmallVector<Frame> (NFC) (PR #94432)

Kazu Hirata via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 6 13:23:30 PDT 2024


kazutakahirata wrote:

> > This patch replaces llvm::SmallVector with std::vector.
> > llvm::SmallVector sets aside one inline element.
> 
> I thought it allocated a few elements (I guess more than 2 from what you found below)?

No, by default, the number of inlined elements is the maximum of:

- 1
- the number of elements that fit while keeping the whole `SmallVector` object, including its data pointer, the length field, etc., within 64 bytes.

So, `SmallVector<Frame>` has only one inlined element because `sizeof(Frame) == 32` (after https://github.com/llvm/llvm-project/pull/94655).
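To make the arithmetic concrete, here is a minimal sketch of that rule. It does not use LLVM headers: `HeaderSketch`, `FrameSketch`, and `defaultInlinedElements` are hypothetical names, and the 16-byte header (a data pointer plus 32-bit size and capacity fields) is an assumption about a typical 64-bit target rather than something taken from `SmallVector.h`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the non-inline part of a SmallVector on a
// 64-bit target: data pointer + 32-bit size + 32-bit capacity (assumption).
struct HeaderSketch {
  void *BeginX;
  std::uint32_t Size;
  std::uint32_t Capacity;
};

// Hypothetical 32-byte stand-in for memprof::Frame.
struct FrameSketch {
  std::uint64_t Fields[4];
};
static_assert(sizeof(FrameSketch) == 32, "sketch assumes a 32-byte Frame");

// Preferred total object size in bytes, per the rule described above.
constexpr std::size_t PreferredSizeof = 64;

// Returns max(1, number of elements that fit next to the header within
// PreferredSizeof bytes). Assumes ElemBytes > 0.
constexpr std::size_t defaultInlinedElements(std::size_t HeaderBytes,
                                             std::size_t ElemBytes) {
  return (HeaderBytes + ElemBytes > PreferredSizeof)
             ? 1 // Nothing fits, but at least one element is always inlined.
             : (PreferredSizeof - HeaderBytes) / ElemBytes;
}
```

With a 16-byte header, `defaultInlinedElements(16, sizeof(FrameSketch))` leaves 48 bytes for elements, which holds only one 32-byte element, whereas an 8-byte element type would get six.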

> > Meanwhile, when I sort all call stacks by their lengths, the length at the first percentile is already 2. That is, 99 percent of call stacks do not take advantage of the inline element.
> 
> I'm surprised that almost all are <=2: the allocation call contexts should almost always be longer than 2 frames in practice. Is this dominated by the CallSites frames? I can imagine those are short. Maybe that should use std::vector (or SmallVector<...,0>), and the AllocationInfo CallStack remain SmallVector?

I think you've got the percentiles backwards. I'm sorting the call stacks in ascending order of their lengths, so the 1st percentile is the short end of the distribution. Here are the stats for call stack lengths:

- the 1st percentile: 2 frames
- the 5th percentile: 5 frames
- the median: 72 frames

So, we rarely use the inlined element of `SmallVector<Frame>`.
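For readers who want to reproduce this kind of measurement, the following is a rough sketch of how such statistics could be computed from a list of call stack lengths. It is not the script used for the numbers above: `percentile` and `fractionFittingInline` are hypothetical helpers, and the nearest-rank percentile definition is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile over lengths sorted in ascending order: the
// smallest length such that at least P percent of the stacks are no longer.
// Returns 0 for an empty input.
std::size_t percentile(std::vector<std::size_t> Lengths, double P) {
  if (Lengths.empty())
    return 0;
  std::sort(Lengths.begin(), Lengths.end());
  std::size_t Rank =
      static_cast<std::size_t>(std::ceil(P * Lengths.size() / 100.0));
  // Clamp the 1-based rank into [1, size].
  Rank = std::max<std::size_t>(1, std::min(Rank, Lengths.size()));
  return Lengths[Rank - 1];
}

// Fraction of call stacks short enough to live entirely in the inline
// storage of a small vector with InlineCapacity inline elements.
double fractionFittingInline(const std::vector<std::size_t> &Lengths,
                             std::size_t InlineCapacity) {
  if (Lengths.empty())
    return 0.0;
  auto N = std::count_if(
      Lengths.begin(), Lengths.end(),
      [InlineCapacity](std::size_t L) { return L <= InlineCapacity; });
  return static_cast<double>(N) / Lengths.size();
}
```

Calling `fractionFittingInline(Lengths, 1)` on real profile data would give the share of call stacks that benefit from a single inline element; a 1st percentile of 2 frames implies that share is below 1%.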

Measured against the smaller baseline cycle and instruction counts from https://github.com/llvm/llvm-project/pull/94655, which just landed, using `std::vector<Frame>` here reduces the cycle and instruction counts by 11% and 22%, respectively. I've updated the commit message of this PR accordingly.


https://github.com/llvm/llvm-project/pull/94432

