[PATCH] D118653: [memprof] Extend the index prof format to include memory profiles.

Snehasish Kumar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 9 16:15:35 PST 2022


snehasish added a comment.

I tried this out on an internal workload. Sharing some characteristics of the new indexed profile containing IR-level instr profiling data and memprof data:

//Raw file sizes (directly written out from the runtime)//
Instr prof: 264M
MemProf: 215M
Instr prof zip: 80M
MemProf zip: 45M

//Indexed format file sizes (after running llvm-profdata merge)//
Indexed instr prof data only: 505M
Indexed instr prof data only zip: 139M
Indexed instr + memprof: 1010M
Indexed instr + memprof zip: 247M

//llvm-profdata merge runtime//
Indexed instr prof data only: 10s
Indexed instr + memprof data: 1m 53s

Profiling the merge with `perf record -e cycles:up`, I found that (unsurprisingly) a significant share of the time is spent in symbolization. We can reduce the runtime significantly by caching the results of PC -> Frame symbolization.
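To illustrate the caching idea, here is a minimal sketch of memoizing PC -> Frame lookups. It is not the code in this patch: `Frame`, `CachingSymbolizer`, and the callback type are hypothetical stand-ins, and the real symbolizer call is abstracted behind a `std::function` so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one symbolized frame (not the memprof Frame type).
struct Frame {
  std::string Function;
  uint32_t Line;
};

// Memoizes PC -> frames so that each distinct address is symbolized only once.
class CachingSymbolizer {
public:
  // The expensive lookup is injected; in practice it would wrap the symbolizer.
  using SymbolizeFn = std::function<std::vector<Frame>(uint64_t)>;

  explicit CachingSymbolizer(SymbolizeFn Fn) : Symbolize(std::move(Fn)) {
    assert(Symbolize && "a symbolization callback is required");
  }

  // Returns the cached frames for PC, invoking the callback only on a miss.
  // References into std::unordered_map stay valid across later insertions.
  const std::vector<Frame> &lookup(uint64_t PC) {
    auto It = Cache.find(PC);
    if (It == Cache.end()) {
      ++Misses;
      It = Cache.emplace(PC, Symbolize(PC)).first;
    }
    return It->second;
  }

  // Number of lookups that had to call the underlying symbolizer.
  std::size_t misses() const { return Misses; }

private:
  SymbolizeFn Symbolize;
  std::unordered_map<uint64_t, std::vector<Frame>> Cache;
  std::size_t Misses = 0;
};
```

Since the same PCs tend to recur across many allocation call stacks, a cache like this should cut the number of symbolizer calls to the number of distinct addresses, at the cost of keeping the symbolized frames in memory.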

//llvm-profdata merge peak memory usage//
Indexed instr prof data only: 1.4G
Indexed instr + memprof data: 8.16G

Looking into a profile of memory allocations collected with massif, I saw that the peak comes from DWARFContext::create, called in the initialize method. This is a little concerning, so I'll have to investigate further how we can reduce the overhead. The debug info generated for this binary is the same as what is used for Sample FDO (`-gmlt` and `-fdebug-info-for-profiling`). The debug information was stored separately using `-gsplit-dwarf` (fission).

I'll follow up after submitting this patch to improve upon the runtime and peak memory usage.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118653/new/

https://reviews.llvm.org/D118653


