[llvm] 964053d - [llvm-profgen] Support LBR only perf script

via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 31 13:28:52 PDT 2021


Author: wlei
Date: 2021-08-31T13:28:17-07:00
New Revision: 964053d56f9b4aeca8950c0835c9e52be0f1d007

URL: https://github.com/llvm/llvm-project/commit/964053d56f9b4aeca8950c0835c9e52be0f1d007
DIFF: https://github.com/llvm/llvm-project/commit/964053d56f9b4aeca8950c0835c9e52be0f1d007.diff

LOG: [llvm-profgen] Support LBR only perf script

This change adds support for LBR-only sample perf scripts, which are used for regular (non-CS) profile generation. An LBR perf script contains a batch of LBR samples; each sample starts with an instruction pointer, followed by a group of 32 LBR entries. Each FROM/TO LBR pair, and the range between two consecutive entries (the former entry's TO and the latter entry's FROM), is used to infer function profile info.

An example of an LBR perf script (created by `perf script -F ip,brstack -i perf.data`):
```
           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1 ...
           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1 ...
           ...
```
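
To make the line format above concrete, here is a minimal standalone sketch of splitting one LBR-only line into its leading instruction pointer and its FROM/TO pairs. It is not the code from this patch: `LBRSample`, `parseHex` and `parseLBRLine` are hypothetical names, the `LBREntry` field names are assumptions, and the sketch uses only the standard library rather than `StringRef`/`TraceStream`. The trailing `/P/-/-/N` fields are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an LBR entry: one taken branch.
struct LBREntry {
  uint64_t Source; // FROM address
  uint64_t Target; // TO address
};

// Hypothetical container for one parsed line.
struct LBRSample {
  uint64_t LeadingIP = 0;
  std::vector<LBREntry> Entries;
};

// Parse a hex string (with or without a "0x" prefix); reject trailing junk.
static bool parseHex(const std::string &S, uint64_t &Value) {
  if (S.empty())
    return false;
  char *End = nullptr;
  Value = std::strtoull(S.c_str(), &End, 16);
  return End == S.c_str() + S.size();
}

// Parse a line such as
//   "40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1"
// Returns false if the line is malformed or carries no LBR entry.
bool parseLBRLine(const std::string &Line, LBRSample &Out) {
  assert(Out.Entries.empty() && "expects a fresh sample");
  std::istringstream IS(Line);
  std::string Token;
  bool First = true;
  while (IS >> Token) { // operator>> skips runs of whitespace
    size_t Slash1 = Token.find('/');
    if (Slash1 == std::string::npos) {
      // A token without '/' is only valid as the leading instruction pointer.
      if (!First || !parseHex(Token, Out.LeadingIP))
        return false;
      First = false;
      continue;
    }
    First = false;
    size_t Slash2 = Token.find('/', Slash1 + 1);
    if (Slash2 == std::string::npos)
      return false;
    LBREntry E;
    if (!parseHex(Token.substr(0, Slash1), E.Source) ||
        !parseHex(Token.substr(Slash1 + 1, Slash2 - Slash1 - 1), E.Target))
      return false;
    Out.Entries.push_back(E);
  }
  return !Out.Entries.empty();
}
```

Reading tokens with `operator>>` drops empty fields between the double spaces in the perf output, which is the same concern the patch addresses by passing `KeepEmpty = false` to `split`.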

Implementation notes:
 - Added a new child class, `LBRPerfReader`, for sample parsing. It reuses all the functionality in `extractLBRStack`, which is extended only to handle the leading instruction pointer.
 - `HybridSample` is reused (the call stack is simply left empty), and the parsed samples are still aggregated in `AggregatedSamples`. After that, range samples, branch samples and address samples are computed and recorded.
 - Reused `ContextSampleCounterMap` to store the raw profile. Since there is no need to aggregate by context, it registers just one sample counter under a fake context key.
 - Unified on `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile (in the diff below the option is named `skip-symbolization`); see the code comments for the format of the raw profile. For CS profiles, it still emits the unwinder output.

The profile generation part will follow in a later change.

Differential Revision: https://reviews.llvm.org/D108153

Added: 
    llvm/test/tools/llvm-profgen/Inputs/noprobe.aggperfscript
    llvm/test/tools/llvm-profgen/Inputs/noprobe.mmap.perfscript
    llvm/test/tools/llvm-profgen/Inputs/noprobe.perfbin
    llvm/test/tools/llvm-profgen/Inputs/noprobe.perfscript
    llvm/test/tools/llvm-profgen/noprobe.test

Modified: 
    llvm/test/tools/llvm-profgen/cs-interrupt.test
    llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
    llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
    llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
    llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
    llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
    llvm/tools/llvm-profgen/PerfReader.cpp
    llvm/tools/llvm-profgen/PerfReader.h
    llvm/tools/llvm-profgen/ProfileGenerator.cpp
    llvm/tools/llvm-profgen/ProfiledBinary.cpp
    llvm/tools/llvm-profgen/ProfiledBinary.h
    llvm/tools/llvm-profgen/llvm-profgen.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/test/tools/llvm-profgen/Inputs/noprobe.aggperfscript b/llvm/test/tools/llvm-profgen/Inputs/noprobe.aggperfscript
new file mode 100644
index 0000000000000..703f71e5b9262
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/noprobe.aggperfscript
@@ -0,0 +1,6 @@
+1
+           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/16  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/10  0x40062f/0x4005b0/P/-/-/14  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/7  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/15  0x400645/0x4005ff/P/-/-/1
+1
+           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/13  0x40062f/0x4005b0/P/-/-/9
+3
+           4005c8 0x4005c8/0x4005dc/P/-/-/11  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/8

diff  --git a/llvm/test/tools/llvm-profgen/Inputs/noprobe.mmap.perfscript b/llvm/test/tools/llvm-profgen/Inputs/noprobe.mmap.perfscript
new file mode 100644
index 0000000000000..69882da36f4e6
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/noprobe.mmap.perfscript
@@ -0,0 +1,4 @@
+PERF_RECORD_MMAP2 121161/121161: [0x400000(0x1000) @ 0 00:23 10094534 144120]: r-xp /home/noprobe.perfbin
+           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/16  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/10  0x40062f/0x4005b0/P/-/-/14  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/7  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/15  0x400645/0x4005ff/P/-/-/1
+           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/13  0x40062f/0x4005b0/P/-/-/9
+           4005c8 0x4005c8/0x4005dc/P/-/-/11  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/8

diff  --git a/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfbin b/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfbin
new file mode 100755
index 0000000000000..cdf6d04db83a0
Binary files /dev/null and b/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfbin differ

diff  --git a/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfscript b/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfscript
new file mode 100644
index 0000000000000..19bd4a1ee34dc
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/noprobe.perfscript
@@ -0,0 +1,3 @@
+           40062f 0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/16  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/6  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/10  0x40062f/0x4005b0/P/-/-/14  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/7  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/15  0x400645/0x4005ff/P/-/-/1
+           4005d7 0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/11  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/13  0x40062f/0x4005b0/P/-/-/9
+           4005c8 0x4005c8/0x4005dc/P/-/-/11  0x40062f/0x4005b0/P/-/-/8  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/5  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/7  0x40062f/0x4005b0/P/-/-/10  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/8  0x40062f/0x4005b0/P/-/-/9  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/1  0x4005d7/0x4005e5/P/-/-/12  0x40062f/0x4005b0/P/-/-/6  0x400645/0x4005ff/P/-/-/1  0x400637/0x400645/P/-/-/1  0x4005e9/0x400634/P/-/-/2  0x4005c8/0x4005dc/P/-/-/8  0x40062f/0x4005b0/P/-/-/8

diff  --git a/llvm/test/tools/llvm-profgen/cs-interrupt.test b/llvm/test/tools/llvm-profgen/cs-interrupt.test
index 70d27a874fe04..f0fdb95d61bc2 100644
--- a/llvm/test/tools/llvm-profgen/cs-interrupt.test
+++ b/llvm/test/tools/llvm-profgen/cs-interrupt.test
@@ -1,6 +1,9 @@
 ;; The test fails on Windows. Fix it before removing the following requirement.
 ; REQUIRES: x86_64-linux
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/cs-interrupt.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/cs-interrupt.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/cs-interrupt.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --profile-summary-cold-count=0
+>>>>>>> 02ea7084c370 ([llvm-profgen] Support LBR only perf script)
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:[main:1 @ foo]:88:0

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
index 91859badaacaf..ae7e6e5f932f2 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-noprobe.test
@@ -1,4 +1,6 @@
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-noprobe.perfscript --binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-noprobe.perfscript --binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+| FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-noprobe.perfscript --binary=%S/Inputs/inline-cs-noprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:[main:1 @ foo]:309:0

diff  --git a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
index d62c8c3acaad9..04faab2fda3e0 100644
--- a/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/inline-cs-pseudoprobe.test
@@ -1,4 +1,6 @@
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-cs-pseudoprobe.perfscript --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
 
 ; CHECK:     [main:2 @ foo]:74:0

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
index 492e3abd2e520..0e9d94cf3df1c 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-noprobe.test
@@ -1,10 +1,15 @@
 ;; The test fails on Windows. Fix it before removing the following requirement.
 ; REQUIRES: x86_64-linux
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.perfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.aggperfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-AGG-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.aggperfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-AGG-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-noprobe.aggperfscript --binary=%S/Inputs/noinline-cs-noprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-AGG
 
+
 ; CHECK-AGG:[main:1 @ foo]:108:0
 ; CHECK-AGG: 2: 6
 ; CHECK-AGG: 3: 6 bar:6

diff  --git a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
index 32da466e18536..314cb88080611 100644
--- a/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/noinline-cs-pseudoprobe.test
@@ -1,6 +1,10 @@
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.perfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.perfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.perfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.aggperfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.aggperfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noinline-cs-pseudoprobe.aggperfscript --binary=%S/Inputs/noinline-cs-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
 
 

diff  --git a/llvm/test/tools/llvm-profgen/noprobe.test b/llvm/test/tools/llvm-profgen/noprobe.test
new file mode 100644
index 0000000000000..fb705baffa1a4
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/noprobe.test
@@ -0,0 +1,63 @@
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noprobe.perfscript --binary=%S/Inputs/noprobe.perfbin --output=%t --skip-symbolization
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-RAW-PROFILE
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noprobe.mmap.perfscript --binary=%S/Inputs/noprobe.perfbin --output=%t --skip-symbolization
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-RAW-PROFILE
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/noprobe.aggperfscript --binary=%S/Inputs/noprobe.perfbin --output=%t --skip-symbolization
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-RAW-PROFILE-AGG
+
+CHECK-RAW-PROFILE:      7
+CHECK-RAW-PROFILE-NEXT: 5b0-5c8:7
+CHECK-RAW-PROFILE-NEXT: 5b0-5d7:13
+CHECK-RAW-PROFILE-NEXT: 5dc-5e9:6
+CHECK-RAW-PROFILE-NEXT: 5e5-5e9:12
+CHECK-RAW-PROFILE-NEXT: 5ff-62f:19
+CHECK-RAW-PROFILE-NEXT: 634-637:18
+CHECK-RAW-PROFILE-NEXT: 645-645:18
+CHECK-RAW-PROFILE-NEXT: 6
+CHECK-RAW-PROFILE-NEXT: 5c8->5dc:7
+CHECK-RAW-PROFILE-NEXT: 5d7->5e5:13
+CHECK-RAW-PROFILE-NEXT: 5e9->634:18
+CHECK-RAW-PROFILE-NEXT: 62f->5b0:21
+CHECK-RAW-PROFILE-NEXT: 637->645:18
+CHECK-RAW-PROFILE-NEXT: 645->5ff:19
+
+
+CHECK-RAW-PROFILE-AGG:      7
+CHECK-RAW-PROFILE-AGG-NEXT: 5b0-5c8:13
+CHECK-RAW-PROFILE-AGG-NEXT: 5b0-5d7:21
+CHECK-RAW-PROFILE-AGG-NEXT: 5dc-5e9:10
+CHECK-RAW-PROFILE-AGG-NEXT: 5e5-5e9:20
+CHECK-RAW-PROFILE-AGG-NEXT: 5ff-62f:31
+CHECK-RAW-PROFILE-AGG-NEXT: 634-637:30
+CHECK-RAW-PROFILE-AGG-NEXT: 645-645:30
+CHECK-RAW-PROFILE-AGG-NEXT: 6
+CHECK-RAW-PROFILE-AGG-NEXT: 5c8->5dc:13
+CHECK-RAW-PROFILE-AGG-NEXT: 5d7->5e5:21
+CHECK-RAW-PROFILE-AGG-NEXT: 5e9->634:30
+CHECK-RAW-PROFILE-AGG-NEXT: 62f->5b0:35
+CHECK-RAW-PROFILE-AGG-NEXT: 637->645:30
+CHECK-RAW-PROFILE-AGG-NEXT: 645->5ff:31
+
+
+; original code:
+; clang -O3 -g -debug-info-for-profiling test.c -o a.out
+#include <stdio.h>
+
+int bar(int x, int y) {
+  if (x % 3) {
+    return x - y;
+  }
+  return x + y;
+}
+
+void foo() {
+  int s, i = 0;
+  while (i++ < 4000 * 4000)
+    if (i % 91) s = bar(i, s); else s += 30;
+  printf("sum is %d\n", s);
+}
+
+int main() {
+  foo();
+  return 0;
+}

diff  --git a/llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test b/llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
index f7cc7fd971d70..72acb89bbb0f7 100644
--- a/llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
+++ b/llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
@@ -1,9 +1,13 @@
 ; Firstly test uncompression(--compress-recursion=0)
 ; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=0 --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t -check-prefix=CHECK-UNCOMPRESS
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
-; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe-nommap.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --show-unwinder-output --profile-summary-cold-count=0 | FileCheck %s --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe-nommap.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-cold-count=0
+; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
+; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe-nommap.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --profile-summary-cold-count=0
 ; RUN: FileCheck %s --input-file %t
 ; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=0 --profile-summary-cold-count=0 --csprof-max-context-depth=0
 ; RUN: FileCheck %s --input-file %t -check-prefix=CHECK-MAX-CTX-DEPTH

diff  --git a/llvm/tools/llvm-profgen/PerfReader.cpp b/llvm/tools/llvm-profgen/PerfReader.cpp
index a5c970e9b89f0..14402730cb8fc 100644
--- a/llvm/tools/llvm-profgen/PerfReader.cpp
+++ b/llvm/tools/llvm-profgen/PerfReader.cpp
@@ -7,18 +7,21 @@
 //===----------------------------------------------------------------------===//
 #include "PerfReader.h"
 #include "ProfileGenerator.h"
+#include "llvm/Support/FileSystem.h"
 
 static cl::opt<bool> ShowMmapEvents("show-mmap-events", cl::ReallyHidden,
                                     cl::init(false), cl::ZeroOrMore,
                                     cl::desc("Print binary load events."));
 
-static cl::opt<bool> ShowUnwinderOutput("show-unwinder-output",
-                                        cl::ReallyHidden, cl::init(false),
-                                        cl::ZeroOrMore,
-                                        cl::desc("Print unwinder output"));
+cl::opt<bool> SkipSymbolization("skip-symbolization", cl::ReallyHidden,
+                                cl::init(false), cl::ZeroOrMore,
+                                cl::desc("Dump the unsumbolized profile to the "
+                                         "output file. It will show unwinder "
+                                         "output for CS profile generation."));
 
 extern cl::opt<bool> ShowDisassemblyOnly;
 extern cl::opt<bool> ShowSourceLocations;
+extern cl::opt<std::string> OutputFilename;
 
 namespace llvm {
 namespace sampleprof {
@@ -190,9 +193,9 @@ void VirtualUnwinder::recordBranchCount(const LBREntry &Branch,
   }
 }
 
-bool VirtualUnwinder::unwind(const HybridSample *Sample, uint64_t Repeat) {
+bool VirtualUnwinder::unwind(const PerfSample *Sample, uint64_t Repeat) {
   // Capture initial state as starting point for unwinding.
-  UnwindState State(Sample);
+  UnwindState State(Sample, Binary);
 
   // Sanity check - making sure leaf of LBR aligns with leaf of stack sample
   // Stack sample sometimes can be unreliable, so filter out bogus ones.
@@ -250,8 +253,7 @@ PerfReaderBase::create(ProfiledBinary *Binary,
   if (PerfType == PERF_LBR_STACK) {
     PerfReader.reset(new HybridPerfReader(Binary));
   } else if (PerfType == PERF_LBR) {
-    // TODO:
-    exitWithError("Unsupported perfscript!");
+    PerfReader.reset(new LBRPerfReader(Binary));
   } else {
     exitWithError("Unsupported perfscript!");
   }
@@ -308,12 +310,13 @@ void PerfReaderBase::updateBinaryAddress(const MMapEvent &Event) {
 // Use ordered map to make the output deterministic
 using OrderedCounterForPrint = std::map<std::string, RangeSample>;
 
-static void printSampleCounter(OrderedCounterForPrint &OrderedCounter) {
+static void printSampleCounter(OrderedCounterForPrint &OrderedCounter,
+                               raw_fd_ostream &OS) {
   for (auto Range : OrderedCounter) {
-    outs() << Range.first << "\n";
+    OS << Range.first << "\n";
     for (auto I : Range.second) {
-      outs() << "  (" << format("%" PRIx64, I.first.first) << ", "
-             << format("%" PRIx64, I.first.second) << "): " << I.second << "\n";
+      OS << "  (" << format("%" PRIx64, I.first.first) << ", "
+         << format("%" PRIx64, I.first.second) << "): " << I.second << "\n";
     }
   }
 }
@@ -336,41 +339,43 @@ static std::string getContextKeyStr(ContextKey *K,
 }
 
 static void printRangeCounter(ContextSampleCounterMap &Counter,
-                              const ProfiledBinary *Binary) {
+                              const ProfiledBinary *Binary,
+                              raw_fd_ostream &OS) {
   OrderedCounterForPrint OrderedCounter;
   for (auto &CI : Counter) {
     OrderedCounter[getContextKeyStr(CI.first.getPtr(), Binary)] =
         CI.second.RangeCounter;
   }
-  printSampleCounter(OrderedCounter);
+  printSampleCounter(OrderedCounter, OS);
 }
 
 static void printBranchCounter(ContextSampleCounterMap &Counter,
-                               const ProfiledBinary *Binary) {
+                               const ProfiledBinary *Binary,
+                               raw_fd_ostream &OS) {
   OrderedCounterForPrint OrderedCounter;
   for (auto &CI : Counter) {
     OrderedCounter[getContextKeyStr(CI.first.getPtr(), Binary)] =
         CI.second.BranchCounter;
   }
-  printSampleCounter(OrderedCounter);
+  printSampleCounter(OrderedCounter, OS);
 }
 
-void HybridPerfReader::printUnwinderOutput() {
-    outs() << "Binary(" << Binary->getName().str() << ")'s Range Counter:\n";
-    printRangeCounter(SampleCounters, Binary);
-    outs() << "\nBinary(" << Binary->getName().str() << ")'s Branch Counter:\n";
-    printBranchCounter(SampleCounters, Binary);
+void HybridPerfReader::writeRawProfile(raw_fd_ostream &OS) {
+  OS << "Binary(" << Binary->getName().str() << ")'s Range Counter:\n";
+  printRangeCounter(SampleCounters, Binary, OS);
+  OS << "\nBinary(" << Binary->getName().str() << ")'s Branch Counter:\n";
+  printBranchCounter(SampleCounters, Binary, OS);
 }
 
 void HybridPerfReader::unwindSamples() {
   for (const auto &Item : AggregatedSamples) {
-    const HybridSample *Sample = dyn_cast<HybridSample>(Item.first.getPtr());
+    const PerfSample *Sample = Item.first.getPtr();
     VirtualUnwinder Unwinder(&SampleCounters, Binary);
     Unwinder.unwind(Sample, Item.second);
   }
 
-  if (ShowUnwinderOutput)
-    printUnwinderOutput();
+  if (SkipSymbolization)
+    PerfReaderBase::writeRawProfile(OutputFilename);
 }
 
 bool PerfReaderBase::extractLBRStack(TraceStream &TraceIt,
@@ -380,10 +385,9 @@ bool PerfReaderBase::extractLBRStack(TraceStream &TraceIt,
   //                           ... 0x4005c8/0x4005dc/P/-/-/0
   // It's in FIFO order and seperated by whitespace.
   SmallVector<StringRef, 32> Records;
-  TraceIt.getCurrentLine().split(Records, " ");
+  TraceIt.getCurrentLine().split(Records, " ", -1, false);
 
-  // Extract leading instruction pointer if present, use single
-  // list to pass out as reference.
+  // Skip the leading instruction pointer.
   size_t Index = 0;
   if (!Records.empty() && Records[0].find('/') == StringRef::npos) {
     Index = 1;
@@ -517,6 +521,15 @@ bool PerfReaderBase::extractCallstack(TraceStream &TraceIt,
          !Binary->addressInPrologEpilog(CallStack.front());
 }
 
+void PerfReaderBase::warnIfMissingMMap() {
+  if (!Binary->getMissingMMapWarned() && !Binary->getIsLoadedByMMap()) {
+    WithColor::warning() << "No relevant mmap event is matched, will use "
+                            "preferred address as the base loading address!\n";
+    // Avoid redundant warning, only warn at the first unmatched sample.
+    Binary->setMissingMMapWarned(true);
+  }
+}
+
 void HybridPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
   // The raw hybird sample started with call stack in FILO order and followed
   // intermediately by LBR sample
@@ -527,9 +540,9 @@ void HybridPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
   // 0x4005c8/0x4005dc/P/-/-/0   0x40062f/0x4005b0/P/-/-/0 ...
   //          ... 0x4005c8/0x4005dc/P/-/-/0    # LBR Entries
   //
-  std::shared_ptr<HybridSample> Sample = std::make_shared<HybridSample>(Binary);
+  std::shared_ptr<PerfSample> Sample = std::make_shared<PerfSample>();
 
-  // Parsing call stack and populate into HybridSample.CallStack
+  // Parsing call stack and populate into PerfSample.CallStack
   if (!extractCallstack(TraceIt, Sample->CallStack)) {
     // Skip the next LBR line matched current call stack
     if (!TraceIt.isAtEoF() && TraceIt.getCurrentLine().startswith(" 0x"))
@@ -537,21 +550,15 @@ void HybridPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
     return;
   }
 
-  if (!Binary->getMissingMMapWarned() && !Binary->getIsLoadedByMMap()) {
-    WithColor::warning() << "No relevant mmap event is matched, will use "
-                            "preferred address as the base loading address!\n";
-    // Avoid redundant warning, only warn at the first unmatched sample.
-    Binary->setMissingMMapWarned(true);
-  }
+  warnIfMissingMMap();
 
   if (!TraceIt.isAtEoF() && TraceIt.getCurrentLine().startswith(" 0x")) {
-    // Parsing LBR stack and populate into HybridSample.LBRStack
+    // Parsing LBR stack and populate into PerfSample.LBRStack
     if (extractLBRStack(TraceIt, Sample->LBRStack)) {
       // Canonicalize stack leaf to avoid 'random' IP from leaf frame skew LBR
       // ranges
       Sample->CallStack.front() = Sample->LBRStack[0].Target;
       // Record samples by aggregation
-      Sample->genHashCode();
       AggregatedSamples[Hashable<PerfSample>(Sample)] += Count;
     }
   } else {
@@ -560,6 +567,88 @@ void HybridPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
   }
 }
 
+void PerfReaderBase::writeRawProfile(StringRef Filename) {
+  std::error_code EC;
+  raw_fd_ostream OS(Filename, EC, llvm::sys::fs::OF_TextWithCRLF);
+  if (EC)
+    exitWithError(EC, Filename);
+  writeRawProfile(OS);
+}
+
+void LBRPerfReader::writeRawProfile(raw_fd_ostream &OS) {
+  /*
+     Format:
+     number of entries in RangeCounter
+     from_1-to_1:count_1
+     from_2-to_2:count_2
+     ......
+     from_n-to_n:count_n
+     number of entries in BranchCounter
+     src_1->dst_1:count_1
+     src_2->dst_2:count_2
+     ......
+     src_n->dst_n:count_n
+  */
+
+  SampleCounter &Counter = SampleCounters.begin()->second;
+  OS << Counter.RangeCounter.size() << "\n";
+  for (auto I : Counter.RangeCounter) {
+    OS << Twine::utohexstr(I.first.first) << "-"
+       << Twine::utohexstr(I.first.second) << ":" << I.second << "\n";
+  }
+
+  OS << Counter.BranchCounter.size() << "\n";
+  for (auto I : Counter.BranchCounter) {
+    OS << Twine::utohexstr(I.first.first) << "->"
+       << Twine::utohexstr(I.first.second) << ":" << I.second << "\n";
+  }
+}
+
+void LBRPerfReader::computeCounterFromLBR(const PerfSample *Sample,
+                                          uint64_t Repeat) {
+  SampleCounter &Counter = SampleCounters.begin()->second;
+  uint64_t EndOffeset = 0;
+  for (const LBREntry &LBR : Sample->LBRStack) {
+    uint64_t SourceOffset = Binary->virtualAddrToOffset(LBR.Source);
+    uint64_t TargetOffset = Binary->virtualAddrToOffset(LBR.Target);
+
+    if (!LBR.IsArtificial) {
+      Counter.recordBranchCount(SourceOffset, TargetOffset, Repeat);
+    }
+
+    // If this is not the first LBR, update the range count between the TO of
+    // the current LBR and the FROM of the next LBR.
+    uint64_t StartOffset = TargetOffset;
+    if (EndOffeset != 0) {
+      assert(StartOffset <= EndOffeset &&
+             "Bogus range should be filtered earlier!");
+      Counter.recordRangeCount(StartOffset, EndOffeset, Repeat);
+    }
+    EndOffeset = SourceOffset;
+  }
+}
+
+void LBRPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
+  std::shared_ptr<PerfSample> Sample = std::make_shared<PerfSample>();
+  // Parsing LBR stack and populate into PerfSample.LBRStack
+  if (extractLBRStack(TraceIt, Sample->LBRStack)) {
+    warnIfMissingMMap();
+    // Record LBR only samples by aggregation
+    AggregatedSamples[Hashable<PerfSample>(Sample)] += Count;
+  }
+}
+
+void LBRPerfReader::generateRawProfile() {
+  assert(SampleCounters.size() == 1 && "Must have one entry of sample counter");
+  for (const auto &Item : AggregatedSamples) {
+    const PerfSample *Sample = Item.first.getPtr();
+    computeCounterFromLBR(Sample, Item.second);
+  }
+
+  if (SkipSymbolization)
+    PerfReaderBase::writeRawProfile(OutputFilename);
+}
+
 uint64_t PerfReaderBase::parseAggregatedCount(TraceStream &TraceIt) {
   // The aggregated count is optional, so do not skip the line and return 1 if
   // it's unmatched
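
Editorial sketch (not part of the commit): the counting done by `LBRPerfReader::computeCounterFromLBR` above can be illustrated with standard containers only. Everything named here (`LBREntry` as a two-field struct, `Counters`, `countFromLBR`) is a hypothetical stand-in, not the llvm-profgen API. The sketch assumes addresses are already offsets, ignores the `IsArtificial` flag, and silently skips a bogus range where the original asserts.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm-profgen's LBREntry (FROM/TO pair only).
struct LBREntry {
  uint64_t Source;
  uint64_t Target;
};

// Hypothetical stand-in for SampleCounter: each key is an address pair,
// each value is the accumulated sample count for that pair.
struct Counters {
  std::map<std::pair<uint64_t, uint64_t>, uint64_t> RangeCounter;
  std::map<std::pair<uint64_t, uint64_t>, uint64_t> BranchCounter;
};

// Walk one LBR stack in the order its entries appear on the perf script
// line. Repeat is how many times this identical sample was aggregated.
inline void countFromLBR(const std::vector<LBREntry> &Stack, uint64_t Repeat,
                         Counters &C) {
  uint64_t End = 0; // FROM address of the entry visited just before this one
  for (const LBREntry &LBR : Stack) {
    // Every entry contributes one branch (FROM -> TO).
    C.BranchCounter[{LBR.Source, LBR.Target}] += Repeat;
    // From the second entry on, the code between this entry's TO and the
    // previously visited entry's FROM is counted as a linear range.
    if (End != 0 && LBR.Target <= End)
      C.RangeCounter[{LBR.Target, End}] += Repeat;
    End = LBR.Source;
  }
}
```

Fed the first three entries of the example line in the commit log, this records three branches and two ranges, the first range being `4005ff-40062f`.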

diff  --git a/llvm/tools/llvm-profgen/PerfReader.h b/llvm/tools/llvm-profgen/PerfReader.h
index 06fe2bae53d69..41ab200543514 100644
--- a/llvm/tools/llvm-profgen/PerfReader.h
+++ b/llvm/tools/llvm-profgen/PerfReader.h
@@ -121,8 +121,9 @@ template <class T> class Hashable {
   struct Hash {
     uint64_t operator()(const Hashable<T> &Key) const {
       // Don't make it virtual for getHashCode
-      assert(Key.Data->getHashCode() && "Should generate HashCode for it!");
-      return Key.Data->getHashCode();
+      uint64_t Hash = Key.Data->getHashCode();
+      assert(Hash && "Should generate HashCode for it!");
+      return Hash;
     }
   };
 
@@ -137,42 +138,31 @@ template <class T> class Hashable {
   T *getPtr() const { return Data.get(); }
 };
 
-// Base class to extend for all types of perf sample
 struct PerfSample {
-  uint64_t HashCode = 0;
-
-  virtual ~PerfSample() = default;
-  uint64_t getHashCode() const { return HashCode; }
-  virtual bool isEqual(const PerfSample *K) const {
-    return HashCode == K->HashCode;
-  };
-
-  // Utilities for LLVM-style RTTI
-  enum PerfKind { PK_HybridSample };
-  const PerfKind Kind;
-  PerfKind getKind() const { return Kind; }
-  PerfSample(PerfKind K) : Kind(K){};
-};
-
-// The parsed hybrid sample including call stack and LBR stack.
-struct HybridSample : public PerfSample {
-  // Profiled binary that current frame address belongs to
-  ProfiledBinary *Binary;
-  // Call stack recorded in FILO(leaf to root) order
-  SmallVector<uint64_t, 16> CallStack;
-  // LBR stack recorded in FIFO order
+  // LBR stack recorded in FIFO order.
   SmallVector<LBREntry, 16> LBRStack;
+  // Call stack recorded in FILO(leaf to root) order, it's used for CS-profile
+  // generation
+  SmallVector<uint64_t, 16> CallStack;
 
-  HybridSample(ProfiledBinary *B) : PerfSample(PK_HybridSample), Binary(B){};
-  static bool classof(const PerfSample *K) {
-    return K->getKind() == PK_HybridSample;
+  virtual ~PerfSample() = default;
+  uint64_t getHashCode() const {
+    // Use simple DJB2 hash
+    auto HashCombine = [](uint64_t H, uint64_t V) {
+      return ((H << 5) + H) + V;
+    };
+    uint64_t Hash = 5381;
+    for (const auto &Value : CallStack) {
+      Hash = HashCombine(Hash, Value);
+    }
+    for (const auto &Entry : LBRStack) {
+      Hash = HashCombine(Hash, Entry.Source);
+      Hash = HashCombine(Hash, Entry.Target);
+    }
+    return Hash;
   }
 
-  // Used for sample aggregation
-  bool isEqual(const PerfSample *K) const override {
-    const HybridSample *Other = dyn_cast<HybridSample>(K);
-    if (Other->Binary != Binary)
-      return false;
+  bool isEqual(const PerfSample *Other) const {
     const SmallVector<uint64_t, 16> &OtherCallStack = Other->CallStack;
     const SmallVector<LBREntry, 16> &OtherLBRStack = Other->LBRStack;
 
@@ -180,11 +170,8 @@ struct HybridSample : public PerfSample {
         LBRStack.size() != OtherLBRStack.size())
       return false;
 
-    auto Iter = CallStack.begin();
-    for (auto Address : OtherCallStack) {
-      if (Address != *Iter++)
-        return false;
-    }
+    if (!std::equal(CallStack.begin(), CallStack.end(), OtherCallStack.begin()))
+      return false;
 
     for (size_t I = 0; I < OtherLBRStack.size(); I++) {
       if (LBRStack[I].Source != OtherLBRStack[I].Source ||
@@ -194,23 +181,6 @@ struct HybridSample : public PerfSample {
     return true;
   }
 
-  void genHashCode() {
-    // Use simple DJB2 hash
-    auto HashCombine = [](uint64_t H, uint64_t V) {
-      return ((H << 5) + H) + V;
-    };
-    uint64_t Hash = 5381;
-    Hash = HashCombine(Hash, reinterpret_cast<uint64_t>(Binary));
-    for (const auto &Value : CallStack) {
-      Hash = HashCombine(Hash, Value);
-    }
-    for (const auto &Entry : LBRStack) {
-      Hash = HashCombine(Hash, Entry.Source);
-      Hash = HashCombine(Hash, Entry.Target);
-    }
-    HashCode = Hash;
-  }
-
 #ifndef NDEBUG
   void print() const {
     dbgs() << "LBR stack\n";
@@ -220,7 +190,6 @@ struct HybridSample : public PerfSample {
   }
 #endif
 };
-
 // After parsing the sample, we record the samples by aggregating them
 // into this counter. The key stores the sample data and the value is
 // the sample repeat times.
@@ -265,13 +234,13 @@ struct UnwindState {
   ProfiledFrame *CurrentLeafFrame;
   // Used to fall through the LBR stack
   uint32_t LBRIndex = 0;
-  // Reference to HybridSample.LBRStack
+  // Reference to PerfSample.LBRStack
   const SmallVector<LBREntry, 16> &LBRStack;
   // Used to iterate the address range
   InstructionPointer InstPtr;
-  UnwindState(const HybridSample *Sample)
-      : Binary(Sample->Binary), LBRStack(Sample->LBRStack),
-        InstPtr(Sample->Binary, Sample->CallStack.front()) {
+  UnwindState(const PerfSample *Sample, const ProfiledBinary *Binary)
+      : Binary(Binary), LBRStack(Sample->LBRStack),
+        InstPtr(Binary, Sample->CallStack.front()) {
     initFrameTrie(Sample->CallStack);
   }
 
@@ -494,7 +463,7 @@ class VirtualUnwinder {
 public:
   VirtualUnwinder(ContextSampleCounterMap *Counter, const ProfiledBinary *B)
       : CtxCounterMap(Counter), Binary(B) {}
-  bool unwind(const HybridSample *Sample, uint64_t Repeat);
+  bool unwind(const PerfSample *Sample, uint64_t Repeat);
 
 private:
   bool isCallState(UnwindState &State) const {
@@ -543,13 +512,16 @@ class PerfReaderBase {
   create(ProfiledBinary *Binary, cl::list<std::string> &PerfTraceFilenames);
 
   // A LBR sample is like:
-  // 0x5c6313f/0x5c63170/P/-/-/0  0x5c630e7/0x5c63130/P/-/-/0 ...
+  // 40062f 0x5c6313f/0x5c63170/P/-/-/0  0x5c630e7/0x5c63130/P/-/-/0 ...
   // A heuristic for fast detection by checking whether a
   // leading "  0x" and the '/' exist.
   static bool isLBRSample(StringRef Line) {
-    if (!Line.startswith(" 0x"))
+    // Skip the leading instruction pointer
+    SmallVector<StringRef, 32> Records;
+    Line.trim().split(Records, " ", 2, false);
+    if (Records.size() < 2)
       return false;
-    if (Line.find('/') != StringRef::npos)
+    if (Records[1].startswith("0x") && Records[1].find('/') != StringRef::npos)
       return true;
     return false;
   }
@@ -568,6 +540,11 @@ class PerfReaderBase {
     TraceStream TraceIt(FileName);
     uint64_t FrameAddr = 0;
     while (!TraceIt.isAtEoF()) {
+      // Skip the aggregated count
+      if (!TraceIt.getCurrentLine().getAsInteger(10, FrameAddr))
+        TraceIt.advance();
+
+      // Detect sample with call stack
       int32_t Count = 0;
       while (!TraceIt.isAtEoF() &&
              !TraceIt.getCurrentLine().ltrim().getAsInteger(16, FrameAddr)) {
@@ -615,6 +592,8 @@ class PerfReaderBase {
   void parseAndAggregateTrace(StringRef Filename);
   // Parse either an MMAP event or a perf sample
   void parseEventOrSample(TraceStream &TraceIt);
+  // Warn if the relevant mmap event is missing.
+  void warnIfMissingMMap();
   // Extract call stack from the perf trace lines
   bool extractCallstack(TraceStream &TraceIt,
                         SmallVectorImpl<uint64_t> &CallStack);
@@ -631,6 +610,8 @@ class PerfReaderBase {
   // Post process the profile after trace aggregation, we will do simple range
   // overlap computation for AutoFDO, or unwind for CSSPGO(hybrid sample).
   virtual void generateRawProfile() = 0;
+  virtual void writeRawProfile(StringRef Filename);
+  virtual void writeRawProfile(raw_fd_ostream &OS) = 0;
 
   ProfiledBinary *Binary = nullptr;
 
@@ -661,7 +642,34 @@ class HybridPerfReader : public PerfReaderBase {
 private:
   // Unwind the hybrid samples after aggregration
   void unwindSamples();
-  void printUnwinderOutput();
+  void writeRawProfile(raw_fd_ostream &OS) override;
+};
+
+/*
+  The reader of LBR only perf script.
+  A typical LBR sample is like:
+    40062f 0x4005c8/0x4005dc/P/-/-/0   0x40062f/0x4005b0/P/-/-/0 ...
+          ... 0x4005c8/0x4005dc/P/-/-/0
+*/
+class LBRPerfReader : public PerfReaderBase {
+public:
+  LBRPerfReader(ProfiledBinary *Binary) : PerfReaderBase(Binary) {
+    // There is no context for LBR only sample, so initialize one entry with
+    // fake "empty" context key.
+    std::shared_ptr<StringBasedCtxKey> Key =
+        std::make_shared<StringBasedCtxKey>();
+    Key->genHashCode();
+    SampleCounters.emplace(Hashable<ContextKey>(Key), SampleCounter());
+    PerfType = PERF_LBR;
+  };
+
+  // Parse the LBR only sample.
+  void parseSample(TraceStream &TraceIt, uint64_t Count) override;
+  void generateRawProfile() override;
+
+private:
+  void computeCounterFromLBR(const PerfSample *Sample, uint64_t Repeat);
+  void writeRawProfile(raw_fd_ostream &OS) override;
 };
 
 } // end namespace sampleprof
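
Editorial sketch (not part of the commit): with `genHashCode()` removed, `PerfSample` now computes its DJB2-style hash on demand from the call stack and the LBR stack, and identical samples are merged by that hash plus `isEqual`. The snippet below shows the same idea with `std::unordered_map`. `Sample`, `SampleHash` and `AggregatedCounter` are hypothetical names, and the sample is stored by value here rather than behind the `Hashable<std::shared_ptr<...>>` wrapper the tool uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for PerfSample.
struct Sample {
  std::vector<uint64_t> CallStack; // left empty for LBR-only samples
  std::vector<std::pair<uint64_t, uint64_t>> LBRStack; // (source, target)

  // Two samples are equal when both stacks match element by element.
  bool operator==(const Sample &Other) const {
    return CallStack == Other.CallStack && LBRStack == Other.LBRStack;
  }
};

// DJB2-style combine: H = H * 33 + V, seeded with 5381.
struct SampleHash {
  std::size_t operator()(const Sample &S) const {
    uint64_t H = 5381;
    auto Combine = [&H](uint64_t V) { H = ((H << 5) + H) + V; };
    for (uint64_t Address : S.CallStack)
      Combine(Address);
    for (const auto &Entry : S.LBRStack) {
      Combine(Entry.first);
      Combine(Entry.second);
    }
    return static_cast<std::size_t>(H);
  }
};

// Key: the parsed sample. Value: how many times it was seen.
using AggregatedCounter = std::unordered_map<Sample, uint64_t, SampleHash>;
```

Because the hash no longer mixes in the `ProfiledBinary` pointer, two samples with the same stacks always land in the same bucket, which matches the reader now handling a single binary per run.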

diff  --git a/llvm/tools/llvm-profgen/ProfileGenerator.cpp b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
index 139d12dc71c4c..6ea827cb1dab5 100644
--- a/llvm/tools/llvm-profgen/ProfileGenerator.cpp
+++ b/llvm/tools/llvm-profgen/ProfileGenerator.cpp
@@ -11,9 +11,9 @@
 #include "llvm/ProfileData/ProfileCommon.h"
 #include <unordered_set>
 
-static cl::opt<std::string> OutputFilename("output", cl::value_desc("output"),
-                                           cl::Required,
-                                           cl::desc("Output profile file"));
+cl::opt<std::string> OutputFilename("output", cl::value_desc("output"),
+                                    cl::Required,
+                                    cl::desc("Output profile file"));
 static cl::alias OutputA("o", cl::desc("Alias for --output"),
                          cl::aliasopt(OutputFilename));
 

diff  --git a/llvm/tools/llvm-profgen/ProfiledBinary.cpp b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
index b7ade7a543969..9d17d78582371 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.cpp
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
@@ -545,8 +545,8 @@ SampleContextFrameVector ProfiledBinary::symbolize(const InstructionPointer &IP,
   return CallStack;
 }
 
-InstructionPointer::InstructionPointer(ProfiledBinary *Binary, uint64_t Address,
-                                       bool RoundToNext)
+InstructionPointer::InstructionPointer(const ProfiledBinary *Binary,
+                                       uint64_t Address, bool RoundToNext)
     : Binary(Binary), Address(Address) {
   Index = Binary->getIndexForAddr(Address);
   if (RoundToNext) {

diff  --git a/llvm/tools/llvm-profgen/ProfiledBinary.h b/llvm/tools/llvm-profgen/ProfiledBinary.h
index 17b80d84f61ee..4a3bb561f488e 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.h
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.h
@@ -51,7 +51,7 @@ namespace sampleprof {
 class ProfiledBinary;
 
 struct InstructionPointer {
-  ProfiledBinary *Binary;
+  const ProfiledBinary *Binary;
   union {
     // Offset of the executable segment of the binary.
     uint64_t Offset = 0;
@@ -60,7 +60,7 @@ struct InstructionPointer {
   };
   // Index to the sorted code address array of the binary.
   uint64_t Index = 0;
-  InstructionPointer(ProfiledBinary *Binary, uint64_t Address,
+  InstructionPointer(const ProfiledBinary *Binary, uint64_t Address,
                      bool RoundToNext = false);
   void advance();
   void backward();

diff  --git a/llvm/tools/llvm-profgen/llvm-profgen.cpp b/llvm/tools/llvm-profgen/llvm-profgen.cpp
index 4bc2ea8fc3ad5..f4e063ff60198 100644
--- a/llvm/tools/llvm-profgen/llvm-profgen.cpp
+++ b/llvm/tools/llvm-profgen/llvm-profgen.cpp
@@ -35,6 +35,7 @@ static cl::opt<std::string> BinaryPath(
 
 extern cl::opt<bool> ShowDisassemblyOnly;
 extern cl::opt<bool> ShowSourceLocations;
+extern cl::opt<bool> SkipSymbolization;
 
 using namespace llvm;
 using namespace sampleprof;
@@ -89,6 +90,15 @@ int main(int argc, const char *argv[]) {
       PerfReaderBase::create(Binary.get(), PerfTraceFilenames);
   Reader->parsePerfTraces(PerfTraceFilenames);
 
+  if (SkipSymbolization)
+    return EXIT_SUCCESS;
+
+  // TBD
+  if (Reader->getPerfScriptType() == PERF_LBR) {
+    WithColor::warning() << "Currently LBR only perf script is not supported!";
+    return EXIT_SUCCESS;
+  }
+
   std::unique_ptr<ProfileGenerator> Generator = ProfileGenerator::create(
       Binary.get(), Reader->getSampleCounters(), Reader->getPerfScriptType());
   Generator->generateProfile();
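
Editorial sketch (not part of the commit): the reworked `isLBRSample` heuristic in PerfReader.h skips one leading token and then checks that the next token looks like an LBR entry. A rough standard-library equivalent follows. `looksLikeLBRSample` is a hypothetical name, and unlike the `StringRef::split(" ")` call in the patch it treats any whitespace (including tabs) as a separator.

```cpp
#include <sstream>
#include <string>

// Returns true when the second whitespace-separated token of Line starts
// with "0x" and contains '/', e.g. "40062f 0x40062f/0x4005b0/P/-/-/9 ...".
inline bool looksLikeLBRSample(const std::string &Line) {
  std::istringstream In(Line);
  std::string Leading, First;
  // Need at least two tokens: the leading token and one candidate entry.
  if (!(In >> Leading >> First))
    return false;
  return First.compare(0, 2, "0x") == 0 &&
         First.find('/') != std::string::npos;
}
```

Note that the leading token itself is not validated, so a hybrid-style line that begins directly with two LBR entries also passes, as it does in the patched heuristic.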


        

