[clang] ab3430f - [Profile] Add binary profile correlation for code coverage. (#69493)

via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 14 11:16:42 PST 2023


Author: Zequan Wu
Date: 2023-12-14T14:16:38-05:00
New Revision: ab3430f891cf508e2b5c4796789998561d543df4

URL: https://github.com/llvm/llvm-project/commit/ab3430f891cf508e2b5c4796789998561d543df4
DIFF: https://github.com/llvm/llvm-project/commit/ab3430f891cf508e2b5c4796789998561d543df4.diff

LOG: [Profile] Add binary profile correlation for code coverage. (#69493)

## Motivation
Since we don't need the metadata sections at runtime, we can offload
them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF, and it would be hard to add such
artificial debug info for every function to CodeView, which is used on
Windows. So, offloading profile metadata sections at runtime seems to be
a platform-independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contain only headers + counters.
llvm-profdata can be used to correlate raw profiles with the unstripped
binary to generate an indexed profile.
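
To make the non-SHF_ALLOC point concrete, here is a minimal sketch of how a tool could decide which sections occupy memory at runtime. The `Section` struct, the helper names, and the example flag values are hypothetical and only illustrate the idea; the section names are the ones this patch introduces for ELF, and `SHF_WRITE`/`SHF_ALLOC` use their values from the ELF specification.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// ELF section flag bits, as defined by the ELF specification.
constexpr std::uint64_t kShfWrite = 0x1;
constexpr std::uint64_t kShfAlloc = 0x2;

// Hypothetical, simplified view of a section header (illustration only).
struct Section {
  std::string name;
  std::uint64_t flags;
};

// A section is mapped into memory at runtime only if SHF_ALLOC is set.
bool isLoadedAtRuntime(const Section &s) {
  return (s.flags & kShfAlloc) != 0;
}

// Collect the names of sections that are never loaded, i.e. the ones that
// could be stripped after linking without affecting the running process.
std::vector<std::string>
offloadableSections(const std::vector<Section> &sections) {
  std::vector<std::string> names;
  for (const Section &s : sections)
    if (!isLoadedAtRuntime(s))
      names.push_back(s.name);
  return names;
}

// Assumed layout of a binary built with binary correlation: counters stay
// allocated and writable, while the data and name sections are metadata.
std::vector<Section> exampleCorrelatedBinary() {
  return {
      {"__llvm_prf_cnts", kShfAlloc | kShfWrite},
      {"__llvm_covdata", 0},
      {"__llvm_covnames", 0},
  };
}
```

Under these assumptions, `offloadableSections(exampleCorrelatedBinary())` yields the data and name sections, which is why only headers and counters end up in the raw profile.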

## Data
For Chromium base_unittests with code coverage on Linux, the binary size
overhead due to instrumentation is reduced from 64M to 38.8M (39.4%) and
the raw profile file size is reduced from 128M to 68M (46.9%):
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```
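
The percentages quoted above are relative reductions, (before - after) / before. A small sketch of that arithmetic follows; the function names are illustrative and not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction in percent: how much smaller `after` is than `before`.
double reductionPercent(double before, double after) {
  assert(before > 0.0 && "baseline size must be positive");
  return (before - after) / before * 100.0;
}

// Round to one decimal place, matching the precision used in the text.
double roundToTenth(double value) {
  return std::round(value * 10.0) / 10.0;
}
```

With the numbers from the measurements, `roundToTenth(reductionPercent(64.0, 38.8))` gives 39.4 for the binary size overhead and `roundToTenth(reductionPercent(128.0, 68.0))` gives 46.9 for the raw profile size.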

A few things to note:
1. llvm-profdata doesn't support filtering raw profiles by binary id yet,
so when a raw profile doesn't belong to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to merge only raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get the binary ids of all raw profiles, then
selectively pass the raw profiles with a matching binary id, along with
the binary, to llvm-profdata for merging.
2. In COFF, the new sections are currently still loaded into memory but
not used. I didn't change that in this patch because I noticed that
`.lcovmap` and `.lcovfun` are also loaded into memory. A separate patch
will address it.
3. This should work with PGO when value profiling is disabled, as debug
info correlation currently does, though I haven't tested this yet.
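
The selection step described in note 1 could look roughly like the sketch below. It is not part of this patch: the `RawProfile` struct and `selectMatchingProfiles` are hypothetical, and the sketch assumes the binary ids have already been collected by a separate script.

```cpp
#include <string>
#include <vector>

// Hypothetical record pairing a raw profile path with the binary id that a
// script extracted from it. An empty id means the profile has none.
struct RawProfile {
  std::string path;
  std::string binaryId;
};

// Keep only the profiles whose binary id matches the binary being used for
// correlation. Profiles with a missing or mismatched id are discarded.
std::vector<std::string>
selectMatchingProfiles(const std::vector<RawProfile> &profiles,
                       const std::string &binaryId) {
  std::vector<std::string> selected;
  for (const RawProfile &p : profiles)
    if (!p.binaryId.empty() && p.binaryId == binaryId)
      selected.push_back(p.path);
  return selected;
}
```

The returned paths would then be passed to llvm-profdata together with the unstripped binary, so that merging never sees a profile from a different binary.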

Added: 
    compiler-rt/test/profile/instrprof-binary-correlate.c

Modified: 
    clang/lib/CodeGen/BackendUtil.cpp
    compiler-rt/include/profile/InstrProfData.inc
    compiler-rt/lib/profile/InstrProfilingPlatformWindows.c
    compiler-rt/test/CMakeLists.txt
    llvm/docs/CommandGuide/llvm-profdata.rst
    llvm/include/llvm/ProfileData/InstrProf.h
    llvm/include/llvm/ProfileData/InstrProfCorrelator.h
    llvm/include/llvm/ProfileData/InstrProfData.inc
    llvm/include/llvm/ProfileData/InstrProfReader.h
    llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
    llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
    llvm/lib/ProfileData/InstrProf.cpp
    llvm/lib/ProfileData/InstrProfCorrelator.cpp
    llvm/lib/ProfileData/InstrProfReader.cpp
    llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
    llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
    llvm/test/Instrumentation/InstrProfiling/coverage.ll
    llvm/tools/llvm-profdata/llvm-profdata.cpp

Removed: 
    


################################################################################
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 77455c075cab0d..7d16de33763a0d 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -45,6 +45,7 @@
 #include "llvm/Passes/PassBuilder.h"
 #include "llvm/Passes/PassPlugin.h"
 #include "llvm/Passes/StandardInstrumentations.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/Support/BuryPointer.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/MemoryBuffer.h"
@@ -102,17 +103,21 @@ static cl::opt<bool> ClSanitizeOnOptimizerEarlyEP(
     "sanitizer-early-opt-ep", cl::Optional,
     cl::desc("Insert sanitizers on OptimizerEarlyEP."), cl::init(false));
 
+extern cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;
+
 // Re-link builtin bitcodes after optimization
 cl::opt<bool> ClRelinkBuiltinBitcodePostop(
     "relink-builtin-bitcode-postop", cl::Optional,
     cl::desc("Re-link builtin bitcodes after optimization."), cl::init(false));
-}
+} // namespace llvm
 
 namespace {
 
 // Default filename used for profile generation.
 std::string getDefaultProfileGenName() {
-  return DebugInfoCorrelate ? "default_%m.proflite" : "default_%m.profraw";
+  return DebugInfoCorrelate || ProfileCorrelate != InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
 }
 
 class EmitAssemblyHelper {
@@ -204,7 +209,7 @@ class EmitAssemblyHelper {
   void EmitAssembly(BackendAction Action, std::unique_ptr<raw_pwrite_stream> OS,
                     BackendConsumer *BC);
 };
-}
+} // namespace
 
 static SanitizerCoverageOptions
 getSancovOptsFromCGOpts(const CodeGenOptions &CGOpts) {

diff --git a/compiler-rt/include/profile/InstrProfData.inc b/compiler-rt/include/profile/InstrProfData.inc
index 44a449800923fd..f5de23ff4b94d9 100644
--- a/compiler-rt/include/profile/InstrProfData.inc
+++ b/compiler-rt/include/profile/InstrProfData.inc
@@ -295,6 +295,12 @@ INSTR_PROF_SECT_ENTRY(IPSK_covfun, \
 INSTR_PROF_SECT_ENTRY(IPSK_orderfile, \
                       INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COMMON), \
                       INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COFF), "__DATA,")
+INSTR_PROF_SECT_ENTRY(IPSK_covdata, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_COVDATA_COMMON), \
+                      INSTR_PROF_COVDATA_COFF, "__LLVM_COV,")
+INSTR_PROF_SECT_ENTRY(IPSK_covname, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_COVNAME_COMMON), \
+                      INSTR_PROF_COVNAME_COFF, "__LLVM_COV,")
 
 #undef INSTR_PROF_SECT_ENTRY
 #endif
@@ -701,6 +707,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_COMMON __llvm_prf_vnds
 #define INSTR_PROF_COVMAP_COMMON __llvm_covmap
 #define INSTR_PROF_COVFUN_COMMON __llvm_covfun
+#define INSTR_PROF_COVDATA_COMMON __llvm_covdata
+#define INSTR_PROF_COVNAME_COMMON __llvm_covnames
 #define INSTR_PROF_ORDERFILE_COMMON __llvm_orderfile
 /* Windows section names. Because these section names contain dollar characters,
  * they must be quoted.
@@ -713,6 +721,11 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_COFF ".lprfnd$M"
 #define INSTR_PROF_COVMAP_COFF ".lcovmap$M"
 #define INSTR_PROF_COVFUN_COFF ".lcovfun$M"
+/* Since cov data and cov names sections are not allocated, we don't need to
+ * access them at runtime.
+ */
+#define INSTR_PROF_COVDATA_COFF ".lcovd"
+#define INSTR_PROF_COVNAME_COFF ".lcovn"
 #define INSTR_PROF_ORDERFILE_COFF ".lorderfile$M"
 
 #ifdef _WIN32
@@ -729,6 +742,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_SECT_NAME INSTR_PROF_VNODES_COFF
 #define INSTR_PROF_COVMAP_SECT_NAME INSTR_PROF_COVMAP_COFF
 #define INSTR_PROF_COVFUN_SECT_NAME INSTR_PROF_COVFUN_COFF
+#define INSTR_PROF_COVDATA_SECT_NAME INSTR_PROF_COVDATA_COFF
+#define INSTR_PROF_COVNAME_SECT_NAME INSTR_PROF_COVNAME_COFF
 #define INSTR_PROF_ORDERFILE_SECT_NAME INSTR_PROF_ORDERFILE_COFF
 #else
 /* Runtime section names and name strings.  */
@@ -744,6 +759,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_VNODES_COMMON)
 #define INSTR_PROF_COVMAP_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVMAP_COMMON)
 #define INSTR_PROF_COVFUN_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVFUN_COMMON)
+#define INSTR_PROF_COVDATA_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVDATA_COMMON)
+#define INSTR_PROF_COVNAME_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVNAME_COMMON)
 /* Order file instrumentation. */
 #define INSTR_PROF_ORDERFILE_SECT_NAME                                         \
   INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COMMON)

diff --git a/compiler-rt/lib/profile/InstrProfilingPlatformWindows.c b/compiler-rt/lib/profile/InstrProfilingPlatformWindows.c
index 9dbd702865fd29..9070b8a606eb54 100644
--- a/compiler-rt/lib/profile/InstrProfilingPlatformWindows.c
+++ b/compiler-rt/lib/profile/InstrProfilingPlatformWindows.c
@@ -13,13 +13,14 @@
 
 #if defined(_MSC_VER)
 /* Merge read-write sections into .data. */
-#pragma comment(linker, "/MERGE:.lprfc=.data")
 #pragma comment(linker, "/MERGE:.lprfb=.data")
 #pragma comment(linker, "/MERGE:.lprfd=.data")
 #pragma comment(linker, "/MERGE:.lprfv=.data")
 #pragma comment(linker, "/MERGE:.lprfnd=.data")
 /* Do *NOT* merge .lprfn and .lcovmap into .rdata. llvm-cov must be able to find
  * after the fact.
+ * Do *NOT* merge .lprfc .rdata. When binary profile correlation is enabled,
+ * llvm-cov must be able to find after the fact.
  */
 
 /* Allocate read-only section bounds. */

diff --git a/compiler-rt/test/CMakeLists.txt b/compiler-rt/test/CMakeLists.txt
index f9b01b15b0e62c..7357604b1f651e 100644
--- a/compiler-rt/test/CMakeLists.txt
+++ b/compiler-rt/test/CMakeLists.txt
@@ -37,8 +37,9 @@ if(NOT ANDROID)
   if(NOT COMPILER_RT_STANDALONE_BUILD AND NOT LLVM_RUNTIMES_BUILD)
     # Use LLVM utils and Clang from the same build tree.
     list(APPEND SANITIZER_COMMON_LIT_TEST_DEPS
-      clang clang-resource-headers FileCheck count not llvm-config llvm-nm llvm-objdump
-      llvm-readelf llvm-readobj llvm-size llvm-symbolizer compiler-rt-headers sancov split-file)
+      clang clang-resource-headers FileCheck count not llvm-config llvm-nm 
+      llvm-objdump llvm-readelf llvm-readobj llvm-size llvm-symbolizer 
+      compiler-rt-headers sancov split-file llvm-strip)
     if (WIN32)
       list(APPEND SANITIZER_COMMON_LIT_TEST_DEPS KillTheDoctor)
     endif()

diff --git a/compiler-rt/test/profile/instrprof-binary-correlate.c b/compiler-rt/test/profile/instrprof-binary-correlate.c
new file mode 100644
index 00000000000000..8f421014cf5c9f
--- /dev/null
+++ b/compiler-rt/test/profile/instrprof-binary-correlate.c
@@ -0,0 +1,46 @@
+// REQUIRES: linux || windows
+// Default
+// RUN: %clang -o %t.normal -fprofile-instr-generate -fcoverage-mapping %S/Inputs/instrprof-debug-info-correlate-main.cpp %S/Inputs/instrprof-debug-info-correlate-foo.cpp
+// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t.normal
+// RUN: llvm-profdata merge -o %t.normal.profdata %t.profraw
+// RUN: llvm-cov report --instr-profile=%t.normal.profdata %t.normal > %t.normal.report
+// RUN: llvm-cov show --instr-profile=%t.normal.profdata %t.normal > %t.normal.show
+
+// With -profile-correlate=binary flag
+// RUN: %clang -o %t-1.exe -fprofile-instr-generate -fcoverage-mapping -mllvm -profile-correlate=binary %S/Inputs/instrprof-debug-info-correlate-main.cpp %S/Inputs/instrprof-debug-info-correlate-foo.cpp
+// RUN: env LLVM_PROFILE_FILE=%t-1.profraw %run %t-1.exe
+// RUN: llvm-profdata merge -o %t-1.profdata --binary-file=%t-1.exe %t-1.profraw
+// RUN: llvm-cov report --instr-profile=%t-1.profdata %t-1.exe > %t-1.report
+// RUN: llvm-cov show --instr-profile=%t-1.profdata %t-1.exe > %t-1.show
+// RUN: diff %t.normal.profdata %t-1.profdata
+// RUN: diff %t.normal.report %t-1.report
+// RUN: diff %t.normal.show %t-1.show
+
+// Strip above binary and run
+// RUN: llvm-strip %t-1.exe -o %t-2.exe
+// RUN: env LLVM_PROFILE_FILE=%t-2.profraw %run %t-2.exe
+// RUN: llvm-profdata merge -o %t-2.profdata --binary-file=%t-1.exe %t-2.profraw
+// RUN: llvm-cov report --instr-profile=%t-2.profdata %t-1.exe > %t-2.report
+// RUN: llvm-cov show --instr-profile=%t-2.profdata %t-1.exe > %t-2.show
+// RUN: diff %t.normal.profdata %t-2.profdata
+// RUN: diff %t.normal.report %t-2.report
+// RUN: diff %t.normal.show %t-2.show
+
+// Online merging.
+// RUN: env LLVM_PROFILE_FILE=%t-3.profraw %run %t.normal
+// RUN: env LLVM_PROFILE_FILE=%t-4.profraw %run %t.normal
+// RUN: llvm-profdata merge -o %t.normal.merged.profdata %t-3.profraw %t-4.profraw
+// RUN: llvm-cov report --instr-profile=%t.normal.merged.profdata %t.normal > %t.normal.merged.report
+// RUN: llvm-cov show --instr-profile=%t.normal.merged.profdata %t.normal > %t.normal.merged.show
+
+// RUN: rm -rf %t.profdir && mkdir %t.profdir
+// RUN: env LLVM_PROFILE_FILE=%t.profdir/%m-4.profraw %run %t-2.exe
+// RUN: env LLVM_PROFILE_FILE=%t.profdir/%m-4.profraw %run %t-2.exe
+// RUN: llvm-profdata merge -o %t-4.profdata --binary-file=%t-1.exe  %t.profdir
+// RUN: llvm-cov report --instr-profile=%t-4.profdata %t-1.exe > %t-4.report
+// RUN: llvm-cov show --instr-profile=%t-4.profdata %t-1.exe > %t-4.show
+// RUN: diff %t.normal.merged.profdata %t-4.profdata
+// RUN: diff %t.normal.merged.report %t-4.report
+// RUN: diff %t.normal.merged.show %t-4.show
+
+// TODO: After adding support for binary ID, test binaries with different binary IDs.

diff --git a/llvm/docs/CommandGuide/llvm-profdata.rst b/llvm/docs/CommandGuide/llvm-profdata.rst
index be42733ca14056..f5e3c13ffbc8e6 100644
--- a/llvm/docs/CommandGuide/llvm-profdata.rst
+++ b/llvm/docs/CommandGuide/llvm-profdata.rst
@@ -195,8 +195,14 @@ OPTIONS
 .. option:: --debug-info=<path>
 
  Specify the executable or ``.dSYM`` that contains debug info for the raw profile.
- When ``-debug-info-correlate`` was used for instrumentation, use this option
- to correlate the raw profile.
+ When ``--debug-info-correlate`` or ``--profile-correlate=debug-info`` was used 
+ for instrumentation, use this option to correlate the raw profile.
+
+.. option:: --binary-file=<path>
+
+ Specify the executable that contains profile data and profile name sections for
+ the raw profile. When ``-profile-correlate=binary`` was used for
+ instrumentation, use this option to correlate the raw profile.
 
 .. option:: --temporal-profile-trace-reservoir-size
 
@@ -346,8 +352,9 @@ OPTIONS
 .. option:: --debug-info=<path>
 
  Specify the executable or ``.dSYM`` that contains debug info for the raw profile.
- When ``-debug-info-correlate`` was used for instrumentation, use this option
- to show the correlated functions from the raw profile.
+ When ``--debug-info-correlate`` or ``--profile-correlate=debug-info`` was used
+ for instrumentation, use this option to show the correlated functions from the
+ raw profile.
 
 .. option:: --covered
 

diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index 3bc677d5b6d867..288dc71d756aee 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -328,8 +328,8 @@ enum class instrprof_error {
   too_large,
   truncated,
   malformed,
-  missing_debug_info_for_correlation,
-  unexpected_debug_info_for_correlation,
+  missing_correlation_info,
+  unexpected_correlation_info,
   unable_to_correlate_profile,
   unknown_function,
   invalid_prof,

diff --git a/llvm/include/llvm/ProfileData/InstrProfCorrelator.h b/llvm/include/llvm/ProfileData/InstrProfCorrelator.h
index a3a0805a294a20..c07c67d287e2ce 100644
--- a/llvm/include/llvm/ProfileData/InstrProfCorrelator.h
+++ b/llvm/include/llvm/ProfileData/InstrProfCorrelator.h
@@ -5,8 +5,8 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
-// This file defines InstrProfCorrelator used to generate PGO profiles from
-// raw profile data and debug info.
+// This file defines InstrProfCorrelator used to generate PGO/coverage profiles
+// from raw profile data and debug info/binary file.
 //===----------------------------------------------------------------------===//
 
 #ifndef LLVM_PROFILEDATA_INSTRPROFCORRELATOR_H
@@ -31,8 +31,9 @@ class ObjectFile;
 /// to their functions.
 class InstrProfCorrelator {
 public:
-  /// Indicate which kind correlator to use.
-  enum ProfCorrelatorKind { NONE, DEBUG_INFO };
+  /// Indicate if we should use the debug info or profile metadata sections to
+  /// correlate.
+  enum ProfCorrelatorKind { NONE, DEBUG_INFO, BINARY };
 
   static llvm::Expected<std::unique_ptr<InstrProfCorrelator>>
   get(StringRef Filename, ProfCorrelatorKind FileKind);
@@ -71,11 +72,18 @@ class InstrProfCorrelator {
 protected:
   struct Context {
     static llvm::Expected<std::unique_ptr<Context>>
-    get(std::unique_ptr<MemoryBuffer> Buffer, const object::ObjectFile &Obj);
+    get(std::unique_ptr<MemoryBuffer> Buffer, const object::ObjectFile &Obj,
+        ProfCorrelatorKind FileKind);
     std::unique_ptr<MemoryBuffer> Buffer;
     /// The address range of the __llvm_prf_cnts section.
     uint64_t CountersSectionStart;
     uint64_t CountersSectionEnd;
+    /// The pointer points to start/end of profile data/name sections if
+    /// FileKind is Binary.
+    const char *DataStart;
+    const char *DataEnd;
+    const char *NameStart;
+    size_t NameSize;
     /// True if target and host have different endian orders.
     bool ShouldSwapBytes;
   };
@@ -145,19 +153,20 @@ class InstrProfCorrelatorImpl : public InstrProfCorrelator {
 
   Error dumpYaml(int MaxWarnings, raw_ostream &OS) override;
 
-  void addProbe(StringRef FunctionName, uint64_t CFGHash, IntPtrT CounterOffset,
-                IntPtrT FunctionPtr, uint32_t NumCounters);
+  void addDataProbe(uint64_t FunctionName, uint64_t CFGHash,
+                    IntPtrT CounterOffset, IntPtrT FunctionPtr,
+                    uint32_t NumCounters);
+
+  // Byte-swap the value if necessary.
+  template <class T> T maybeSwap(T Value) const {
+    return Ctx->ShouldSwapBytes ? llvm::byteswap(Value) : Value;
+  }
 
 private:
   InstrProfCorrelatorImpl(InstrProfCorrelatorKind Kind,
                           std::unique_ptr<InstrProfCorrelator::Context> Ctx)
       : InstrProfCorrelator(Kind, std::move(Ctx)){};
   llvm::DenseSet<IntPtrT> CounterOffsets;
-
-  // Byte-swap the value if necessary.
-  template <class T> T maybeSwap(T Value) const {
-    return Ctx->ShouldSwapBytes ? llvm::byteswap(Value) : Value;
-  }
 };
 
 /// DwarfInstrProfCorrelator - A child of InstrProfCorrelatorImpl that takes
@@ -214,6 +223,28 @@ class DwarfInstrProfCorrelator : public InstrProfCorrelatorImpl<IntPtrT> {
   Error correlateProfileNameImpl() override;
 };
 
+/// BinaryInstrProfCorrelator - A child of InstrProfCorrelatorImpl that
+/// takes an object file as input to correlate profiles.
+template <class IntPtrT>
+class BinaryInstrProfCorrelator : public InstrProfCorrelatorImpl<IntPtrT> {
+public:
+  BinaryInstrProfCorrelator(std::unique_ptr<InstrProfCorrelator::Context> Ctx)
+      : InstrProfCorrelatorImpl<IntPtrT>(std::move(Ctx)) {}
+
+  /// Return a pointer to the names string that this class constructs.
+  const char *getNamesPointer() const { return this->Ctx.NameStart; }
+
+  /// Return the number of bytes in the names string.
+  size_t getNamesSize() const { return this->Ctx.NameSize; }
+
+private:
+  void correlateProfileDataImpl(
+      int MaxWarnings,
+      InstrProfCorrelator::CorrelationData *Data = nullptr) override;
+
+  Error correlateProfileNameImpl() override;
+};
+
 } // end namespace llvm
 
 #endif // LLVM_PROFILEDATA_INSTRPROFCORRELATOR_H

diff --git a/llvm/include/llvm/ProfileData/InstrProfData.inc b/llvm/include/llvm/ProfileData/InstrProfData.inc
index 44a449800923fd..f5de23ff4b94d9 100644
--- a/llvm/include/llvm/ProfileData/InstrProfData.inc
+++ b/llvm/include/llvm/ProfileData/InstrProfData.inc
@@ -295,6 +295,12 @@ INSTR_PROF_SECT_ENTRY(IPSK_covfun, \
 INSTR_PROF_SECT_ENTRY(IPSK_orderfile, \
                       INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COMMON), \
                       INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COFF), "__DATA,")
+INSTR_PROF_SECT_ENTRY(IPSK_covdata, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_COVDATA_COMMON), \
+                      INSTR_PROF_COVDATA_COFF, "__LLVM_COV,")
+INSTR_PROF_SECT_ENTRY(IPSK_covname, \
+                      INSTR_PROF_QUOTE(INSTR_PROF_COVNAME_COMMON), \
+                      INSTR_PROF_COVNAME_COFF, "__LLVM_COV,")
 
 #undef INSTR_PROF_SECT_ENTRY
 #endif
@@ -701,6 +707,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_COMMON __llvm_prf_vnds
 #define INSTR_PROF_COVMAP_COMMON __llvm_covmap
 #define INSTR_PROF_COVFUN_COMMON __llvm_covfun
+#define INSTR_PROF_COVDATA_COMMON __llvm_covdata
+#define INSTR_PROF_COVNAME_COMMON __llvm_covnames
 #define INSTR_PROF_ORDERFILE_COMMON __llvm_orderfile
 /* Windows section names. Because these section names contain dollar characters,
  * they must be quoted.
@@ -713,6 +721,11 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_COFF ".lprfnd$M"
 #define INSTR_PROF_COVMAP_COFF ".lcovmap$M"
 #define INSTR_PROF_COVFUN_COFF ".lcovfun$M"
+/* Since cov data and cov names sections are not allocated, we don't need to
+ * access them at runtime.
+ */
+#define INSTR_PROF_COVDATA_COFF ".lcovd"
+#define INSTR_PROF_COVNAME_COFF ".lcovn"
 #define INSTR_PROF_ORDERFILE_COFF ".lorderfile$M"
 
 #ifdef _WIN32
@@ -729,6 +742,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_SECT_NAME INSTR_PROF_VNODES_COFF
 #define INSTR_PROF_COVMAP_SECT_NAME INSTR_PROF_COVMAP_COFF
 #define INSTR_PROF_COVFUN_SECT_NAME INSTR_PROF_COVFUN_COFF
+#define INSTR_PROF_COVDATA_SECT_NAME INSTR_PROF_COVDATA_COFF
+#define INSTR_PROF_COVNAME_SECT_NAME INSTR_PROF_COVNAME_COFF
 #define INSTR_PROF_ORDERFILE_SECT_NAME INSTR_PROF_ORDERFILE_COFF
 #else
 /* Runtime section names and name strings.  */
@@ -744,6 +759,8 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
 #define INSTR_PROF_VNODES_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_VNODES_COMMON)
 #define INSTR_PROF_COVMAP_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVMAP_COMMON)
 #define INSTR_PROF_COVFUN_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVFUN_COMMON)
+#define INSTR_PROF_COVDATA_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVDATA_COMMON)
+#define INSTR_PROF_COVNAME_SECT_NAME INSTR_PROF_QUOTE(INSTR_PROF_COVNAME_COMMON)
 /* Order file instrumentation. */
 #define INSTR_PROF_ORDERFILE_SECT_NAME                                         \
   INSTR_PROF_QUOTE(INSTR_PROF_ORDERFILE_COMMON)

diff --git a/llvm/include/llvm/ProfileData/InstrProfReader.h b/llvm/include/llvm/ProfileData/InstrProfReader.h
index 952cc0d0dc80b9..ff50dfde0e7938 100644
--- a/llvm/include/llvm/ProfileData/InstrProfReader.h
+++ b/llvm/include/llvm/ProfileData/InstrProfReader.h
@@ -123,9 +123,6 @@ class InstrProfReader {
 
   virtual bool instrEntryBBEnabled() const = 0;
 
-  /// Return true if we must provide debug info to create PGO profiles.
-  virtual bool useDebugInfoCorrelate() const { return false; }
-
   /// Return true if the profile has single byte counters representing coverage.
   virtual bool hasSingleByteCoverage() const = 0;
 
@@ -378,12 +375,6 @@ class RawInstrProfReader : public InstrProfReader {
     return (Version & VARIANT_MASK_INSTR_ENTRY) != 0;
   }
 
-  bool useDebugInfoCorrelate() const override {
-    return (Version & VARIANT_MASK_DBG_CORRELATE) != 0;
-  }
-
-  bool useCorrelate() const { return useDebugInfoCorrelate(); }
-
   bool hasSingleByteCoverage() const override {
     return (Version & VARIANT_MASK_BYTE_COVERAGE) != 0;
   }

diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index 18f4d546aa82d7..9a0dd92bb58e87 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -472,6 +472,10 @@ static SectionKind getELFKindForNamedSection(StringRef Name, SectionKind K) {
                                       /*AddSegmentInfo=*/false) ||
       Name == getInstrProfSectionName(IPSK_covfun, Triple::ELF,
                                       /*AddSegmentInfo=*/false) ||
+      Name == getInstrProfSectionName(IPSK_covdata, Triple::ELF,
+                                      /*AddSegmentInfo=*/false) ||
+      Name == getInstrProfSectionName(IPSK_covname, Triple::ELF,
+                                      /*AddSegmentInfo=*/false) ||
       Name == ".llvmbc" || Name == ".llvmcmd")
     return SectionKind::getMetadata();
 

diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
index 56a7ab24a2f08a..ac8e6b56379f21 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
@@ -493,9 +493,13 @@ Error InstrProfSymtab::create(SectionRef &Section) {
 
   // If this is a linked PE/COFF file, then we have to skip over the null byte
   // that is allocated in the .lprfn$A section in the LLVM profiling runtime.
+  // If the name section is .lprfcovnames, it doesn't have the null byte at the
+  // beginning.
   const ObjectFile *Obj = Section.getObject();
   if (isa<COFFObjectFile>(Obj) && !Obj->isRelocatableObject())
-    Data = Data.drop_front(1);
+    if (Expected<StringRef> NameOrErr = Section.getName())
+      if (*NameOrErr != getInstrProfSectionName(IPSK_covname, Triple::COFF))
+        Data = Data.drop_front(1);
 
   return Error::success();
 }
@@ -1024,10 +1028,13 @@ loadTestingFormat(StringRef Data, StringRef CompilationDir) {
       BytesInAddress, Endian, CompilationDir);
 }
 
-/// Find all sections that match \p Name. There may be more than one if comdats
-/// are in use, e.g. for the __llvm_covfun section on ELF.
-static Expected<std::vector<SectionRef>> lookupSections(ObjectFile &OF,
-                                                        StringRef Name) {
+/// Find all sections that match \p IPSK name. There may be more than one if
+/// comdats are in use, e.g. for the __llvm_covfun section on ELF.
+static Expected<std::vector<SectionRef>>
+lookupSections(ObjectFile &OF, InstrProfSectKind IPSK) {
+  auto ObjFormat = OF.getTripleObjectFormat();
+  auto Name =
+      getInstrProfSectionName(IPSK, ObjFormat, /*AddSegmentInfo=*/false);
   // On COFF, the object file section name may end in "$M". This tells the
   // linker to sort these sections between "$A" and "$Z". The linker removes the
   // dollar and everything after it in the final binary. Do the same to match.
@@ -1042,8 +1049,13 @@ static Expected<std::vector<SectionRef>> lookupSections(ObjectFile &OF,
     Expected<StringRef> NameOrErr = Section.getName();
     if (!NameOrErr)
       return NameOrErr.takeError();
-    if (stripSuffix(*NameOrErr) == Name)
+    if (stripSuffix(*NameOrErr) == Name) {
+      // COFF profile name section contains two null bytes indicating the
+      // start/end of the section. If its size is 2 bytes, it's empty.
+      if (IsCOFF && IPSK == IPSK_name && Section.getSize() == 2)
+        continue;
       Sections.push_back(Section);
+    }
   }
   if (Sections.empty())
     return make_error<CoverageMapError>(coveragemap_error::no_data_found);
@@ -1079,15 +1091,27 @@ loadBinaryFormat(std::unique_ptr<Binary> Bin, StringRef Arch,
       OF->isLittleEndian() ? llvm::endianness::little : llvm::endianness::big;
 
   // Look for the sections that we are interested in.
-  auto ObjFormat = OF->getTripleObjectFormat();
-  auto NamesSection =
-      lookupSections(*OF, getInstrProfSectionName(IPSK_name, ObjFormat,
-                                                 /*AddSegmentInfo=*/false));
-  if (auto E = NamesSection.takeError())
+  InstrProfSymtab ProfileNames;
+  std::vector<SectionRef> NamesSectionRefs;
+  // If IPSK_name is not found, fallback to search for IPK_covname, which is
+  // used when binary correlation is enabled.
+  auto NamesSection = lookupSections(*OF, IPSK_name);
+  if (auto E = NamesSection.takeError()) {
+    consumeError(std::move(E));
+    NamesSection = lookupSections(*OF, IPSK_covname);
+    if (auto E = NamesSection.takeError())
+      return std::move(E);
+  }
+  NamesSectionRefs = *NamesSection;
+
+  if (NamesSectionRefs.size() != 1)
+    return make_error<CoverageMapError>(
+        coveragemap_error::malformed,
+        "the size of coverage mapping section is not one");
+  if (Error E = ProfileNames.create(NamesSectionRefs.back()))
     return std::move(E);
-  auto CoverageSection =
-      lookupSections(*OF, getInstrProfSectionName(IPSK_covmap, ObjFormat,
-                                                  /*AddSegmentInfo=*/false));
+
+  auto CoverageSection = lookupSections(*OF, IPSK_covmap);
   if (auto E = CoverageSection.takeError())
     return std::move(E);
   std::vector<SectionRef> CoverageSectionRefs = *CoverageSection;
@@ -1099,19 +1123,8 @@ loadBinaryFormat(std::unique_ptr<Binary> Bin, StringRef Arch,
     return CoverageMappingOrErr.takeError();
   StringRef CoverageMapping = CoverageMappingOrErr.get();
 
-  InstrProfSymtab ProfileNames;
-  std::vector<SectionRef> NamesSectionRefs = *NamesSection;
-  if (NamesSectionRefs.size() != 1)
-    return make_error<CoverageMapError>(
-        coveragemap_error::malformed,
-        "the size of coverage mapping section is not one");
-  if (Error E = ProfileNames.create(NamesSectionRefs.back()))
-    return std::move(E);
-
   // Look for the coverage records section (Version4 only).
-  auto CoverageRecordsSections =
-      lookupSections(*OF, getInstrProfSectionName(IPSK_covfun, ObjFormat,
-                                                  /*AddSegmentInfo=*/false));
+  auto CoverageRecordsSections = lookupSections(*OF, IPSK_covfun);
 
   BinaryCoverageReader::FuncRecordsStorage FuncRecords;
   if (auto E = CoverageRecordsSections.takeError()) {

diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index a53e44d85d090c..649d814cfd9de0 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -113,11 +113,11 @@ static std::string getInstrProfErrString(instrprof_error Err,
   case instrprof_error::malformed:
     OS << "malformed instrumentation profile data";
     break;
-  case instrprof_error::missing_debug_info_for_correlation:
-    OS << "debug info for correlation is required";
+  case instrprof_error::missing_correlation_info:
+    OS << "debug info/binary for correlation is required";
     break;
-  case instrprof_error::unexpected_debug_info_for_correlation:
-    OS << "debug info for correlation is not necessary";
+  case instrprof_error::unexpected_correlation_info:
+    OS << "debug info/binary for correlation is not necessary";
     break;
   case instrprof_error::unable_to_correlate_profile:
     OS << "unable to correlate profile";

diff --git a/llvm/lib/ProfileData/InstrProfCorrelator.cpp b/llvm/lib/ProfileData/InstrProfCorrelator.cpp
index 8529a179873524..cf80a58f43bd90 100644
--- a/llvm/lib/ProfileData/InstrProfCorrelator.cpp
+++ b/llvm/lib/ProfileData/InstrProfCorrelator.cpp
@@ -27,14 +27,22 @@ using namespace llvm;
 /// Get profile section.
 Expected<object::SectionRef> getInstrProfSection(const object::ObjectFile &Obj,
                                                  InstrProfSectKind IPSK) {
+  // On COFF, the getInstrProfSectionName returns the section names may followed
+  // by "$M". The linker removes the dollar and everything after it in the final
+  // binary. Do the same to match.
   Triple::ObjectFormatType ObjFormat = Obj.getTripleObjectFormat();
+  auto StripSuffix = [ObjFormat](StringRef N) {
+    return ObjFormat == Triple::COFF ? N.split('$').first : N;
+  };
   std::string ExpectedSectionName =
       getInstrProfSectionName(IPSK, ObjFormat,
                               /*AddSegmentInfo=*/false);
-  for (auto &Section : Obj.sections())
+  ExpectedSectionName = StripSuffix(ExpectedSectionName);
+  for (auto &Section : Obj.sections()) {
     if (auto SectionName = Section.getName())
-      if (SectionName.get() == ExpectedSectionName)
+      if (*SectionName == ExpectedSectionName)
         return Section;
+  }
   return make_error<InstrProfError>(
       instrprof_error::unable_to_correlate_profile,
       "could not find section (" + Twine(ExpectedSectionName) + ")");
@@ -46,14 +54,38 @@ const char *InstrProfCorrelator::NumCountersAttributeName = "Num Counters";
 
 llvm::Expected<std::unique_ptr<InstrProfCorrelator::Context>>
 InstrProfCorrelator::Context::get(std::unique_ptr<MemoryBuffer> Buffer,
-                                  const object::ObjectFile &Obj) {
+                                  const object::ObjectFile &Obj,
+                                  ProfCorrelatorKind FileKind) {
+  auto C = std::make_unique<Context>();
   auto CountersSection = getInstrProfSection(Obj, IPSK_cnts);
   if (auto Err = CountersSection.takeError())
     return std::move(Err);
-  auto C = std::make_unique<Context>();
+  if (FileKind == InstrProfCorrelator::BINARY) {
+    auto DataSection = getInstrProfSection(Obj, IPSK_covdata);
+    if (auto Err = DataSection.takeError())
+      return std::move(Err);
+    auto DataOrErr = DataSection->getContents();
+    if (!DataOrErr)
+      return DataOrErr.takeError();
+    auto NameSection = getInstrProfSection(Obj, IPSK_covname);
+    if (auto Err = NameSection.takeError())
+      return std::move(Err);
+    auto NameOrErr = NameSection->getContents();
+    if (!NameOrErr)
+      return NameOrErr.takeError();
+    C->DataStart = DataOrErr->data();
+    C->DataEnd = DataOrErr->data() + DataOrErr->size();
+    C->NameStart = NameOrErr->data();
+    C->NameSize = NameOrErr->size();
+  }
   C->Buffer = std::move(Buffer);
   C->CountersSectionStart = CountersSection->getAddress();
   C->CountersSectionEnd = C->CountersSectionStart + CountersSection->getSize();
+  // In COFF object file, there's a null byte at the beginning of the counter
+  // section which doesn't exist in raw profile.
+  if (Obj.getTripleObjectFormat() == Triple::COFF)
+    ++C->CountersSectionStart;
+
   C->ShouldSwapBytes = Obj.isLittleEndian() != sys::IsLittleEndianHost;
   return Expected<std::unique_ptr<Context>>(std::move(C));
 }
@@ -80,9 +112,17 @@ InstrProfCorrelator::get(StringRef Filename, ProfCorrelatorKind FileKind) {
 
     return get(std::move(*BufferOrErr), FileKind);
   }
+  if (FileKind == BINARY) {
+    auto BufferOrErr = errorOrToExpected(MemoryBuffer::getFile(Filename));
+    if (auto Err = BufferOrErr.takeError())
+      return std::move(Err);
+
+    return get(std::move(*BufferOrErr), FileKind);
+  }
   return make_error<InstrProfError>(
       instrprof_error::unable_to_correlate_profile,
-      "unsupported correlation kind (only DWARF debug info is supported)");
+      "unsupported correlation kind (only DWARF debug info and Binary format "
+      "(ELF/COFF) are supported)");
 }
 
 llvm::Expected<std::unique_ptr<InstrProfCorrelator>>
@@ -93,7 +133,7 @@ InstrProfCorrelator::get(std::unique_ptr<MemoryBuffer> Buffer,
     return std::move(Err);
 
   if (auto *Obj = dyn_cast<object::ObjectFile>(BinOrErr->get())) {
-    auto CtxOrErr = Context::get(std::move(Buffer), *Obj);
+    auto CtxOrErr = Context::get(std::move(Buffer), *Obj, FileKind);
     if (auto Err = CtxOrErr.takeError())
       return std::move(Err);
     auto T = Obj->makeTriple();
@@ -155,9 +195,11 @@ InstrProfCorrelatorImpl<IntPtrT>::get(
         instrprof_error::unable_to_correlate_profile,
         "unsupported debug info format (only DWARF is supported)");
   }
+  if (Obj.isELF() || Obj.isCOFF())
+    return std::make_unique<BinaryInstrProfCorrelator<IntPtrT>>(std::move(Ctx));
   return make_error<InstrProfError>(
       instrprof_error::unable_to_correlate_profile,
-      "unsupported correlation file type (only DWARF is supported)");
+      "unsupported binary format (only ELF and COFF are supported)");
 }
 
 template <class IntPtrT>
@@ -212,16 +254,16 @@ Error InstrProfCorrelatorImpl<IntPtrT>::dumpYaml(int MaxWarnings,
 }
 
 template <class IntPtrT>
-void InstrProfCorrelatorImpl<IntPtrT>::addProbe(StringRef FunctionName,
-                                                uint64_t CFGHash,
-                                                IntPtrT CounterOffset,
-                                                IntPtrT FunctionPtr,
-                                                uint32_t NumCounters) {
+void InstrProfCorrelatorImpl<IntPtrT>::addDataProbe(uint64_t NameRef,
+                                                    uint64_t CFGHash,
+                                                    IntPtrT CounterOffset,
+                                                    IntPtrT FunctionPtr,
+                                                    uint32_t NumCounters) {
   // Check if a probe was already added for this counter offset.
   if (!CounterOffsets.insert(CounterOffset).second)
     return;
   Data.push_back({
-      maybeSwap<uint64_t>(IndexedInstrProf::ComputeHash(FunctionName)),
+      maybeSwap<uint64_t>(NameRef),
       maybeSwap<uint64_t>(CFGHash),
       // In this mode, CounterPtr actually stores the section relative address
       // of the counter.
@@ -236,7 +278,6 @@ void InstrProfCorrelatorImpl<IntPtrT>::addProbe(StringRef FunctionName,
       // TODO: MC/DC is not yet supported.
       /*NumBitmapBytes=*/maybeSwap<uint32_t>(0),
   });
-  NamesVec.push_back(FunctionName.str());
 }
 
 template <class IntPtrT>
@@ -349,6 +390,8 @@ void DwarfInstrProfCorrelator<IntPtrT>::correlateProfileDataImpl(
                                      *FunctionName);
       LLVM_DEBUG(Die.dump(dbgs()));
     }
+    // In debug info correlation mode, the CounterPtr is an absolute address of
+    // the counter, but it's expected to be relative later when iterating Data.
     IntPtrT CounterOffset = *CounterPtr - CountersStart;
     if (Data) {
       InstrProfCorrelator::Probe P;
@@ -366,8 +409,9 @@ void DwarfInstrProfCorrelator<IntPtrT>::correlateProfileDataImpl(
         P.LineNumber = LineNumber;
       Data->Probes.push_back(P);
     } else {
-      this->addProbe(*FunctionName, *CFGHash, CounterOffset,
-                     FunctionPtr.value_or(0), *NumCounters);
+      this->addDataProbe(IndexedInstrProf::ComputeHash(*FunctionName), *CFGHash,
+                         CounterOffset, FunctionPtr.value_or(0), *NumCounters);
+      this->NamesVec.push_back(*FunctionName);
     }
   };
   for (auto &CU : DICtx->normal_units())
@@ -394,3 +438,46 @@ Error DwarfInstrProfCorrelator<IntPtrT>::correlateProfileNameImpl() {
                                      /*doCompression=*/false, this->Names);
   return Result;
 }
+
+template <class IntPtrT>
+void BinaryInstrProfCorrelator<IntPtrT>::correlateProfileDataImpl(
+    int MaxWarnings, InstrProfCorrelator::CorrelationData *CorrelateData) {
+  using RawProfData = RawInstrProf::ProfileData<IntPtrT>;
+  bool UnlimitedWarnings = (MaxWarnings == 0);
+  // -N suppressed warnings means we can emit up to N (unsuppressed) warnings
+  int NumSuppressedWarnings = -MaxWarnings;
+
+  const RawProfData *DataStart = (const RawProfData *)this->Ctx->DataStart;
+  const RawProfData *DataEnd = (const RawProfData *)this->Ctx->DataEnd;
+  // We need to use < here because the last data record may have no padding.
+  for (const RawProfData *I = DataStart; I < DataEnd; ++I) {
+    uint64_t CounterPtr = this->template maybeSwap<IntPtrT>(I->CounterPtr);
+    uint64_t CountersStart = this->Ctx->CountersSectionStart;
+    uint64_t CountersEnd = this->Ctx->CountersSectionEnd;
+    if (CounterPtr < CountersStart || CounterPtr >= CountersEnd) {
+      if (UnlimitedWarnings || ++NumSuppressedWarnings < 1) {
+        WithColor::warning()
+            << format("CounterPtr out of range for function: Actual=0x%x "
+                      "Expected=[0x%x, 0x%x) at data offset=0x%x\n",
+                      CounterPtr, CountersStart, CountersEnd,
+                      (I - DataStart) * sizeof(RawProfData));
+      }
+    }
+    // In binary correlation mode, the CounterPtr is an absolute address of the
+    // counter, but it's expected to be relative later when iterating Data.
+    IntPtrT CounterOffset = CounterPtr - CountersStart;
+    this->addDataProbe(I->NameRef, I->FuncHash, CounterOffset,
+                       I->FunctionPointer, I->NumCounters);
+  }
+}
+
+template <class IntPtrT>
+Error BinaryInstrProfCorrelator<IntPtrT>::correlateProfileNameImpl() {
+  if (this->Ctx->NameSize == 0) {
+    return make_error<InstrProfError>(
+        instrprof_error::unable_to_correlate_profile,
+        "could not find any profile data metadata in object file");
+  }
+  this->Names.append(this->Ctx->NameStart, this->Ctx->NameSize);
+  return Error::success();
+}
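
Note on the binary correlator above: a minimal standalone sketch, not the LLVM implementation, of two steps it performs: trimming the `$` suffix from a COFF section name and turning an absolute counter address into a section-relative offset. The helper names `stripCoffSuffix` and `counterOffset` and the section name `.example$M` are hypothetical. The sketch also rejects out-of-range addresses, whereas the patch only warns and still records the probe.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the StripSuffix lambda: on COFF, drop '$' and
// everything after it, as the linker does in the final binary.
std::string_view stripCoffSuffix(std::string_view Name, bool IsCOFF) {
  if (!IsCOFF)
    return Name;
  // find() returns npos when there is no '$', so substr keeps the whole name.
  return Name.substr(0, Name.find('$'));
}

// Hypothetical helper: convert an absolute counter address into an offset
// relative to the start of the counters section. Returns nullopt when the
// address lies outside [Start, End). (Assumption: the real code only emits a
// warning in that case.)
std::optional<uint64_t> counterOffset(uint64_t CounterPtr, uint64_t Start,
                                      uint64_t End) {
  if (CounterPtr < Start || CounterPtr >= End)
    return std::nullopt;
  return CounterPtr - Start;
}
```

For example, a counter at 0x1010 in a section spanning [0x1000, 0x2000) gets offset 0x10.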

diff --git a/llvm/lib/ProfileData/InstrProfReader.cpp b/llvm/lib/ProfileData/InstrProfReader.cpp
index f3fd456ae79730..068922d421f8b9 100644
--- a/llvm/lib/ProfileData/InstrProfReader.cpp
+++ b/llvm/lib/ProfileData/InstrProfReader.cpp
@@ -556,10 +556,6 @@ Error RawInstrProfReader<IntPtrT>::readHeader(
                   "\nPLEASE update this tool to version in the raw profile, or "
                   "regenerate raw profile with expected version.")
                      .str());
-  if (useCorrelate() && !Correlator)
-    return error(instrprof_error::missing_debug_info_for_correlation);
-  if (!useCorrelate() && Correlator)
-    return error(instrprof_error::unexpected_debug_info_for_correlation);
 
   uint64_t BinaryIdSize = swap(Header.BinaryIdsSize);
   // Binary id start just after the header if exists.
@@ -607,8 +603,9 @@ Error RawInstrProfReader<IntPtrT>::readHeader(
   if (Correlator) {
     // These sizes in the raw file are zero because we constructed them in the
     // Correlator.
-    assert(DataSize == 0 && NamesSize == 0);
-    assert(CountersDelta == 0 && NamesDelta == 0);
+    if (!(DataSize == 0 && NamesSize == 0 && CountersDelta == 0 &&
+          NamesDelta == 0))
+      return error(instrprof_error::unexpected_correlation_info);
     Data = Correlator->getDataPointer();
     DataEnd = Data + Correlator->getDataSize();
     NamesStart = Correlator->getNamesPointer();
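
Note on the reader change above: the former asserts become a recoverable error, so a raw profile that still carries its own data or names is rejected when a correlator is supplied. A minimal sketch of that check follows; the struct `RawHeaderFields` and the function `headerMatchesCorrelator` are hypothetical names used for illustration only, not LLVM types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the raw profile header fields the check inspects.
struct RawHeaderFields {
  uint64_t DataSize;
  uint64_t NamesSize;
  uint64_t CountersDelta;
  uint64_t NamesDelta;
};

// Returns true when the raw profile looks like it was produced in a
// correlation mode, i.e. data and names were left out of the profile and
// must come from the correlator instead.
bool headerMatchesCorrelator(const RawHeaderFields &H) {
  return H.DataSize == 0 && H.NamesSize == 0 && H.CountersDelta == 0 &&
         H.NamesDelta == 0;
}
```

A caller would report the equivalent of `unexpected_correlation_info` when this returns false and a correlator was provided.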

diff --git a/llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp b/llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
index 8674740001e580..fe5a0578bd9721 100644
--- a/llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
+++ b/llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
@@ -64,10 +64,24 @@ using namespace llvm;
 #define DEBUG_TYPE "instrprof"
 
 namespace llvm {
-cl::opt<bool>
-    DebugInfoCorrelate("debug-info-correlate",
-                       cl::desc("Use debug info to correlate profiles."),
-                       cl::init(false));
+// TODO: Remove -debug-info-correlate in next LLVM release, in favor of
+// -profile-correlate=debug-info.
+cl::opt<bool> DebugInfoCorrelate(
+    "debug-info-correlate",
+    cl::desc("Use debug info to correlate profiles. (Deprecated, use "
+             "-profile-correlate=debug-info)"),
+    cl::init(false));
+
+cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate(
+    "profile-correlate",
+    cl::desc("Use debug info or binary file to correlate profiles."),
+    cl::init(InstrProfCorrelator::NONE),
+    cl::values(clEnumValN(InstrProfCorrelator::NONE, "",
+                          "No profile correlation"),
+               clEnumValN(InstrProfCorrelator::DEBUG_INFO, "debug-info",
+                          "Use debug info to correlate"),
+               clEnumValN(InstrProfCorrelator::BINARY, "binary",
+                          "Use binary to correlate")));
 } // namespace llvm
 
 namespace {
@@ -792,7 +806,7 @@ void InstrLowerer::lowerValueProfileInst(InstrProfValueProfileInst *Ind) {
   // in lightweight mode. We need to move the value profile pointer to the
   // Counter struct to get this working.
   assert(
-      !DebugInfoCorrelate &&
+      !DebugInfoCorrelate && ProfileCorrelate == InstrProfCorrelator::NONE &&
       "Value profiling is not yet supported with lightweight instrumentation");
   GlobalVariable *Name = Ind->getName();
   auto It = ProfileDataMap.find(Name);
@@ -1219,8 +1233,9 @@ GlobalVariable *InstrLowerer::setupProfileSection(InstrProfInstBase *Inc,
 
   // Use internal rather than private linkage so the counter variable shows up
   // in the symbol table when using debug info for correlation.
-  if (DebugInfoCorrelate && TT.isOSBinFormatMachO() &&
-      Linkage == GlobalValue::PrivateLinkage)
+  if ((DebugInfoCorrelate ||
+       ProfileCorrelate == InstrProfCorrelator::DEBUG_INFO) &&
+      TT.isOSBinFormatMachO() && Linkage == GlobalValue::PrivateLinkage)
     Linkage = GlobalValue::InternalLinkage;
 
   // Due to the limitation of binder as of 2021/09/28, the duplicate weak
@@ -1341,7 +1356,8 @@ InstrLowerer::getOrCreateRegionCounters(InstrProfCntrInstBase *Inc) {
   auto *CounterPtr = setupProfileSection(Inc, IPSK_cnts);
   PD.RegionCounters = CounterPtr;
 
-  if (DebugInfoCorrelate) {
+  if (DebugInfoCorrelate ||
+      ProfileCorrelate == InstrProfCorrelator::DEBUG_INFO) {
     LLVMContext &Ctx = M.getContext();
     Function *Fn = Inc->getParent()->getParent();
     if (auto *SP = Fn->getSubprogram()) {
@@ -1386,7 +1402,7 @@ InstrLowerer::getOrCreateRegionCounters(InstrProfCntrInstBase *Inc) {
 void InstrLowerer::createDataVariable(InstrProfCntrInstBase *Inc) {
   // When debug information is correlated to profile data, a data variable
   // is not needed.
-  if (DebugInfoCorrelate)
+  if (DebugInfoCorrelate || ProfileCorrelate == InstrProfCorrelator::DEBUG_INFO)
     return;
 
   GlobalVariable *NamePtr = Inc->getName();
@@ -1484,20 +1500,28 @@ void InstrLowerer::createDataVariable(InstrProfCntrInstBase *Inc) {
   }
   auto *Data =
       new GlobalVariable(M, DataTy, false, Linkage, nullptr, DataVarName);
-  // Reference the counter variable with a label difference (link-time
-  // constant).
-  auto *RelativeCounterPtr =
-      ConstantExpr::getSub(ConstantExpr::getPtrToInt(CounterPtr, IntPtrTy),
-                           ConstantExpr::getPtrToInt(Data, IntPtrTy));
-
-  // Bitmaps are relative to the same data variable as profile counters.
+  Constant *RelativeCounterPtr;
   GlobalVariable *BitmapPtr = PD.RegionBitmaps;
   Constant *RelativeBitmapPtr = ConstantInt::get(IntPtrTy, 0);
-
-  if (BitmapPtr != nullptr) {
-    RelativeBitmapPtr =
-        ConstantExpr::getSub(ConstantExpr::getPtrToInt(BitmapPtr, IntPtrTy),
+  InstrProfSectKind DataSectionKind;
+  // With binary profile correlation, profile data is not loaded into memory.
+  // profile data must reference profile counter with an absolute relocation.
+  if (ProfileCorrelate == InstrProfCorrelator::BINARY) {
+    DataSectionKind = IPSK_covdata;
+    RelativeCounterPtr = ConstantExpr::getPtrToInt(CounterPtr, IntPtrTy);
+    if (BitmapPtr != nullptr)
+      RelativeBitmapPtr = ConstantExpr::getPtrToInt(BitmapPtr, IntPtrTy);
+  } else {
+    // Reference the counter variable with a label difference (link-time
+    // constant).
+    DataSectionKind = IPSK_data;
+    RelativeCounterPtr =
+        ConstantExpr::getSub(ConstantExpr::getPtrToInt(CounterPtr, IntPtrTy),
                              ConstantExpr::getPtrToInt(Data, IntPtrTy));
+    if (BitmapPtr != nullptr)
+      RelativeBitmapPtr =
+          ConstantExpr::getSub(ConstantExpr::getPtrToInt(BitmapPtr, IntPtrTy),
+                               ConstantExpr::getPtrToInt(Data, IntPtrTy));
   }
 
   Constant *DataVals[] = {
@@ -1507,7 +1531,8 @@ void InstrLowerer::createDataVariable(InstrProfCntrInstBase *Inc) {
   Data->setInitializer(ConstantStruct::get(DataTy, DataVals));
 
   Data->setVisibility(Visibility);
-  Data->setSection(getInstrProfSectionName(IPSK_data, TT.getObjectFormat()));
+  Data->setSection(
+      getInstrProfSectionName(DataSectionKind, TT.getObjectFormat()));
   Data->setAlignment(Align(INSTR_PROF_DATA_ALIGNMENT));
   maybeSetComdat(Data, Fn, CntsVarName);
 
@@ -1595,7 +1620,9 @@ void InstrLowerer::emitNameData() {
   NamesSize = CompressedNameStr.size();
   setGlobalVariableLargeSection(TT, *NamesVar);
   NamesVar->setSection(
-      getInstrProfSectionName(IPSK_name, TT.getObjectFormat()));
+      ProfileCorrelate == InstrProfCorrelator::BINARY
+          ? getInstrProfSectionName(IPSK_covname, TT.getObjectFormat())
+          : getInstrProfSectionName(IPSK_name, TT.getObjectFormat()));
   // On COFF, it's important to reduce the alignment down to 1 to prevent the
   // linker from inserting padding before the start of the names section or
   // between names entries.
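
Note on `createDataVariable` above: the data record stores the counter reference differently depending on the mode. A minimal sketch with plain integers standing in for link-time addresses follows; the enum `CorrelateMode` and the function `encodeCounterPtr` are hypothetical and only mirror the two branches of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the two cases handled in the patch.
enum class CorrelateMode { None, Binary };

// Value placed in the data record's counter field.
//  - Binary: the data section is not loaded at runtime, so the record holds
//    the absolute counter address.
//  - None: the record holds the label difference (counter minus data), which
//    is a link-time constant. Unsigned arithmetic wraps if the counter sits
//    below the data record.
uint64_t encodeCounterPtr(CorrelateMode Mode, uint64_t CounterAddr,
                          uint64_t DataAddr) {
  return Mode == CorrelateMode::Binary ? CounterAddr : CounterAddr - DataAddr;
}
```

With a counter at 0x5000 and a data record at 0x4000, the stored value is 0x5000 under binary correlation and 0x1000 otherwise.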

diff --git a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
index 57ff1648788f9a..3a57709c4e8b7f 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
@@ -327,6 +327,7 @@ extern cl::opt<PGOViewCountsType> PGOViewCounts;
 // Defined in Analysis/BlockFrequencyInfo.cpp:  -view-bfi-func-name=
 extern cl::opt<std::string> ViewBlockFreqFuncName;
 
+extern cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;
 } // namespace llvm
 
 static cl::opt<bool>
@@ -381,7 +382,7 @@ static GlobalVariable *createIRLevelProfileFlagVar(Module &M, bool IsCS) {
     ProfileVersion |= VARIANT_MASK_CSIR_PROF;
   if (PGOInstrumentEntry)
     ProfileVersion |= VARIANT_MASK_INSTR_ENTRY;
-  if (DebugInfoCorrelate)
+  if (DebugInfoCorrelate || ProfileCorrelate == InstrProfCorrelator::DEBUG_INFO)
     ProfileVersion |= VARIANT_MASK_DBG_CORRELATE;
   if (PGOFunctionEntryCoverage)
     ProfileVersion |=

diff --git a/llvm/test/Instrumentation/InstrProfiling/coverage.ll b/llvm/test/Instrumentation/InstrProfiling/coverage.ll
index 1401d8f620b3f4..bbf895ea4b34e1 100644
--- a/llvm/test/Instrumentation/InstrProfiling/coverage.ll
+++ b/llvm/test/Instrumentation/InstrProfiling/coverage.ll
@@ -1,11 +1,19 @@
 ; RUN: opt < %s -passes=instrprof -S | FileCheck %s
+; RUN: opt < %s -passes=instrprof -profile-correlate=binary -S | FileCheck %s --check-prefix=BINARY
 
 target triple = "aarch64-unknown-linux-gnu"
 
 @__profn_foo = private constant [3 x i8] c"foo"
 ; CHECK: @__profc_foo = private global [1 x i8] c"\FF", section "__llvm_prf_cnts", comdat, align 1
+; CHECK: @__profd_foo = private global { i64, i64, i64, i64, ptr, ptr, i32, [2 x i16], i32 } { i64 {{.*}}, i64 {{.*}}, i64 sub (i64 ptrtoint (ptr @__profc_foo to i64)
+; BINARY: @__profd_foo = private global { i64, i64, i64, i64, ptr, ptr, i32, [2 x i16], i32 } { i64 {{.*}}, i64 {{.*}}, i64 ptrtoint (ptr @__profc_foo to i64),
 @__profn_bar = private constant [3 x i8] c"bar"
 ; CHECK: @__profc_bar = private global [1 x i8] c"\FF", section "__llvm_prf_cnts", comdat, align 1
+; CHECK: @__profd_bar = private global { i64, i64, i64, i64, ptr, ptr, i32, [2 x i16], i32 } { i64 {{.*}}, i64 {{.*}}, i64 sub (i64 ptrtoint (ptr @__profc_bar to i64)
+; BINARY: @__profd_bar = private global { i64, i64, i64, i64, ptr, ptr, i32, [2 x i16], i32 } { i64 {{.*}}, i64 {{.*}}, i64 ptrtoint (ptr @__profc_bar to i64),
+
+; CHECK: @__llvm_prf_nm = {{.*}} section "__llvm_prf_names"
+; BINARY: @__llvm_prf_nm ={{.*}} section "__llvm_covnames"
 
 define void @_Z3foov() {
   call void @llvm.instrprof.cover(ptr @__profn_foo, i64 12345678, i32 1, i32 0)

diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 088138f27de58b..322b7da2678f4f 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -46,6 +46,7 @@
 #include <queue>
 
 using namespace llvm;
+using ProfCorrelatorKind = InstrProfCorrelator::ProfCorrelatorKind;
 
 // https://llvm.org/docs/CommandGuide/llvm-profdata.html has documentations
 // on each subcommand.
@@ -124,6 +125,11 @@ cl::opt<std::string> DebugInfoFilename(
         "the functions it found. For merge, use the provided debug info to "
         "correlate the raw profile."),
     cl::sub(ShowSubcommand), cl::sub(MergeSubcommand));
+cl::opt<std::string>
+    BinaryFilename("binary-file", cl::init(""),
+                   cl::desc("For merge, use the provided unstripped binary to "
+                            "correlate the raw profile."),
+                   cl::sub(MergeSubcommand));
 cl::opt<std::string> FuncNameFilter(
     "function",
     cl::desc("Details for matching functions. For overlapping CSSPGO, this "
@@ -787,14 +793,27 @@ static void mergeInstrProfile(const WeightedFileVector &Inputs,
       OutputFormat != PF_Text)
     exitWithError("unknown format is specified");
 
-  std::unique_ptr<InstrProfCorrelator> Correlator;
+  // TODO: Maybe we should support correlation with mixture of different
+  // correlation modes(w/wo debug-info/object correlation).
+  if (!DebugInfoFilename.empty() && !BinaryFilename.empty())
+    exitWithError("Expected only one of -debug-info, -binary-file");
+  std::string CorrelateFilename;
+  ProfCorrelatorKind CorrelateKind = ProfCorrelatorKind::NONE;
   if (!DebugInfoFilename.empty()) {
-    if (auto Err = InstrProfCorrelator::get(DebugInfoFilename,
-                                            InstrProfCorrelator::DEBUG_INFO)
+    CorrelateFilename = DebugInfoFilename;
+    CorrelateKind = ProfCorrelatorKind::DEBUG_INFO;
+  } else if (!BinaryFilename.empty()) {
+    CorrelateFilename = BinaryFilename;
+    CorrelateKind = ProfCorrelatorKind::BINARY;
+  }
+
+  std::unique_ptr<InstrProfCorrelator> Correlator;
+  if (CorrelateKind != InstrProfCorrelator::NONE) {
+    if (auto Err = InstrProfCorrelator::get(CorrelateFilename, CorrelateKind)
                        .moveInto(Correlator))
-      exitWithError(std::move(Err), DebugInfoFilename);
+      exitWithError(std::move(Err), CorrelateFilename);
     if (auto Err = Correlator->correlateProfileData(MaxDbgCorrelationWarnings))
-      exitWithError(std::move(Err), DebugInfoFilename);
+      exitWithError(std::move(Err), CorrelateFilename);
   }
 
   std::mutex ErrorLock;


        

