[llvm] 57cb2f6 - Reland "[llvm-cov] Support multi-source object files for convert-for-testing"

Gulfem Savrun Yeniceri via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 18 09:44:07 PDT 2023


Author: Yuhao Gu
Date: 2023-08-18T16:43:45Z
New Revision: 57cb2f6ffe63bbb1b753310aaa102614c00249ec

URL: https://github.com/llvm/llvm-project/commit/57cb2f6ffe63bbb1b753310aaa102614c00249ec
DIFF: https://github.com/llvm/llvm-project/commit/57cb2f6ffe63bbb1b753310aaa102614c00249ec.diff

LOG: Reland "[llvm-cov] Support multi-source object files for convert-for-testing"

`llvm-cov convert-for-testing` only functions properly when the input binary contains a single source file. When the binary has multiple source files, a `Malformed coverage data` error will occur when the generated .covmapping is read back. This is because the testing format lacks support for indicating the size of its file records, and current implementation just assumes there's only one record in it. This patch fixes this problem by introducing a new testing format version.

Changes to the code:

- Add a new format version. The version number is stored in the the last 8 bytes of the orignial magic number field to be backward-compatible.
- Output a LEB128 number before the file records section to indicate its size in the new version.
- Change the format parsing code correspondingly.
- Update the document to formalize the testing format.
- Additionally, fix the bug when converting COFF binaries.

Reviewed By: phosek, gulfem

Differential Revision: https://reviews.llvm.org/D156611

Added: 
    

Modified: 
    llvm/docs/CommandGuide/llvm-cov.rst
    llvm/docs/CoverageMappingFormat.rst
    llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
    llvm/include/llvm/ProfileData/Coverage/CoverageMappingWriter.h
    llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
    llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
    llvm/tools/llvm-cov/TestingSupport.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/docs/CommandGuide/llvm-cov.rst b/llvm/docs/CommandGuide/llvm-cov.rst
index f8b9a399ac8a95..44bb135371512a 100644
--- a/llvm/docs/CommandGuide/llvm-cov.rst
+++ b/llvm/docs/CommandGuide/llvm-cov.rst
@@ -543,3 +543,25 @@ OPTIONS
 
  Fail if an object file cannot be found for a binary ID present in the profile,
  neither on the command line nor via binary ID lookup.
+
+CONVERT-FOR-TESTING COMMAND
+---------------------------
+
+.. warning::
+  This command is for the LLVM developers who are working on ``llvm-cov`` only.
+
+SYNOPSIS
+^^^^^^^^
+
+:program:`llvm-cov convert-for-testing` *BIN* -o *OUT*
+
+DESCRIPTION
+^^^^^^^^^^^
+
+The :program:`llvm-cov convert-for-testing` command serves the purpose of
+testing `llvm-cov` itself. It can extract all code coverage data from the
+binary *BIN* to the file *OUT*, thereby reducing the size of test files. The
+output file typically bears the :program:`.covmapping` extension.
+
+The :program:`.covmapping` files can be read back by ``llvm-cov`` just as
+ordinary binary files.

diff  --git a/llvm/docs/CoverageMappingFormat.rst b/llvm/docs/CoverageMappingFormat.rst
index 0dc5ce6e808ada..7891de008aec85 100644
--- a/llvm/docs/CoverageMappingFormat.rst
+++ b/llvm/docs/CoverageMappingFormat.rst
@@ -318,7 +318,7 @@ In version 2, the function record for *foo* was defined as follows:
 Coverage Mapping Header:
 ------------------------
 
-The coverage mapping header has the following fields:
+As shown above, the coverage mapping header has the following fields:
 
 * The number of function records affixed to the coverage header. Always 0, but present for backwards compatibility.
 
@@ -326,7 +326,7 @@ The coverage mapping header has the following fields:
 
 * The length of the string in the third field of *__llvm_coverage_mapping* that contains any encoded coverage mapping data affixed to the coverage header. Always 0, but present for backwards compatibility.
 
-* The format version. The current version is 4 (encoded as a 3).
+* The format version. The current version is 6 (encoded as a 5).
 
 .. _function records:
 
@@ -610,3 +610,70 @@ The source range record contains the following fields:
 * *columnEnd*: The ending column of the mapping region. If the high bit is set,
   the current mapping region is a gap area. A count for a gap area is only used
   as the line execution count if there are no other regions on a line.
+
+Testing Format
+==============
+
+.. warning::
+  This section is for the LLVM developers who are working on ``llvm-cov`` only.
+
+``llvm-cov`` uses a special file format (called ``.covmapping`` below) for
+testing purposes. This format is private and should have no use for general
+users. As a developer, you can get such files by the ``convert-for-testing``
+subcommand of ``llvm-cov``.
+
+The structure of the ``.covmapping`` files follows:
+
+``[magicNumber : u64, version : u64, profileNames, coverageMapping, coverageRecords]``
+
+Magic Number and Version
+------------------------
+
+The magic is ``0x6d766f636d766c6c``, which is the ASCII string
+``llvmcovm`` in little-endian.
+
+There are two versions for now:
+
+- Version1, encoded as ``0x6174616474736574`` (ASCII string ``testdata``).
+- Version2, encoded as 1.
+
+The only 
diff erence between Version1 and Version2 is in the encoding of the
+``coverageMapping`` fields, which is explained later.
+
+Profile Names
+------------
+
+``profileNames``, ``coverageMapping`` and ``coverageRecords`` are 3 sections
+extracted from the original binary file.
+
+``profileNames`` encodes the size, address and the raw data of the section:
+
+``[profileNamesSize : LEB128, profileNamesAddr : LEB128, profileNamesData : bytes]``
+
+Coverage Mapping
+---------------
+
+This field is padded with zero bytes to make it 8-byte aligned.
+
+``coverageMapping`` contains the records of the source files. In version 1,
+only one record is stored:
+
+``[padding : bytes, coverageMappingData : bytes]``
+
+Version 2 relaxes this restriction by encoding the size of
+``coverageMappingData`` as a LEB128 number before the data:
+
+``[coverageMappingSize : LEB128, padding : bytes, coverageMappingData : bytes]``
+
+The current version is 2.
+
+Coverage Records
+---------------
+
+This field is padded with zero bytes to make it 8-byte aligned.
+
+``coverageRecords`` is encoded as:
+
+``[padding : bytes, coverageRecordsData : bytes]``
+
+The rest data in the file is considered as the ``coverageRecordsData``.

diff  --git a/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h b/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
index 3c8f940ba97b7e..58f4fbd4953ba8 100644
--- a/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
+++ b/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
@@ -1027,6 +1027,20 @@ enum CovMapVersion {
   CurrentVersion = INSTR_PROF_COVMAP_VERSION
 };
 
+// Correspond to "llvmcovm", in little-endian.
+constexpr uint64_t TestingFormatMagic = 0x6d766f636d766c6c;
+
+enum class TestingFormatVersion : uint64_t {
+  // The first version's number corresponds to the string "testdata" in
+  // little-endian. This is for a historical reason.
+  Version1 = 0x6174616474736574,
+  // Version1 has a defect that it can't store multiple file records. Version2
+  // fix this problem by adding a new field before the file records section.
+  Version2 = 1,
+  // The current testing format version is Version2.
+  CurrentVersion = Version2
+};
+
 template <int CovMapVersion, class IntPtrT> struct CovMapTraits {
   using CovMapFuncRecordType = CovMapFunctionRecordV3;
   using NameRefType = uint64_t;

diff  --git a/llvm/include/llvm/ProfileData/Coverage/CoverageMappingWriter.h b/llvm/include/llvm/ProfileData/Coverage/CoverageMappingWriter.h
index 14206755881b92..02848deaba9db1 100644
--- a/llvm/include/llvm/ProfileData/Coverage/CoverageMappingWriter.h
+++ b/llvm/include/llvm/ProfileData/Coverage/CoverageMappingWriter.h
@@ -54,6 +54,27 @@ class CoverageMappingWriter {
   void write(raw_ostream &OS);
 };
 
+/// Writer for the coverage mapping testing format.
+class TestingFormatWriter {
+  uint64_t ProfileNamesAddr;
+  StringRef ProfileNamesData;
+  StringRef CoverageMappingData;
+  StringRef CoverageRecordsData;
+
+public:
+  TestingFormatWriter(uint64_t ProfileNamesAddr, StringRef ProfileNamesData,
+                      StringRef CoverageMappingData,
+                      StringRef CoverageRecordsData)
+      : ProfileNamesAddr(ProfileNamesAddr), ProfileNamesData(ProfileNamesData),
+        CoverageMappingData(CoverageMappingData),
+        CoverageRecordsData(CoverageRecordsData) {}
+
+  /// Encode to the given output stream.
+  void
+  write(raw_ostream &OS,
+        TestingFormatVersion Version = TestingFormatVersion::CurrentVersion);
+};
+
 } // end namespace coverage
 
 } // end namespace llvm

diff  --git a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
index 05737323314a8a..bee9c8d3fce745 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
@@ -822,8 +822,6 @@ static Error readCoverageMappingData(
   return Error::success();
 }
 
-static const char *TestingFormatMagic = "llvmcovmtestdata";
-
 Expected<std::unique_ptr<BinaryCoverageReader>>
 BinaryCoverageReader::createCoverageReaderFromBuffer(
     StringRef Coverage, FuncRecordsStorage &&FuncRecords,
@@ -865,7 +863,16 @@ loadTestingFormat(StringRef Data, StringRef CompilationDir) {
   uint8_t BytesInAddress = 8;
   support::endianness Endian = support::endianness::little;
 
-  Data = Data.substr(StringRef(TestingFormatMagic).size());
+  // Read the magic and version.
+  Data = Data.substr(sizeof(TestingFormatMagic));
+  if (Data.size() < sizeof(uint64_t))
+    return make_error<CoverageMapError>(coveragemap_error::malformed);
+  auto TestingVersion =
+      support::endian::byte_swap<uint64_t, support::endianness::little>(
+          *reinterpret_cast<const uint64_t *>(Data.data()));
+  Data = Data.substr(sizeof(uint64_t));
+
+  // Read the ProfileNames data.
   if (Data.empty())
     return make_error<CoverageMapError>(coveragemap_error::truncated);
   unsigned N = 0;
@@ -886,8 +893,23 @@ loadTestingFormat(StringRef Data, StringRef CompilationDir) {
   if (Error E = ProfileNames.create(Data.substr(0, ProfileNamesSize), Address))
     return std::move(E);
   Data = Data.substr(ProfileNamesSize);
+
+  // In Version2, the size of CoverageMapping is stored directly.
+  uint64_t CoverageMappingSize;
+  if (TestingVersion == uint64_t(TestingFormatVersion::Version2)) {
+    N = 0;
+    CoverageMappingSize = decodeULEB128(Data.bytes_begin(), &N);
+    if (N > Data.size())
+      return make_error<CoverageMapError>(coveragemap_error::malformed);
+    Data = Data.substr(N);
+    if (CoverageMappingSize < sizeof(CovMapHeader))
+      return make_error<CoverageMapError>(coveragemap_error::malformed);
+  } else if (TestingVersion != uint64_t(TestingFormatVersion::Version1)) {
+    return make_error<CoverageMapError>(coveragemap_error::unsupported_version);
+  }
+
   // Skip the padding bytes because coverage map data has an alignment of 8.
-  size_t Pad = offsetToAlignedAddr(Data.data(), Align(8));
+  auto Pad = offsetToAlignedAddr(Data.data(), Align(8));
   if (Data.size() < Pad)
     return make_error<CoverageMapError>(coveragemap_error::malformed);
   Data = Data.substr(Pad);
@@ -895,32 +917,38 @@ loadTestingFormat(StringRef Data, StringRef CompilationDir) {
     return make_error<CoverageMapError>(coveragemap_error::malformed);
   auto const *CovHeader = reinterpret_cast<const CovMapHeader *>(
       Data.substr(0, sizeof(CovMapHeader)).data());
-  CovMapVersion Version =
-      (CovMapVersion)CovHeader->getVersion<support::endianness::little>();
-  StringRef CoverageMapping;
-  BinaryCoverageReader::FuncRecordsStorage CoverageRecords;
+  auto Version =
+      CovMapVersion(CovHeader->getVersion<support::endianness::little>());
+
+  // In Version1, the size of CoverageMapping is calculated.
+  if (TestingVersion == uint64_t(TestingFormatVersion::Version1)) {
+    if (Version < CovMapVersion::Version4) {
+      CoverageMappingSize = Data.size();
+    } else {
+      auto FilenamesSize =
+          CovHeader->getFilenamesSize<support::endianness::little>();
+      CoverageMappingSize = sizeof(CovMapHeader) + FilenamesSize;
+    }
+  }
+
+  auto CoverageMapping = Data.substr(0, CoverageMappingSize);
+  Data = Data.substr(CoverageMappingSize);
+
+  // Read the CoverageRecords data.
   if (Version < CovMapVersion::Version4) {
-    CoverageMapping = Data;
-    if (CoverageMapping.empty())
-      return make_error<CoverageMapError>(coveragemap_error::truncated);
-    CoverageRecords = MemoryBuffer::getMemBuffer("");
+    if (!Data.empty())
+      return make_error<CoverageMapError>(coveragemap_error::malformed);
   } else {
-    uint32_t FilenamesSize =
-        CovHeader->getFilenamesSize<support::endianness::little>();
-    uint32_t CoverageMappingSize = sizeof(CovMapHeader) + FilenamesSize;
-    CoverageMapping = Data.substr(0, CoverageMappingSize);
-    if (CoverageMapping.empty())
-      return make_error<CoverageMapError>(coveragemap_error::truncated);
-    Data = Data.substr(CoverageMappingSize);
     // Skip the padding bytes because coverage records data has an alignment
     // of 8.
     Pad = offsetToAlignedAddr(Data.data(), Align(8));
     if (Data.size() < Pad)
       return make_error<CoverageMapError>(coveragemap_error::malformed);
-    CoverageRecords = MemoryBuffer::getMemBuffer(Data.substr(Pad));
-    if (CoverageRecords->getBufferSize() == 0)
-      return make_error<CoverageMapError>(coveragemap_error::truncated);
+    Data = Data.substr(Pad);
   }
+  BinaryCoverageReader::FuncRecordsStorage CoverageRecords =
+      MemoryBuffer::getMemBuffer(Data);
+
   return BinaryCoverageReader::createCoverageReaderFromBuffer(
       CoverageMapping, std::move(CoverageRecords), std::move(ProfileNames),
       BytesInAddress, Endian, CompilationDir);
@@ -1081,14 +1109,19 @@ BinaryCoverageReader::create(
     StringRef CompilationDir, SmallVectorImpl<object::BuildIDRef> *BinaryIDs) {
   std::vector<std::unique_ptr<BinaryCoverageReader>> Readers;
 
-  if (ObjectBuffer.getBuffer().startswith(TestingFormatMagic)) {
-    // This is a special format used for testing.
-    auto ReaderOrErr =
-        loadTestingFormat(ObjectBuffer.getBuffer(), CompilationDir);
-    if (!ReaderOrErr)
-      return ReaderOrErr.takeError();
-    Readers.push_back(std::move(ReaderOrErr.get()));
-    return std::move(Readers);
+  if (ObjectBuffer.getBuffer().size() > sizeof(TestingFormatMagic)) {
+    uint64_t Magic =
+        support::endian::byte_swap<uint64_t, support::endianness::little>(
+            *reinterpret_cast<const uint64_t *>(ObjectBuffer.getBufferStart()));
+    if (Magic == TestingFormatMagic) {
+      // This is a special format used for testing.
+      auto ReaderOrErr =
+          loadTestingFormat(ObjectBuffer.getBuffer(), CompilationDir);
+      if (!ReaderOrErr)
+        return ReaderOrErr.takeError();
+      Readers.push_back(std::move(ReaderOrErr.get()));
+      return std::move(Readers);
+    }
   }
 
   auto BinOrErr = createBinary(ObjectBuffer);

diff  --git a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
index df65032da517c2..23d86bd05697a8 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
@@ -249,3 +249,37 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
   // Ensure that all file ids have at least one mapping region.
   assert(CurrentFileID == (VirtualFileMapping.size() - 1));
 }
+
+void TestingFormatWriter::write(raw_ostream &OS, TestingFormatVersion Version) {
+  auto ByteSwap = [](uint64_t N) {
+    return support::endian::byte_swap<uint64_t, support::endianness::little>(N);
+  };
+
+  // Output a 64bit magic number.
+  auto Magic = ByteSwap(TestingFormatMagic);
+  OS.write(reinterpret_cast<char *>(&Magic), sizeof(Magic));
+
+  // Output a 64bit version field.
+  auto VersionLittle = ByteSwap(uint64_t(Version));
+  OS.write(reinterpret_cast<char *>(&VersionLittle), sizeof(VersionLittle));
+
+  // Output the ProfileNames data.
+  encodeULEB128(ProfileNamesData.size(), OS);
+  encodeULEB128(ProfileNamesAddr, OS);
+  OS << ProfileNamesData;
+
+  // Version2 adds an extra field to indicate the size of the
+  // CoverageMappingData.
+  if (Version == TestingFormatVersion::Version2)
+    encodeULEB128(CoverageMappingData.size(), OS);
+
+  // Coverage mapping data is expected to have an alignment of 8.
+  for (unsigned Pad = offsetToAlignment(OS.tell(), Align(8)); Pad; --Pad)
+    OS.write(uint8_t(0));
+  OS << CoverageMappingData;
+
+  // Coverage records data is expected to have an alignment of 8.
+  for (unsigned Pad = offsetToAlignment(OS.tell(), Align(8)); Pad; --Pad)
+    OS.write(uint8_t(0));
+  OS << CoverageRecordsData;
+}

diff  --git a/llvm/tools/llvm-cov/TestingSupport.cpp b/llvm/tools/llvm-cov/TestingSupport.cpp
index 289a1621660b06..59e9ee402b9162 100644
--- a/llvm/tools/llvm-cov/TestingSupport.cpp
+++ b/llvm/tools/llvm-cov/TestingSupport.cpp
@@ -6,7 +6,9 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "llvm/Object/COFF.h"
 #include "llvm/Object/ObjectFile.h"
+#include "llvm/ProfileData/Coverage/CoverageMappingWriter.h"
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/Support/Alignment.h"
 #include "llvm/Support/CommandLine.h"
@@ -51,6 +53,28 @@ int convertForTestingMain(int argc, const char *argv[]) {
   int FoundSectionCount = 0;
   SectionRef ProfileNames, CoverageMapping, CoverageRecords;
   auto ObjFormat = OF->getTripleObjectFormat();
+
+  auto ProfileNamesSection = getInstrProfSectionName(IPSK_name, ObjFormat,
+                                                     /*AddSegmentInfo=*/false);
+  auto CoverageMappingSection =
+      getInstrProfSectionName(IPSK_covmap, ObjFormat, /*AddSegmentInfo=*/false);
+  auto CoverageRecordsSection =
+      getInstrProfSectionName(IPSK_covfun, ObjFormat, /*AddSegmentInfo=*/false);
+  if (isa<object::COFFObjectFile>(OF)) {
+    // On COFF, the object file section name may end in "$M". This tells the
+    // linker to sort these sections between "$A" and "$Z". The linker removes
+    // the dollar and everything after it in the final binary. Do the same to
+    // match.
+    auto Strip = [](std::string &Str) {
+      auto Pos = Str.find('$');
+      if (Pos != std::string::npos)
+        Str.resize(Pos);
+    };
+    Strip(ProfileNamesSection);
+    Strip(CoverageMappingSection);
+    Strip(CoverageRecordsSection);
+  }
+
   for (const auto &Section : OF->sections()) {
     StringRef Name;
     if (Expected<StringRef> NameOrErr = Section.getName()) {
@@ -60,16 +84,13 @@ int convertForTestingMain(int argc, const char *argv[]) {
       return 1;
     }
 
-    if (Name == llvm::getInstrProfSectionName(IPSK_name, ObjFormat,
-                                              /*AddSegmentInfo=*/false)) {
+    if (Name == ProfileNamesSection)
       ProfileNames = Section;
-    } else if (Name == llvm::getInstrProfSectionName(
-                           IPSK_covmap, ObjFormat, /*AddSegmentInfo=*/false)) {
+    else if (Name == CoverageMappingSection)
       CoverageMapping = Section;
-    } else if (Name == llvm::getInstrProfSectionName(
-                           IPSK_covfun, ObjFormat, /*AddSegmentInfo=*/false)) {
+    else if (Name == CoverageRecordsSection)
       CoverageRecords = Section;
-    } else
+    else
       continue;
     ++FoundSectionCount;
   }
@@ -106,19 +127,11 @@ int convertForTestingMain(int argc, const char *argv[]) {
     return 1;
   }
 
+  coverage::TestingFormatWriter Writer(ProfileNamesAddress, ProfileNamesData,
+                                       CoverageMappingData,
+                                       CoverageRecordsData);
   raw_fd_ostream OS(FD, true);
-  OS << "llvmcovmtestdata";
-  encodeULEB128(ProfileNamesData.size(), OS);
-  encodeULEB128(ProfileNamesAddress, OS);
-  OS << ProfileNamesData;
-  // Coverage mapping data is expected to have an alignment of 8.
-  for (unsigned Pad = offsetToAlignment(OS.tell(), Align(8)); Pad; --Pad)
-    OS.write(uint8_t(0));
-  OS << CoverageMappingData;
-  // Coverage records data is expected to have an alignment of 8.
-  for (unsigned Pad = offsetToAlignment(OS.tell(), Align(8)); Pad; --Pad)
-    OS.write(uint8_t(0));
-  OS << CoverageRecordsData;
+  Writer.write(OS);
 
   return 0;
 }


        


More information about the llvm-commits mailing list