[llvm] 1d9c7c4 - Increase performance of llvm-gsymutil by up to 200%.

Greg Clayton via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 1 13:48:14 PDT 2023


Author: Greg Clayton
Date: 2023-08-01T13:48:06-07:00
New Revision: 1d9c7c4161b9782ed3d0eae0ab776bc25678fd01

URL: https://github.com/llvm/llvm-project/commit/1d9c7c4161b9782ed3d0eae0ab776bc25678fd01
DIFF: https://github.com/llvm/llvm-project/commit/1d9c7c4161b9782ed3d0eae0ab776bc25678fd01.diff

LOG: Increase performance of llvm-gsymutil by up to 200%.

llvm-gsymutil was maintaining an address ranges collection behind a mutex and having the multi-threaded code access this and hold the mutex was causing slowdown when converting DWARF to GSYM. This patch does the following:
- removes the "Ranges" variable from the GsymCreator and any functions and places that used it
- clients don't try to detect if a function has been added for an address range, we now remove any inferior copies of information in the GsymCreator::finalize() routine as was done before, we just have more items to remove, though performance is greator due to less mutex thread locking
- after I started adding all of the inferior funtion info objects the previous patch that tried to remove infrior debug info had bugs in it, so I replace the removeIfBinary() function in GsymCreator with a more efficient and easier to debug way to do things which copies items from the GsymCreator::Funcs into a new vector of FunctionInfo objects and then replaces GsymCreator::Funcs at the end.
- Sorting of FunctionInfo objects has been modified to also compare InlineInfo objects. We found cases where LTO was ruining inline function address ranges and we ended up with a variety of FunctionInfo objects for the same range that had varying amounts of valid debug info. This patch now ensure that two function info objects with different inline info for the same function address range, the best one will be picked to ensure the greatest fidelity.
- If we detect that a DW_TAG_subprogram has inline functions and after parsing it, we don't end up with any valid inline information, we set the optional to std::nullopt to avoid emitting empty inline information and wasting space.

My tests show a 200% perf increase on M1 macs and a 100% performance increase on linux machines for the same complex large DWARF input binary.

Differential Revision: https://reviews.llvm.org/D156773

Added: 
    

Modified: 
    llvm/include/llvm/DebugInfo/GSYM/FunctionInfo.h
    llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h
    llvm/include/llvm/DebugInfo/GSYM/InlineInfo.h
    llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
    llvm/lib/DebugInfo/GSYM/GsymCreator.cpp
    llvm/lib/DebugInfo/GSYM/InlineInfo.cpp
    llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
    llvm/test/tools/llvm-gsymutil/ARM_AArch64/fat-macho-dwarf.yaml
    llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml
    llvm/test/tools/llvm-gsymutil/X86/mach-dwarf.yaml
    llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/DebugInfo/GSYM/FunctionInfo.h b/llvm/include/llvm/DebugInfo/GSYM/FunctionInfo.h
index cf917bf294cfc3..b4d7b47e8dbcff 100644
--- a/llvm/include/llvm/DebugInfo/GSYM/FunctionInfo.h
+++ b/llvm/include/llvm/DebugInfo/GSYM/FunctionInfo.h
@@ -202,21 +202,27 @@ inline bool operator==(const FunctionInfo &LHS, const FunctionInfo &RHS) {
 inline bool operator!=(const FunctionInfo &LHS, const FunctionInfo &RHS) {
   return !(LHS == RHS);
 }
-/// This sorting will order things consistently by address range first, but then
-/// followed by inlining being valid and line tables. We might end up with a
-/// FunctionInfo from debug info that will have the same range as one from the
-/// symbol table, but we want to quickly be able to sort and use the best version
-/// when creating the final GSYM file.
+/// This sorting will order things consistently by address range first, but
+/// then followed by increasing levels of debug info like inline information
+/// and line tables. We might end up with a FunctionInfo from debug info that
+/// will have the same range as one from the symbol table, but we want to
+/// quickly be able to sort and use the best version when creating the final
+/// GSYM file. This function compares the inline information as we have seen
+/// cases where LTO can generate a wide array of 
diff ering inline information,
+/// mostly due to messing up the address ranges for inlined functions, so the
+/// inline information with the most entries will appeear last. If the inline
+/// information match, either by both function infos not having any or both
+/// being exactly the same, we will then compare line tables. Comparing line
+/// tables allows the entry with the most line entries to appear last. This
+/// ensures we are able to save the FunctionInfo with the most debug info into
+/// the GSYM file.
 inline bool operator<(const FunctionInfo &LHS, const FunctionInfo &RHS) {
   // First sort by address range
   if (LHS.Range != RHS.Range)
     return LHS.Range < RHS.Range;
-
-  // Then sort by inline
-  if (LHS.Inline.has_value() != RHS.Inline.has_value())
-    return RHS.Inline.has_value();
-
-  return LHS.OptLineTable < RHS.OptLineTable;
+  if (LHS.Inline == RHS.Inline)
+    return LHS.OptLineTable < RHS.OptLineTable;
+  return LHS.Inline < RHS.Inline;
 }
 
 raw_ostream &operator<<(raw_ostream &OS, const FunctionInfo &R);

diff  --git a/llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h b/llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h
index bca3a83cc68504..fd9204dba80ef3 100644
--- a/llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h
+++ b/llvm/include/llvm/DebugInfo/GSYM/GsymCreator.h
@@ -142,8 +142,8 @@ class GsymCreator {
   std::vector<llvm::gsym::FileEntry> Files;
   std::vector<uint8_t> UUID;
   std::optional<AddressRanges> ValidTextRanges;
-  AddressRanges Ranges;
   std::optional<uint64_t> BaseAddress;
+  bool IsSegment = false;
   bool Finalized = false;
   bool Quiet;
 
@@ -282,6 +282,15 @@ class GsymCreator {
                            llvm::support::endianness ByteOrder,
                            uint64_t SegmentSize) const;
 
+  /// Let this creator know that this is a segment of another GsymCreator.
+  ///
+  /// When we have a segment, we know that function infos will be added in
+  /// ascending address range order without having to be finalized. We also
+  /// don't need to sort and unique entries during the finalize function call.
+  void setIsSegment() {
+    IsSegment = true;
+  }
+
 public:
   GsymCreator(bool Quiet = false);
 
@@ -379,17 +388,6 @@ class GsymCreator {
   /// object.
   size_t getNumFunctionInfos() const;
 
-  /// Check if an address has already been added as a function info.
-  ///
-  /// FunctionInfo data can come from many sources: debug info, symbol tables,
-  /// exception information, and more. Symbol tables should be added after
-  /// debug info and can use this function to see if a symbol's start address
-  /// has already been added to the GsymReader. Calling this before adding
-  /// a function info from a source other than debug info avoids clients adding
-  /// many redundant FunctionInfo objects from many sources only for them to be
-  /// removed during the finalize() call.
-  bool hasFunctionInfoForAddress(uint64_t Addr) const;
-
   /// Set valid .text address ranges that all functions must be contained in.
   void SetValidTextRanges(AddressRanges &TextRanges) {
     ValidTextRanges = TextRanges;

diff  --git a/llvm/include/llvm/DebugInfo/GSYM/InlineInfo.h b/llvm/include/llvm/DebugInfo/GSYM/InlineInfo.h
index 03bc85396d99f3..759ed895d48983 100644
--- a/llvm/include/llvm/DebugInfo/GSYM/InlineInfo.h
+++ b/llvm/include/llvm/DebugInfo/GSYM/InlineInfo.h
@@ -164,6 +164,17 @@ struct InlineInfo {
   /// \returns An error object that indicates success or failure or the
   /// encoding process.
   llvm::Error encode(FileWriter &O, uint64_t BaseAddr) const;
+
+  /// Compare InlineInfo objects.
+  ///
+  /// When comparing InlineInfo objects the item with the most inline functions
+  /// wins. If we have two FunctionInfo objects that both have the same address
+  /// range and both have valid InlineInfo objects, we want the one with the
+  /// most inline functions to win so we save the most information possible
+  /// to the GSYM file. We have seen cases where LTO messes up the inline
+  /// function information for the same address range, so this helps ensure we
+  /// get the most descriptive information we can for an address range.
+  bool operator<(const InlineInfo &RHS) const;
 };
 
 inline bool operator==(const InlineInfo &LHS, const InlineInfo &RHS) {

diff  --git a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
index 2b5420fb5263e4..a3f1a3704ef399 100644
--- a/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/DwarfTransformer.cpp
@@ -215,8 +215,6 @@ static void parseInlineInfo(GsymCreator &Gsym, raw_ostream &Log, CUInfo &CUI,
   if (Tag == dwarf::DW_TAG_inlined_subroutine) {
     // create new InlineInfo and append to parent.children
     InlineInfo II;
-    DWARFAddressRange FuncRange =
-        DWARFAddressRange(FI.startAddress(), FI.endAddress());
     Expected<DWARFAddressRangesVector> RangesOrError = Die.getAddressRanges();
     if (RangesOrError) {
       for (const DWARFAddressRange &Range : RangesOrError.get()) {
@@ -421,6 +419,23 @@ void DwarfTransformer::handleDie(raw_ostream &OS, CUInfo &CUI, DWARFDie Die) {
         FI.Inline->Name = *NameIndex;
         FI.Inline->Ranges.insert(FI.Range);
         parseInlineInfo(Gsym, OS, CUI, Die, 0, FI, *FI.Inline);
+        // Make sure we at least got some valid inline info other than just
+        // the top level function. If we didn't then remove the inline info
+        // from the function info. We have seen cases where LTO tries to modify
+        // the DWARF for functions and it messes up the address ranges for
+        // the inline functions so it is no longer valid.
+        //
+        // By checking if there are any valid children on the top level inline
+        // information object, we will know if we got anything valid from the
+        // debug info.
+        if (FI.Inline->Children.empty()) {
+          if (!Gsym.isQuiet()) {
+            OS << "warning: DIE contains inline function information that has "
+                  "no valid ranges, removing inline information:\n";
+            Die.dump(OS, 0, DIDumpOptions::getForSingleDIE());
+          }
+          FI.Inline = std::nullopt;
+        }
       }
       Gsym.addFunctionInfo(std::move(FI));
     }

diff  --git a/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp b/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp
index 60b6dbc6a12d2a..2cb2419bb3f81d 100644
--- a/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp
+++ b/llvm/lib/DebugInfo/GSYM/GsymCreator.cpp
@@ -187,35 +187,12 @@ llvm::Error GsymCreator::encode(FileWriter &O) const {
   return ErrorSuccess();
 }
 
-// Similar to std::remove_if, but the predicate is binary and it is passed both
-// the previous and the current element.
-template <class ForwardIt, class BinaryPredicate>
-static ForwardIt removeIfBinary(ForwardIt FirstIt, ForwardIt LastIt,
-                                BinaryPredicate Pred) {
-  if (FirstIt != LastIt) {
-    auto PrevIt = FirstIt++;
-    FirstIt = std::find_if(FirstIt, LastIt, [&](const auto &Curr) {
-      return Pred(*PrevIt++, Curr);
-    });
-    if (FirstIt != LastIt)
-      for (ForwardIt CurrIt = FirstIt; ++CurrIt != LastIt;)
-        if (!Pred(*PrevIt, *CurrIt)) {
-          PrevIt = FirstIt;
-          *FirstIt++ = std::move(*CurrIt);
-        }
-  }
-  return FirstIt;
-}
-
 llvm::Error GsymCreator::finalize(llvm::raw_ostream &OS) {
   std::lock_guard<std::mutex> Guard(Mutex);
   if (Finalized)
     return createStringError(std::errc::invalid_argument, "already finalized");
   Finalized = true;
 
-  // Sort function infos so we can emit sorted functions.
-  llvm::sort(Funcs);
-
   // Don't let the string table indexes change by finalizing in order.
   StrTab.finalizeInOrder();
 
@@ -239,83 +216,87 @@ llvm::Error GsymCreator::finalize(llvm::raw_ostream &OS) {
   // Note that in case of (b), we cannot include Y in the result because then
   // we wouldn't find any function for range (end of Y, end of X)
   // with binary search
-  auto NumBefore = Funcs.size();
-  Funcs.erase(
-      removeIfBinary(Funcs.begin(), Funcs.end(),
-                     [&](const auto &Prev, const auto &Curr) {
-                       // Empty ranges won't intersect, but we still need to
-                       // catch the case where we have multiple symbols at the
-                       // same address and coalesce them.
-                       const bool ranges_equal = Prev.Range == Curr.Range;
-                       if (ranges_equal || Prev.Range.intersects(Curr.Range)) {
-                         // Overlapping ranges or empty identical ranges.
-                         if (ranges_equal) {
-                           // Same address range. Check if one is from debug
-                           // info and the other is from a symbol table. If
-                           // so, then keep the one with debug info. Our
-                           // sorting guarantees that entries with matching
-                           // address ranges that have debug info are last in
-                           // the sort.
-                           if (Prev == Curr) {
-                             // FunctionInfo entries match exactly (range,
-                             // lines, inlines)
-
-                             // We used to output a warning here, but this was
-                             // so frequent on some binaries, in particular
-                             // when those were built with GCC, that it slowed
-                             // down processing extremely.
-                             return true;
-                           } else {
-                             if (!Prev.hasRichInfo() && Curr.hasRichInfo()) {
-                               // Same address range, one with no debug info
-                               // (symbol) and the next with debug info. Keep
-                               // the latter.
-                               return true;
-                             } else {
-                               if (!Quiet) {
-                                 OS << "warning: same address range contains "
-                                       "
diff erent debug "
-                                    << "info. Removing:\n"
-                                    << Prev << "\nIn favor of this one:\n"
-                                    << Curr << "\n";
-                               }
-                               return true;
-                             }
-                           }
-                         } else {
-                           if (!Quiet) { // print warnings about overlaps
-                             OS << "warning: function ranges overlap:\n"
-                                << Prev << "\n"
-                                << Curr << "\n";
-                           }
-                         }
-                       } else if (Prev.Range.size() == 0 &&
-                                  Curr.Range.contains(Prev.Range.start())) {
-                         if (!Quiet) {
-                           OS << "warning: removing symbol:\n"
-                              << Prev << "\nKeeping:\n"
-                              << Curr << "\n";
-                         }
-                         return true;
-                       }
-
-                       return false;
-                     }),
-      Funcs.end());
-
-  // If our last function info entry doesn't have a size and if we have valid
-  // text ranges, we should set the size of the last entry since any search for
-  // a high address might match our last entry. By fixing up this size, we can
-  // help ensure we don't cause lookups to always return the last symbol that
-  // has no size when doing lookups.
-  if (!Funcs.empty() && Funcs.back().Range.size() == 0 && ValidTextRanges) {
-    if (auto Range =
-            ValidTextRanges->getRangeThatContains(Funcs.back().Range.start())) {
-      Funcs.back().Range = {Funcs.back().Range.start(), Range->end()};
+
+  const auto NumBefore = Funcs.size();
+  // Only sort and unique if this isn't a segment. If this is a segment we
+  // already finalized the main GsymCreator with all of the function infos
+  // and then the already sorted and uniqued function infos were added to this
+  // object.
+  if (!IsSegment) {
+    if (NumBefore > 1) {
+      // Sort function infos so we can emit sorted functions.
+      llvm::sort(Funcs);
+      std::vector<FunctionInfo> FinalizedFuncs;
+      FinalizedFuncs.reserve(Funcs.size());
+      FinalizedFuncs.emplace_back(std::move(Funcs.front()));
+      for (size_t Idx=1; Idx < NumBefore; ++Idx) {
+        FunctionInfo &Prev = FinalizedFuncs.back();
+        FunctionInfo &Curr = Funcs[Idx];
+        // Empty ranges won't intersect, but we still need to
+        // catch the case where we have multiple symbols at the
+        // same address and coalesce them.
+        const bool ranges_equal = Prev.Range == Curr.Range;
+        if (ranges_equal || Prev.Range.intersects(Curr.Range)) {
+          // Overlapping ranges or empty identical ranges.
+          if (ranges_equal) {
+            // Same address range. Check if one is from debug
+            // info and the other is from a symbol table. If
+            // so, then keep the one with debug info. Our
+            // sorting guarantees that entries with matching
+            // address ranges that have debug info are last in
+            // the sort.
+            if (!(Prev == Curr)) {
+              if (Prev.hasRichInfo() && Curr.hasRichInfo()) {
+                if (!Quiet) {
+                  OS << "warning: same address range contains "
+                        "
diff erent debug "
+                    << "info. Removing:\n"
+                    << Prev << "\nIn favor of this one:\n"
+                    << Curr << "\n";
+                }
+              }
+              // We want to swap the current entry with the previous since
+              // later entries with the same range always have more debug info
+              // or 
diff erent debug info.
+              std::swap(Prev, Curr);
+            }
+          } else {
+            if (!Quiet) { // print warnings about overlaps
+              OS << "warning: function ranges overlap:\n"
+                << Prev << "\n"
+                << Curr << "\n";
+            }
+            FinalizedFuncs.emplace_back(std::move(Curr));
+          }
+        } else {
+          if (Prev.Range.size() == 0 && Curr.Range.contains(Prev.Range.start())) {
+            if (!Quiet) {
+              OS << "warning: removing symbol:\n"
+                << Prev << "\nKeeping:\n"
+                << Curr << "\n";
+            }
+            std::swap(Prev, Curr);
+          } else {
+            FinalizedFuncs.emplace_back(std::move(Curr));
+          }
+        }
+      }
+      std::swap(Funcs, FinalizedFuncs);
     }
+    // If our last function info entry doesn't have a size and if we have valid
+    // text ranges, we should set the size of the last entry since any search for
+    // a high address might match our last entry. By fixing up this size, we can
+    // help ensure we don't cause lookups to always return the last symbol that
+    // has no size when doing lookups.
+    if (!Funcs.empty() && Funcs.back().Range.size() == 0 && ValidTextRanges) {
+      if (auto Range =
+              ValidTextRanges->getRangeThatContains(Funcs.back().Range.start())) {
+        Funcs.back().Range = {Funcs.back().Range.start(), Range->end()};
+      }
+    }
+    OS << "Pruned " << NumBefore - Funcs.size() << " functions, ended with "
+      << Funcs.size() << " total\n";
   }
-  OS << "Pruned " << NumBefore - Funcs.size() << " functions, ended with "
-     << Funcs.size() << " total\n";
   return Error::success();
 }
 
@@ -355,7 +336,6 @@ uint32_t GsymCreator::insertString(StringRef S, bool Copy) {
 
 void GsymCreator::addFunctionInfo(FunctionInfo &&FI) {
   std::lock_guard<std::mutex> Guard(Mutex);
-  Ranges.insert(FI.Range);
   Funcs.emplace_back(std::move(FI));
 }
 
@@ -388,31 +368,24 @@ bool GsymCreator::IsValidTextAddress(uint64_t Addr) const {
   return true; // No valid text ranges has been set, so accept all ranges.
 }
 
-bool GsymCreator::hasFunctionInfoForAddress(uint64_t Addr) const {
-  std::lock_guard<std::mutex> Guard(Mutex);
-  return Ranges.contains(Addr);
-}
-
 std::optional<uint64_t> GsymCreator::getFirstFunctionAddress() const {
-  if (Finalized && !Funcs.empty())
+  // If we have finalized then Funcs are sorted. If we are a segment then
+  // Funcs will be sorted as well since function infos get added from an
+  // already finalized GsymCreator object where its functions were sorted and
+  // uniqued.
+  if ((Finalized || IsSegment) && !Funcs.empty())
     return std::optional<uint64_t>(Funcs.front().startAddress());
-  // This code gets used by the segmentation of GSYM files to help determine the
-  // size of the GSYM header while continually adding new FunctionInfo objects
-  // to this object, so we haven't finalized this object yet.
-  if (Ranges.empty())
-    return std::nullopt;
-  return std::optional<uint64_t>(Ranges.begin()->start());
+  return std::nullopt;
 }
 
 std::optional<uint64_t> GsymCreator::getLastFunctionAddress() const {
-  if (Finalized && !Funcs.empty())
+  // If we have finalized then Funcs are sorted. If we are a segment then
+  // Funcs will be sorted as well since function infos get added from an
+  // already finalized GsymCreator object where its functions were sorted and
+  // uniqued.
+  if ((Finalized || IsSegment) && !Funcs.empty())
     return std::optional<uint64_t>(Funcs.back().startAddress());
-  // This code gets used by the segmentation of GSYM files to help determine the
-  // size of the GSYM header while continually adding new FunctionInfo objects
-  // to this object, so we haven't finalized this object yet.
-  if (Ranges.empty())
-    return std::nullopt;
-  return std::optional<uint64_t>((Ranges.end() - 1)->end());
+  return std::nullopt;
 }
 
 std::optional<uint64_t> GsymCreator::getBaseAddress() const {
@@ -477,7 +450,6 @@ uint64_t GsymCreator::copyFunctionInfo(const GsymCreator &SrcGC, size_t FuncIdx)
   // this GsymCreator and then copy the function info and update the string
   // table offsets to match the new offsets.
   const FunctionInfo &SrcFI = SrcGC.Funcs[FuncIdx];
-  Ranges.insert(SrcFI.Range);
 
   FunctionInfo DstFI;
   DstFI.Range = SrcFI.Range;
@@ -503,7 +475,7 @@ uint64_t GsymCreator::copyFunctionInfo(const GsymCreator &SrcGC, size_t FuncIdx)
     fixupInlineInfo(SrcGC, *DstFI.Inline);
   }
   std::lock_guard<std::mutex> Guard(Mutex);
-  Funcs.push_back(DstFI);
+  Funcs.emplace_back(DstFI);
   return Funcs.back().cacheEncoding();
 }
 
@@ -551,6 +523,10 @@ GsymCreator::createSegment(uint64_t SegmentSize, size_t &FuncIdx) const {
     return std::unique_ptr<GsymCreator>();
 
   std::unique_ptr<GsymCreator> GC(new GsymCreator(/*Quiet=*/true));
+
+  // Tell the creator that this is a segment.
+  GC->setIsSegment();
+
   // Set the base address if there is one.
   if (BaseAddress)
     GC->setBaseAddress(*BaseAddress);

diff  --git a/llvm/lib/DebugInfo/GSYM/InlineInfo.cpp b/llvm/lib/DebugInfo/GSYM/InlineInfo.cpp
index f775ab8fb65c70..ecfb21501eda92 100644
--- a/llvm/lib/DebugInfo/GSYM/InlineInfo.cpp
+++ b/llvm/lib/DebugInfo/GSYM/InlineInfo.cpp
@@ -264,3 +264,14 @@ llvm::Error InlineInfo::encode(FileWriter &O, uint64_t BaseAddr) const {
   }
   return Error::success();
 }
+
+static uint64_t GetTotalNumChildren(const InlineInfo &II) {
+  uint64_t NumChildren = II.Children.size();
+  for (const auto &Child : II.Children)
+    NumChildren += GetTotalNumChildren(Child);
+  return NumChildren;
+}
+
+bool InlineInfo::operator<(const InlineInfo &RHS) const {
+  return GetTotalNumChildren(*this) < GetTotalNumChildren(RHS);
+}

diff  --git a/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp b/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
index ad35aefe777485..a66889a821be09 100644
--- a/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
@@ -92,8 +92,7 @@ llvm::Error ObjectFileTransformer::convert(const object::ObjectFile &Obj,
       return AddrOrErr.takeError();
 
     if (SymType.get() != SymbolRef::Type::ST_Function ||
-        !Gsym.IsValidTextAddress(*AddrOrErr) ||
-        Gsym.hasFunctionInfoForAddress(*AddrOrErr))
+        !Gsym.IsValidTextAddress(*AddrOrErr))
       continue;
     // Function size for MachO files will be 0
     constexpr bool NoCopy = false;

diff  --git a/llvm/test/tools/llvm-gsymutil/ARM_AArch64/fat-macho-dwarf.yaml b/llvm/test/tools/llvm-gsymutil/ARM_AArch64/fat-macho-dwarf.yaml
index 1beee56d477999..b453f20b13ef17 100644
--- a/llvm/test/tools/llvm-gsymutil/ARM_AArch64/fat-macho-dwarf.yaml
+++ b/llvm/test/tools/llvm-gsymutil/ARM_AArch64/fat-macho-dwarf.yaml
@@ -14,24 +14,24 @@
 # ARMV7:      Input file: {{.*\.yaml\.tmp}}
 # ARMV7-NEXT: Output file (armv7): {{.*\.yaml\.tmp\.armv7\.gsym}}
 # ARMV7-NEXT: Loaded 1 functions from DWARF.
-# ARMV7-NEXT: Loaded 0 functions from symbol table.
-# ARMV7-NEXT: Pruned 0 functions, ended with 1 total
+# ARMV7-NEXT: Loaded 1 functions from symbol table.
+# ARMV7: Pruned 1 functions, ended with 1 total
 
 # ARM64:      Input file: {{.*\.yaml\.tmp}}
 # ARM64-NEXT: Output file (arm64): {{.*\.yaml\.tmp\.arm64\.gsym}}
 # ARM64-NEXT: Loaded 1 functions from DWARF.
-# ARM64-NEXT: Loaded 0 functions from symbol table.
-# ARM64-NEXT: Pruned 0 functions, ended with 1 total
+# ARM64-NEXT: Loaded 1 functions from symbol table.
+# ARM64: Pruned 1 functions, ended with 1 total
 
 # FAT:      Input file: {{.*\.yaml\.tmp}}
 # FAT-NEXT: Output file (armv7): {{.*\.yaml\.tmp\.gsym\.armv7}}
 # FAT-NEXT: Loaded 1 functions from DWARF.
-# FAT-NEXT: Loaded 0 functions from symbol table.
-# FAT-NEXT: Pruned 0 functions, ended with 1 total
+# FAT-NEXT: Loaded 1 functions from symbol table.
+# FAT: Pruned 1 functions, ended with 1 total
 # FAT-NEXT: Output file (arm64): {{.*\.yaml\.tmp\.gsym\.arm64}}
 # FAT-NEXT: Loaded 1 functions from DWARF.
-# FAT-NEXT: Loaded 0 functions from symbol table.
-# FAT-NEXT: Pruned 0 functions, ended with 1 total
+# FAT-NEXT: Loaded 1 functions from symbol table.
+# FAT: Pruned 1 functions, ended with 1 total
 
 --- !fat-mach-o
 FatHeader:

diff  --git a/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml
index 4816d074d711e1..78795c62ec3ed4 100644
--- a/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml
+++ b/llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml
@@ -36,8 +36,8 @@
 # CONVERT: Input file: {{.*\.yaml\.tmp}}
 # CONVERT: Output file (x86_64): {{.*\.yaml\.tmp\.gsym}}
 # CONVERT: Loaded 1 functions from DWARF.
-# CONVERT: Loaded 9 functions from symbol table.
-# CONVERT: Pruned 0 functions, ended with 10 total
+# CONVERT: Loaded 10 functions from symbol table.
+# CONVERT: Pruned 1 functions, ended with 10 total
 
 # DUMP:      Header:
 # DUMP-NEXT:   Magic        = 0x4753594d

diff  --git a/llvm/test/tools/llvm-gsymutil/X86/mach-dwarf.yaml b/llvm/test/tools/llvm-gsymutil/X86/mach-dwarf.yaml
index abf77d09eb5ea0..df0edcb36cc824 100644
--- a/llvm/test/tools/llvm-gsymutil/X86/mach-dwarf.yaml
+++ b/llvm/test/tools/llvm-gsymutil/X86/mach-dwarf.yaml
@@ -11,8 +11,8 @@
 # CONVERT: Input file: {{.*\.yaml\.tmp}}
 # CONVERT: Output file (x86_64): {{.*\.yaml\.tmp\.gsym}}
 # CONVERT: Loaded 2 functions from DWARF.
-# CONVERT: Loaded 0 functions from symbol table.
-# CONVERT: Pruned 0 functions, ended with 2 total
+# CONVERT: Loaded 2 functions from symbol table.
+# CONVERT: Pruned 2 functions, ended with 2 total
 
 # ADDR:      Looking up addresses in "{{.*\.yaml\.tmp\.gsym}}":
 # ADDR-NEXT: 0x0000000000000000: error: address 0x0 is not in GSYM

diff  --git a/llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp b/llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp
index c6a613bd670b0f..f3c1b3adb11405 100644
--- a/llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp
+++ b/llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp
@@ -140,6 +140,21 @@ TEST(GSYMTest, TestFunctionInfo) {
   // that has name, size and any number of line table entries
   EXPECT_LT(FISymtab, FIWithLines);
 
+  // Test that if we have a function info without inline info and one with
+  // that the one without inline info is less than the one with.
+  FunctionInfo FIWithInlines = FISymtab;
+  FIWithInlines.Inline = InlineInfo();
+  FIWithInlines.Inline->Ranges.insert(
+      AddressRange(StartAddr, StartAddr + 0x10));
+  EXPECT_LT(FISymtab, FIWithInlines);
+
+  // Test that if we have a function info with inline entries and one more
+  // inline entries that the one with fewer inline functins is less than the
+  // one with more.
+  FunctionInfo FIWithMoreInlines = FIWithInlines;
+  FIWithMoreInlines.Inline->Children.push_back(InlineInfo());
+  EXPECT_LT(FIWithInlines, FIWithMoreInlines);
+
   FunctionInfo FIWithLinesAndInline = FIWithLines;
   FIWithLinesAndInline.Inline = InlineInfo();
   FIWithLinesAndInline.Inline->Ranges.insert(
@@ -1127,7 +1142,7 @@ TEST(GSYMTest, TestGsymCreator8ByteAddrOffsets) {
 static void VerifyFunctionInfo(const GsymReader &GR, uint64_t Addr,
                                const FunctionInfo &FI) {
   auto ExpFI = GR.getFunctionInfo(Addr);
-  ASSERT_TRUE(bool(ExpFI));
+  ASSERT_THAT_EXPECTED(ExpFI, Succeeded());
   ASSERT_EQ(FI, ExpFI.get());
 }
 
@@ -2652,6 +2667,157 @@ TEST(GSYMTest, TestGsymSegmenting) {
   }
 }
 
+TEST(GSYMTest, TestGsymSegmentingNoBase) {
+  // Test creating a GSYM file with function infos and segment the information.
+  // We verify segmenting is working by creating a full GSYM and also by
+  // encoding multiple segments, then we verify that we get the same information
+  // when doing lookups on the full GSYM that was decoded from encoding the
+  // entire GSYM and also by decoding information from the segments themselves.
+  GsymCreator GC;
+  AddFunctionInfo(GC, "main", 0x1000, "/tmp/main.c", "/tmp/main.h");
+  AddFunctionInfo(GC, "foo", 0x2000, "/tmp/foo.c", "/tmp/foo.h");
+  AddFunctionInfo(GC, "bar", 0x3000, "/tmp/bar.c", "/tmp/bar.h");
+  AddFunctionInfo(GC, "baz", 0x4000, "/tmp/baz.c", "/tmp/baz.h");
+  Expected<GsymReader> GR = FinalizeEncodeAndDecode(GC);
+  ASSERT_THAT_EXPECTED(GR, Succeeded());
+  //GR->dump(outs());
+
+  // Create segmented GSYM files where each file contains 1 function. We will
+  // then test doing lookups on the "GR", or the full GSYM file and then test
+  // doing lookups on the GsymReader objects for each segment to ensure we get
+  // the exact same information. So after all of the code below we will have
+  // GsymReader objects that each contain one function. We name the creators
+  // and readers to match the one and only address they contain.
+  // GC1000 and GR1000 are for [0x1000-0x1030)
+  // GC2000 and GR2000 are for [0x2000-0x2030)
+  // GC3000 and GR3000 are for [0x3000-0x3030)
+  // GC4000 and GR4000 are for [0x4000-0x4030)
+
+  // Create the segments and verify that FuncIdx, an in/out parameter, gets
+  // updated as expected.
+  size_t FuncIdx = 0;
+  // Make sure we get an error if the segment size is too small to encode a
+  // single function info.
+  llvm::Expected<std::unique_ptr<GsymCreator>> GCError =
+      GC.createSegment(57, FuncIdx);
+  ASSERT_FALSE((bool)GCError);
+  checkError("a segment size of 57 is to small to fit any function infos, "
+             "specify a larger value", GCError.takeError());
+  // Make sure that the function index didn't get incremented when we didn't
+  // encode any values into the segmented GsymCreator.
+  ASSERT_EQ(FuncIdx, (size_t)0);
+
+  llvm::Expected<std::unique_ptr<GsymCreator>> GC1000 =
+      GC.createSegment(128, FuncIdx);
+  ASSERT_THAT_EXPECTED(GC1000, Succeeded());
+  ASSERT_EQ(FuncIdx, (size_t)1);
+  llvm::Expected<std::unique_ptr<GsymCreator>> GC2000 =
+      GC.createSegment(128, FuncIdx);
+  ASSERT_THAT_EXPECTED(GC2000, Succeeded());
+  ASSERT_EQ(FuncIdx, (size_t)2);
+  llvm::Expected<std::unique_ptr<GsymCreator>> GC3000 =
+      GC.createSegment(128, FuncIdx);
+  ASSERT_THAT_EXPECTED(GC3000, Succeeded());
+  ASSERT_EQ(FuncIdx, (size_t)3);
+  llvm::Expected<std::unique_ptr<GsymCreator>> GC4000 =
+      GC.createSegment(128, FuncIdx);
+  ASSERT_THAT_EXPECTED(GC4000, Succeeded());
+  ASSERT_EQ(FuncIdx, (size_t)4);
+  // When there are no function infos left to encode we expect to get  no error
+  // and get a NULL GsymCreator in the return value from createSegment.
+  llvm::Expected<std::unique_ptr<GsymCreator>> GCNull =
+      GC.createSegment(128, FuncIdx);
+  ASSERT_THAT_EXPECTED(GCNull, Succeeded());
+  ASSERT_TRUE(GC1000.get() != nullptr);
+  ASSERT_TRUE(GC2000.get() != nullptr);
+  ASSERT_TRUE(GC3000.get() != nullptr);
+  ASSERT_TRUE(GC4000.get() != nullptr);
+  ASSERT_TRUE(GCNull.get() == nullptr);
+  // Encode and decode the GsymReader for each segment and verify they succeed.
+  Expected<GsymReader> GR1000 = FinalizeEncodeAndDecode(*GC1000.get());
+  ASSERT_THAT_EXPECTED(GR1000, Succeeded());
+  Expected<GsymReader> GR2000 = FinalizeEncodeAndDecode(*GC2000.get());
+  ASSERT_THAT_EXPECTED(GR2000, Succeeded());
+  Expected<GsymReader> GR3000 = FinalizeEncodeAndDecode(*GC3000.get());
+  ASSERT_THAT_EXPECTED(GR3000, Succeeded());
+  Expected<GsymReader> GR4000 = FinalizeEncodeAndDecode(*GC4000.get());
+  ASSERT_THAT_EXPECTED(GR4000, Succeeded());
+
+  // Verify that all lookups match the range [0x1000-0x1030) when doing lookups
+  // in the GsymReader that contains all functions and from the segmented
+  // GsymReader in GR1000.
+  for (uint64_t Addr = 0x1000; Addr < 0x1030; ++Addr) {
+    // Lookup in the main GsymReader that contains all function infos
+    auto MainLR = GR->lookup(Addr);
+    ASSERT_THAT_EXPECTED(MainLR, Succeeded());
+    auto SegmentLR = GR1000->lookup(Addr);
+    ASSERT_THAT_EXPECTED(SegmentLR, Succeeded());
+    // Make sure the lookup results match.
+    EXPECT_EQ(MainLR.get(), SegmentLR.get());
+    // Make sure that the lookups on the functions that are not in the segment
+    // fail as expected.
+    ASSERT_THAT_EXPECTED(GR1000->lookup(0x2000), Failed());
+    ASSERT_THAT_EXPECTED(GR1000->lookup(0x3000), Failed());
+    ASSERT_THAT_EXPECTED(GR1000->lookup(0x4000), Failed());
+  }
+
+  // Verify that all lookups match the range [0x2000-0x2030) when doing lookups
+  // in the GsymReader that contains all functions and from the segmented
+  // GsymReader in GR2000.
+  for (uint64_t Addr = 0x2000; Addr < 0x2030; ++Addr) {
+    // Lookup in the main GsymReader that contains all function infos
+    auto MainLR = GR->lookup(Addr);
+    ASSERT_THAT_EXPECTED(MainLR, Succeeded());
+    auto SegmentLR = GR2000->lookup(Addr);
+    ASSERT_THAT_EXPECTED(SegmentLR, Succeeded());
+    // Make sure the lookup results match.
+    EXPECT_EQ(MainLR.get(), SegmentLR.get());
+    // Make sure that the lookups on the functions that are not in the segment
+    // fail as expected.
+    ASSERT_THAT_EXPECTED(GR2000->lookup(0x1000), Failed());
+    ASSERT_THAT_EXPECTED(GR2000->lookup(0x3000), Failed());
+    ASSERT_THAT_EXPECTED(GR2000->lookup(0x4000), Failed());
+
+  }
+
+  // Verify that all lookups match the range [0x3000-0x3030) when doing lookups
+  // in the GsymReader that contains all functions and from the segmented
+  // GsymReader in GR3000.
+  for (uint64_t Addr = 0x3000; Addr < 0x3030; ++Addr) {
+    // Lookup in the main GsymReader that contains all function infos
+    auto MainLR = GR->lookup(Addr);
+    ASSERT_THAT_EXPECTED(MainLR, Succeeded());
+    auto SegmentLR = GR3000->lookup(Addr);
+    ASSERT_THAT_EXPECTED(SegmentLR, Succeeded());
+    // Make sure the lookup results match.
+    EXPECT_EQ(MainLR.get(), SegmentLR.get());
+    // Make sure that the lookups on the functions that are not in the segment
+    // fail as expected.
+    ASSERT_THAT_EXPECTED(GR3000->lookup(0x1000), Failed());
+    ASSERT_THAT_EXPECTED(GR3000->lookup(0x2000), Failed());
+    ASSERT_THAT_EXPECTED(GR3000->lookup(0x4000), Failed());
+}
+
+  // Verify that all lookups match the range [0x4000-0x4030) when doing lookups
+  // in the GsymReader that contains all functions and from the segmented
+  // GsymReader in GR4000.
+  for (uint64_t Addr = 0x4000; Addr < 0x4030; ++Addr) {
+    // Lookup in the main GsymReader that contains all function infos
+    auto MainLR = GR->lookup(Addr);
+    ASSERT_THAT_EXPECTED(MainLR, Succeeded());
+    // Lookup in the GsymReader for that contains 0x4000
+    auto SegmentLR = GR4000->lookup(Addr);
+    ASSERT_THAT_EXPECTED(SegmentLR, Succeeded());
+    // Make sure the lookup results match.
+    EXPECT_EQ(MainLR.get(), SegmentLR.get());
+    // Make sure that the lookups on the functions that are not in the segment
+    // fail as expected.
+    ASSERT_THAT_EXPECTED(GR4000->lookup(0x1000), Failed());
+    ASSERT_THAT_EXPECTED(GR4000->lookup(0x2000), Failed());
+    ASSERT_THAT_EXPECTED(GR4000->lookup(0x3000), Failed());
+  }
+}
+
 
 TEST(GSYMTest, TestDWARFInlineRangeScopes) {
   // Test cases where inlined functions address ranges are not contained in the
@@ -2950,3 +3116,432 @@ TEST(GSYMTest, TestDWARFInlineRangeScopes) {
   EXPECT_EQ(Inline2.CallLine, 13u);
   EXPECT_EQ(Inline2.Children.size(), 0u);
 }
+
+TEST(GSYMTest, TestDWARFEmptyInline) {
+  // Test cases where we have inline function information in the DWARF that
+  // results in us trying to parse the inline info, but since the inline
+  // info ends up not adding any valid inline functions due to ranges
+  // not being correct, we end up not encoding any inline information. This
+  // tests that if we end up creating an empty inline info struct, we end up
+  // not encoding it into the GSYM file.
+  //
+  // 0x0000000b: DW_TAG_compile_unit
+  //               DW_AT_name	("/tmp/main.cpp")
+  //               DW_AT_language	(DW_LANG_C)
+  //               DW_AT_stmt_list	(0x00000000)
+  //
+  // 0x00000015:   DW_TAG_subprogram
+  //                 DW_AT_name	("foo")
+  //                 DW_AT_low_pc	(0x0000000000001000)
+  //                 DW_AT_high_pc	(0x0000000000001050)
+  //
+  // 0x0000002a:     DW_TAG_inlined_subroutine
+  //                   DW_AT_name	("inlineWithInvalidRange")
+  //                   DW_AT_low_pc	(0x0000000000001100)
+  //                   DW_AT_high_pc	(0x0000000000001200)
+  //                   DW_AT_call_file	("/tmp/main.cpp")
+  //                   DW_AT_call_line	(11)
+  //
+  // 0x00000047:     NULL
+  //
+  // 0x00000048:   NULL
+
+  StringRef yamldata = R"(
+  debug_str:
+    - ''
+    - '/tmp/main.cpp'
+    - foo
+    - inlineWithInvalidRange
+  debug_abbrev:
+    - ID:              0
+      Table:
+        - Code:            0x1
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_language
+              Form:            DW_FORM_udata
+            - Attribute:       DW_AT_stmt_list
+              Form:            DW_FORM_sec_offset
+        - Code:            0x2
+          Tag:             DW_TAG_subprogram
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_low_pc
+              Form:            DW_FORM_addr
+            - Attribute:       DW_AT_high_pc
+              Form:            DW_FORM_addr
+        - Code:            0x3
+          Tag:             DW_TAG_inlined_subroutine
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_low_pc
+              Form:            DW_FORM_addr
+            - Attribute:       DW_AT_high_pc
+              Form:            DW_FORM_addr
+            - Attribute:       DW_AT_call_file
+              Form:            DW_FORM_data4
+            - Attribute:       DW_AT_call_line
+              Form:            DW_FORM_data4
+  debug_info:
+    - Length:          0x45
+      Version:         4
+      AbbrevTableID:   0
+      AbbrOffset:      0x0
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x1
+            - Value:           0x2
+            - Value:           0x0
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0xF
+            - Value:           0x1000
+            - Value:           0x1050
+        - AbbrCode:        0x3
+          Values:
+            - Value:           0x13
+            - Value:           0x1100
+            - Value:           0x1200
+            - Value:           0x1
+            - Value:           0xB
+        - AbbrCode:        0x0
+        - AbbrCode:        0x0
+  debug_line:
+    - Length:          76
+      Version:         2
+      PrologueLength:  36
+      MinInstLength:   1
+      DefaultIsStmt:   1
+      LineBase:        251
+      LineRange:       14
+      OpcodeBase:      13
+      StandardOpcodeLengths: [ 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 ]
+      IncludeDirs:
+        - '/tmp'
+      Files:
+        - Name:            main.cpp
+          DirIdx:          1
+          ModTime:         0
+          Length:          0
+      Opcodes:
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          9
+          SubOpcode:       DW_LNE_set_address
+          Data:            4096
+        - Opcode:          DW_LNS_advance_line
+          SData:           9
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            16
+        - Opcode:          DW_LNS_advance_line
+          SData:           1
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            16
+        - Opcode:          DW_LNS_advance_line
+          SData:           1
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            16
+        - Opcode:          DW_LNS_advance_line
+          SData:           1
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            32
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          1
+          SubOpcode:       DW_LNE_end_sequence
+          Data:            0
+  )";
+  auto ErrOrSections = DWARFYAML::emitDebugSections(yamldata);
+  ASSERT_THAT_EXPECTED(ErrOrSections, Succeeded());
+  std::unique_ptr<DWARFContext> DwarfContext =
+      DWARFContext::create(*ErrOrSections, 8);
+  ASSERT_TRUE(DwarfContext.get() != nullptr);
+  std::string errors;
+  raw_string_ostream OS(errors);
+  GsymCreator GC;
+  DwarfTransformer DT(*DwarfContext, OS, GC);
+  const uint32_t ThreadCount = 1;
+  ASSERT_THAT_ERROR(DT.convert(ThreadCount), Succeeded());
+  ASSERT_THAT_ERROR(GC.finalize(OS), Succeeded());
+  SmallString<512> Str;
+  raw_svector_ostream OutStrm(Str);
+  const auto ByteOrder = support::endian::system_endianness();
+  FileWriter FW(OutStrm, ByteOrder);
+  ASSERT_THAT_ERROR(GC.encode(FW), Succeeded());
+  Expected<GsymReader> GR = GsymReader::copyBuffer(OutStrm.str());
+  ASSERT_THAT_EXPECTED(GR, Succeeded());
+  // There should only be one function in our GSYM.
+  EXPECT_EQ(GR->getNumAddresses(), 1u);
+  auto ExpFI = GR->getFunctionInfo(0x1000);
+  ASSERT_THAT_EXPECTED(ExpFI, Succeeded());
+  ASSERT_EQ(ExpFI->Range, AddressRange(0x1000, 0x1050));
+  EXPECT_TRUE(ExpFI->OptLineTable.has_value());
+  EXPECT_FALSE(ExpFI->Inline.has_value());
+  StringRef FuncName = GR->getString(ExpFI->Name);
+  EXPECT_EQ(FuncName, "foo");
+  std::vector<std::string> ExpectedLogErrors = {
+    "error: inlined function DIE at 0x0000002a has a range [0x0000000000001100"
+    " - 0x0000000000001200) that isn't contained in any parent address ranges,"
+    " this inline range will be removed.",
+    "warning: DIE contains inline function information that has no valid "
+    "ranges, removing inline information:",
+  };
+  // Make sure all expected errors are in the error stream for the two invalid
+  // inlined functions that we removed due to invalid range scoping.
+  for (const auto &Error: ExpectedLogErrors) {
+    EXPECT_TRUE(OS.str().find(Error) != std::string::npos);
+  }
+}
+
+TEST(GSYMTest, TestFinalizeForLineTables) {
+  // This example has two compile units:
+  // - one contains a function "foo" with line table entries and "bar" without
+  // - one contains a function "bar" with line table entries and "foo" without
+  // This test ensures that no matter what order information gets processed,
+  // we want to make sure that we prioritize the entries with the most debug
+  // info.
+  //
+  // The DWARF is the same for the functions, but the first compile unit has
+  // lines entries for "foo" and the second one doesn't. And the first compile
+  // unit has no line entries for "bar", but the second one does. We expect the
+  // resulting gsym file to have a "foo" and "bar" that both have line entries.
+  //
+  // 0x0000000b: DW_TAG_compile_unit
+  //               DW_AT_name	("/tmp/main.cpp")
+  //               DW_AT_language	(DW_LANG_C)
+  //               DW_AT_stmt_list	(0x00000000)
+  //
+  // 0x00000015:   DW_TAG_subprogram
+  //                 DW_AT_name	("foo")
+  //                 DW_AT_low_pc	(0x0000000000001000)
+  //                 DW_AT_high_pc	(0x0000000000001050)
+  //
+  // 0x0000002a:   DW_TAG_subprogram
+  //                 DW_AT_name	("bar")
+  //                 DW_AT_low_pc	(0x0000000000002000)
+  //                 DW_AT_high_pc	(0x0000000000002050)
+  //
+  // 0x0000003f:   NULL
+  // 0x00000040: Compile Unit: length = 0x0000003c, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000080)
+  //
+  // 0x0000004b: DW_TAG_compile_unit
+  //               DW_AT_name	("/tmp/main.cpp")
+  //               DW_AT_language	(DW_LANG_C)
+  //               DW_AT_stmt_list	(0x00000043)
+  //
+  // 0x00000055:   DW_TAG_subprogram
+  //                 DW_AT_name	("foo")
+  //                 DW_AT_low_pc	(0x0000000000001000)
+  //                 DW_AT_high_pc	(0x0000000000001050)
+  //
+  // 0x0000006a:   DW_TAG_subprogram
+  //                 DW_AT_name	("bar")
+  //                 DW_AT_low_pc	(0x0000000000002000)
+  //                 DW_AT_high_pc	(0x0000000000002050)
+  //
+  // 0x0000007f:   NULL
+
+  StringRef yamldata = R"(
+  debug_str:
+    - ''
+    - '/tmp/main.cpp'
+    - foo
+    - bar
+  debug_abbrev:
+    - ID:              0
+      Table:
+        - Code:            0x1
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_yes
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_language
+              Form:            DW_FORM_udata
+            - Attribute:       DW_AT_stmt_list
+              Form:            DW_FORM_sec_offset
+        - Code:            0x2
+          Tag:             DW_TAG_subprogram
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+            - Attribute:       DW_AT_low_pc
+              Form:            DW_FORM_addr
+            - Attribute:       DW_AT_high_pc
+              Form:            DW_FORM_addr
+  debug_info:
+    - Length:          0x3C
+      Version:         4
+      AbbrevTableID:   0
+      AbbrOffset:      0x0
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x1
+            - Value:           0x2
+            - Value:           0x0
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0xF
+            - Value:           0x1000
+            - Value:           0x1050
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0x13
+            - Value:           0x2000
+            - Value:           0x2050
+        - AbbrCode:        0x0
+    - Length:          0x3C
+      Version:         4
+      AbbrevTableID:   0
+      AbbrOffset:      0x0
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x1
+            - Value:           0x2
+            - Value:           0x43
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0xF
+            - Value:           0x1000
+            - Value:           0x1050
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0x13
+            - Value:           0x2000
+            - Value:           0x2050
+        - AbbrCode:        0x0
+  debug_line:
+    - Length:          63
+      Version:         2
+      PrologueLength:  36
+      MinInstLength:   1
+      DefaultIsStmt:   1
+      LineBase:        251
+      LineRange:       14
+      OpcodeBase:      13
+      StandardOpcodeLengths: [ 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 ]
+      IncludeDirs:
+        - '/tmp'
+      Files:
+        - Name:            main.cpp
+          DirIdx:          1
+          ModTime:         0
+          Length:          0
+      Opcodes:
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          9
+          SubOpcode:       DW_LNE_set_address
+          Data:            4096
+        - Opcode:          DW_LNS_advance_line
+          SData:           9
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            80
+        - Opcode:          DW_LNS_advance_line
+          SData:           1
+          Data:            0
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          1
+          SubOpcode:       DW_LNE_end_sequence
+          Data:            0
+    - Length:          63
+      Version:         2
+      PrologueLength:  36
+      MinInstLength:   1
+      DefaultIsStmt:   1
+      LineBase:        251
+      LineRange:       14
+      OpcodeBase:      13
+      StandardOpcodeLengths: [ 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 ]
+      IncludeDirs:
+        - '/tmp'
+      Files:
+        - Name:            main.cpp
+          DirIdx:          1
+          ModTime:         0
+          Length:          0
+      Opcodes:
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          9
+          SubOpcode:       DW_LNE_set_address
+          Data:            8192
+        - Opcode:          DW_LNS_advance_line
+          SData:           19
+          Data:            0
+        - Opcode:          DW_LNS_copy
+          Data:            0
+        - Opcode:          DW_LNS_advance_pc
+          Data:            80
+        - Opcode:          DW_LNS_advance_line
+          SData:           1
+          Data:            0
+        - Opcode:          DW_LNS_extended_op
+          ExtLen:          1
+          SubOpcode:       DW_LNE_end_sequence
+          Data:            0
+  )";
+  auto ErrOrSections = DWARFYAML::emitDebugSections(yamldata);
+  ASSERT_THAT_EXPECTED(ErrOrSections, Succeeded());
+  std::unique_ptr<DWARFContext> DwarfContext =
+      DWARFContext::create(*ErrOrSections, 8);
+  ASSERT_TRUE(DwarfContext.get() != nullptr);
+  std::string errors;
+  raw_string_ostream OS(errors);
+  GsymCreator GC;
+  DwarfTransformer DT(*DwarfContext, OS, GC);
+  const uint32_t ThreadCount = 1;
+  ASSERT_THAT_ERROR(DT.convert(ThreadCount), Succeeded());
+  ASSERT_THAT_ERROR(GC.finalize(OS), Succeeded());
+  SmallString<512> Str;
+  raw_svector_ostream OutStrm(Str);
+  const auto ByteOrder = support::endian::system_endianness();
+  FileWriter FW(OutStrm, ByteOrder);
+  ASSERT_THAT_ERROR(GC.encode(FW), Succeeded());
+  Expected<GsymReader> GR = GsymReader::copyBuffer(OutStrm.str());
+  ASSERT_THAT_EXPECTED(GR, Succeeded());
+  // There should only be two functions in our GSYM.
+  EXPECT_EQ(GR->getNumAddresses(), 2u);
+  // Verify "foo" is present and has a line table
+  auto ExpFI = GR->getFunctionInfo(0x1000);
+  ASSERT_THAT_EXPECTED(ExpFI, Succeeded());
+  ASSERT_EQ(ExpFI->Range, AddressRange(0x1000, 0x1050));
+  EXPECT_TRUE(ExpFI->OptLineTable.has_value());
+  EXPECT_FALSE(ExpFI->Inline.has_value());
+  StringRef FuncName = GR->getString(ExpFI->Name);
+  EXPECT_EQ(FuncName, "foo");
+
+  // Verify "foo" is present and has a line table
+  auto ExpFI2 = GR->getFunctionInfo(0x2000);
+  ASSERT_THAT_EXPECTED(ExpFI2, Succeeded());
+  ASSERT_EQ(ExpFI2->Range, AddressRange(0x2000, 0x2050));
+  EXPECT_TRUE(ExpFI2->OptLineTable.has_value());
+  EXPECT_FALSE(ExpFI2->Inline.has_value());
+  StringRef FuncName2 = GR->getString(ExpFI2->Name);
+  EXPECT_EQ(FuncName2, "bar");
+}


        


More information about the llvm-commits mailing list