[PATCH] D63104: Add GSYM utility files along with unit tests.

Mon Jun 17 18:03:05 PDT 2019

jakehehrlich added inline comments.

================
Comment at: include/llvm/DebugInfo/GSYM/FunctionInfo.h:97-98
+  // If the number of lines is the same, then compare line table entries
+  if (LHS.Lines.size() == RHS.Lines.size())
+    return LHS.Lines < RHS.Lines;
+  // Then sort by number of line table entries (more is better)
----------------
clayborg wrote:
> The main goal is to be able to sort FuncrtionInfo objects after we fill them out from a variety of places (symtab, DWARF, breakpad) and quickly pick out the one with the best info. We saw many issues with debug info when writing the parser where address ranges would overlap or differ for the same range. So we tried to pick the best info. We can modify this sorting as needed.
Ah sorry I meant to click for the line above where it says `LHS.Lines < RHS.Lines` I mean that that specifically doesn't make sense to me. It's not clear that that would correlate to a better answer in anyway to me.

================
Comment at: include/llvm/DebugInfo/GSYM/Range.h:54
+}
+inline bool operator<(const AddressRange &LHS, const AddressRange &RHS) {
+  if (LHS.Start == RHS.Start)
----------------
clayborg wrote:
> jakehehrlich wrote:
> > If looking up the greatest lower bound on an address in an array sorted by this predicate, the largest range possible would be chosen. It seems to me that we would actually prefer the *smallest* range be chosen because its information would be more specific. That would suggest that instead of  `LHS.End < RHS.End` we should actually use `LHS.End > RHS.End`
> I would expect an address with the same start address to still compare the end address:
> ```
> [0x100-0x101) < [0x100-0x102)
> ```
> with lower bound you always have to know to back up by one in case the the addresses are equal.
You're right. I just got by logic a bit wrong there. That was the result I expected but I got my self turned around in the reasoning somehow.

================
Comment at: include/llvm/DebugInfo/GSYM/Range.h:68
+
+typedef std::vector<AddressRange> AddressRanges;
+bool contains(const AddressRanges &Ranges, uint64_t Addr);
----------------
clayborg wrote:
> jakehehrlich wrote:
> > prefer using
> llvm/include/llvm/DebugInfo/GSYM/Range.h:71:12: error: using declaration cannot refer to a template specialization
> using std::vector<AddressRange> AddressRanges;
>            ^     ~~~~~~~~~~~~~~
> 
`using AddressRanges = std::vector<AddressRange>;` is what I meant.

================
Comment at: lib/DebugInfo/GSYM/Range.cpp:31
+  if (Addr >= Ranges.back().End)
+    return false;
+  auto begin = Ranges.begin();
----------------
clayborg wrote:
> Will remove. Need empty check though.
I think if you remove 

```
if (Pos == EndPos)
    return Ranges.back().contains(Addr);
```

Then in the empty case `Pos == EndPos == Begin` which means `if (Pos != Begin)` will jump to the `return false` line right?

================
Comment at: lib/DebugInfo/GSYM/Range.cpp:34
+  auto EndPos = Ranges.end();
+  auto Pos = std::upper_bound(begin, EndPos, Addr);
+  if (Pos == EndPos)
----------------
clayborg wrote:
> jakehehrlich wrote:
> > In general I think you want a greatest lower bound which is what you're doing but if you keep your current '<' operator and ignore my comment above about swaping the order of 'End' you can use `std::lower_bound` here and avoid the subtraction case.
> I tried that with:
> 
> ```
> bool llvm::gsym::contains(const AddressRanges &Ranges, uint64_t Addr) {
>   if (Ranges.empty())
>     return false;
>   auto Begin = Ranges.begin();
>   auto EndPos = Ranges.end();
>   auto Pos = std::lower_bound(Begin, EndPos, Addr);
>   if (Pos != EndPos)
>     return Pos->contains(Addr);
>   return false;
> }
> ```
> 
> But this failed in the following cases:
> ```
>   EXPECT_TRUE(contains(Ranges, 0x2000-1));
>   EXPECT_TRUE(contains(Ranges, 0x3000-1));
>   EXPECT_TRUE(contains(Ranges, 0x5000-1));
> ```
Ah right, It doesn't give you a strict lower bound. I suppose what you have is best. Why is the first algorithm we learn in school so hard to get right?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63104/new/

https://reviews.llvm.org/D63104