[PATCH] D53379: GSYM symbolication format

Zachary Turner via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 23 16:35:24 PDT 2018


zturner added a comment.

I had a chance to review some more.  Sorry there's so many comments but it's a large patch so I'm trying to go through it in very detailed.



================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:32-34
+bool starts_with(const char *line, std::string s) {
+  return strncmp(line, s.c_str(), s.size()) == 0;
+}
----------------
This should be `StringRef`, then you can use the `starts_with` member function.


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:36
+
+enum BreakpadLineType {
+  Invalid,
----------------
Can you use an `enum class` here?  Also, since this is a type and it's local to this file, it should be in an anonymous namespace.


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:46-50
+static std::string BPAD_MODULE("MODULE ");
+static std::string BPAD_FILE("FILE ");
+static std::string BPAD_FUNC("FUNC ");
+static std::string BPAD_PUBLIC("PUBLIC ");
+static std::string BPAD_STACK("STACK ");
----------------
We shouldn't have global constructors.  Instead, can you make this a `constexpr StringRef`? 


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:52-60
+inline uint8_t char_to_nibble(char c) {
+  if ('0' <= c && c <= '9')
+    return c - '0';
+  if ('a' <= c && c <= 'f')
+    return 10 + c - 'a';
+  if ('A' <= c && c <= 'F')
+    return 10 + c - 'A';
----------------
I believe we have something like this in `StringExtras.h`  it's called `llvm::hexDigitValue`


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:63-64
+class Line {
+  const char *m_end;
+  const char *m_pos;
+
----------------
You could store this as a `StringRef` instead of a start and end pointer.  It will actually be more space efficient (pointer + length is potentially 4 bytes smaller than pointer + pointer)


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:70
+  BreakpadLineType GetLineType() {
+    if (m_pos < m_end) {
+      switch (m_pos[0]) {
----------------
Early exit.


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:101
+      default:
+        if (isxdigit(m_pos[0]))
+          return BreakpadLineType::SourceLine;
----------------
Can you use `llvm::isHexDigit` here instead?


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:108
+  }
+  std::string GetWord() {
+    // Get the next word from the line. Any leading spaces
----------------
This should return a `StringRef`.


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:112-124
+    if (m_pos < m_end) {
+      // Skip leading spaces
+      while (m_pos < m_end && isspace(*m_pos)) {
+        ++m_pos;
+      }
+      const auto start = m_pos;
+      while (m_pos < m_end && !isspace(*m_pos)) {
----------------
If you're storing a `StringRef`, then this function becomes:

```
// Remove leading spaces
m_pos = m_pos.ltrim();
// get all characters until the next space
StringRef word = m_pos.take_until([](char c) {return !isspace(c); });
// remove those from the stored string, and remove whitespace
m_pos = m_pos.drop_front(word.size()).ltrim();
// return the word
return word;
```


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:126-130
+  std::string GetRestOfLineAsString() {
+    if (m_pos < m_end - 1)
+      return std::string(m_pos, m_end - 1 - m_pos);
+    return std::string();
+  }
----------------
This just becomes `return m_pos;`

Also, this function should be `const`


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:131-155
+  uint32_t GetHex32() {
+    auto u = GetUnsigned(16);
+    assert(u < UINT32_MAX);
+    return (uint32_t)u;
+  }
+  uint64_t GetHex() { return GetUnsigned(16); }
+  uint64_t GetDecimal() { return GetUnsigned(10); }
----------------
All of these functions should be `const`.  That said, I think they're all unnecessary.  `llvm::StringRef` has a method called `getAsInteger`.  I think we can just use that, so all of these various specialized conversion functions seem unnecessary.  The user can just call `GetString()` which returns the rest of the line as a `StringRef`, then they can call `getAsInteger()` with that.  And all of these functions can just be deleted.


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:158
+
+std::error_code
+llvm::gsym::convertBreakpadToGSYM(const char *BreakpadPath,
----------------
Can we use `llvm::Error` here?


================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:159-160
+std::error_code
+llvm::gsym::convertBreakpadToGSYM(const char *BreakpadPath,
+                                  const char *GSYMPath) {
+  ErrorOr<std::unique_ptr<MemoryBuffer>> BuffOrErr =
----------------
`StringRef`


================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:167
+
+  const char *name_cstr = die.getName(DINameKind::ShortName);
+  if (!name_cstr || !name_cstr[0]) {
----------------
Can you assign this to a `StringRef` right away so that we don't have to use `strcmp` related functions?


================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:193
+  std::string name = name_cstr;
+  auto parent_die = GetParentDeclContextDIE(die);
+  while (parent_die) {
----------------
General comment, I don't want to point out every single instance, but there's a *lot* of `auto` in this code.  It makes it hard to read.  I think we prefer not to use so much `auto` in LLVM, so I'd suggest removing some of it and use explicit types.  Only when the type is specified in the following code is it ok, like `auto foo = llvm::make_unique<TheType>`, because then you know it's a `unique_ptr<TheType>`, or when it's really obnoxiously long to spell out.


================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:594-596
+        // if (loggingEnabled() && f.FuncInfo.hasRichInfo() && last.FuncInfo.hasRichInfo())
+        //   log() << "Warning: functions with different names and "
+        //         << "same address range: \n\t" << f << "\n\t" << last << "\n";
----------------
Can you remove all the commented out logging code?


Repository:
  rL LLVM

https://reviews.llvm.org/D53379





More information about the llvm-commits mailing list