[PATCH] D53379: GSYM symbolication format
Zachary Turner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 23 16:35:24 PDT 2018
zturner added a comment.
I had a chance to review some more. Sorry there's so many comments but it's a large patch so I'm trying to go through it in very detailed.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:32-34
+bool starts_with(const char *line, std::string s) {
+ return strncmp(line, s.c_str(), s.size()) == 0;
+}
----------------
This should be `StringRef`, then you can use the `starts_with` member function.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:36
+
+enum BreakpadLineType {
+ Invalid,
----------------
Can you use an `enum class` here? Also, since this is a type and it's local to this file, it should be in an anonymous namespace.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:46-50
+static std::string BPAD_MODULE("MODULE ");
+static std::string BPAD_FILE("FILE ");
+static std::string BPAD_FUNC("FUNC ");
+static std::string BPAD_PUBLIC("PUBLIC ");
+static std::string BPAD_STACK("STACK ");
----------------
We shouldn't have global constructors. Instead, can you make this a `constexpr StringRef`?
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:52-60
+inline uint8_t char_to_nibble(char c) {
+ if ('0' <= c && c <= '9')
+ return c - '0';
+ if ('a' <= c && c <= 'f')
+ return 10 + c - 'a';
+ if ('A' <= c && c <= 'F')
+ return 10 + c - 'A';
----------------
I believe we have something like this in `StringExtras.h` it's called `llvm::hexDigitValue`
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:63-64
+class Line {
+ const char *m_end;
+ const char *m_pos;
+
----------------
You could store this as a `StringRef` instead of a start and end pointer. It will actually be more space efficient (pointer + length is potentially 4 bytes smaller than pointer + pointer)
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:70
+ BreakpadLineType GetLineType() {
+ if (m_pos < m_end) {
+ switch (m_pos[0]) {
----------------
Early exit.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:101
+ default:
+ if (isxdigit(m_pos[0]))
+ return BreakpadLineType::SourceLine;
----------------
Can you use `llvm::isHexDigit` here instead?
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:108
+ }
+ std::string GetWord() {
+ // Get the next word from the line. Any leading spaces
----------------
This should return a `StringRef`.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:112-124
+ if (m_pos < m_end) {
+ // Skip leading spaces
+ while (m_pos < m_end && isspace(*m_pos)) {
+ ++m_pos;
+ }
+ const auto start = m_pos;
+ while (m_pos < m_end && !isspace(*m_pos)) {
----------------
If you're storing a `StringRef`, then this function becomes:
```
// Remove leading spaces
m_pos = m_pos.ltrim();
// get all characters until the next space
StringRef word = m_pos.take_until([](char c) {return !isspace(c); });
// remove those from the stored string, and remove whitespace
m_pos = m_pos.drop_front(word.size()).ltrim();
// return the word
return word;
```
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:126-130
+ std::string GetRestOfLineAsString() {
+ if (m_pos < m_end - 1)
+ return std::string(m_pos, m_end - 1 - m_pos);
+ return std::string();
+ }
----------------
This just becomes `return m_pos;`
Also, this function should be `const`
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:131-155
+ uint32_t GetHex32() {
+ auto u = GetUnsigned(16);
+ assert(u < UINT32_MAX);
+ return (uint32_t)u;
+ }
+ uint64_t GetHex() { return GetUnsigned(16); }
+ uint64_t GetDecimal() { return GetUnsigned(10); }
----------------
All of these functions should be `const`. That said, I think they're all unnecessary. `llvm::StringRef` has a method called `getAsInteger`. I think we can just use that, so all of these various specialized conversion functions seem unnecessary. The user can just call `GetString()` which returns the rest of the line as a `StringRef`, then they can call `getAsInteger()` with that. And all of these functions can just be deleted.
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:158
+
+std::error_code
+llvm::gsym::convertBreakpadToGSYM(const char *BreakpadPath,
----------------
Can we use `llvm::Error` here?
================
Comment at: lib/DebugInfo/GSYM/Breakpad.cpp:159-160
+std::error_code
+llvm::gsym::convertBreakpadToGSYM(const char *BreakpadPath,
+ const char *GSYMPath) {
+ ErrorOr<std::unique_ptr<MemoryBuffer>> BuffOrErr =
----------------
`StringRef`
================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:167
+
+ const char *name_cstr = die.getName(DINameKind::ShortName);
+ if (!name_cstr || !name_cstr[0]) {
----------------
Can you assign this to a `StringRef` right away so that we don't have to use `strcmp` related functions?
================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:193
+ std::string name = name_cstr;
+ auto parent_die = GetParentDeclContextDIE(die);
+ while (parent_die) {
----------------
General comment, I don't want to point out every single instance, but there's a *lot* of `auto` in this code. It makes it hard to read. I think we prefer not to use so much `auto` in LLVM, so I'd suggest removing some of it and use explicit types. Only when the type is specified in the following code is it ok, like `auto foo = llvm::make_unique<TheType>`, because then you know it's a `unique_ptr<TheType>`, or when it's really obnoxiously long to spell out.
================
Comment at: lib/DebugInfo/GSYM/DwarfTransformer.cpp:594-596
+ // if (loggingEnabled() && f.FuncInfo.hasRichInfo() && last.FuncInfo.hasRichInfo())
+ // log() << "Warning: functions with different names and "
+ // << "same address range: \n\t" << f << "\n\t" << last << "\n";
----------------
Can you remove all the commented out logging code?
Repository:
rL LLVM
https://reviews.llvm.org/D53379
More information about the llvm-commits
mailing list