[PATCH] D80512: [MC] Changes to help improve target specific symbol disassembly

Wed May 27 01:35:21 PDT 2020

jhenderson requested changes to this revision.
jhenderson added a comment.
This revision now requires changes to proceed.

Am I right in thinking that this change is now NFC? If so, please add that to the title.

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:76-97
   /// Ternary decode status. Most backends will just use Fail and
   /// Success, however some have a concept of an instruction with
   /// understandable semantics but which is architecturally
   /// incorrect. An example of this is ARM UNPREDICTABLE instructions
   /// which are disassemblable but cause undefined behaviour.
   ///
   /// Because it makes sense to disassemble these instructions, there
----------------
Seems like this whole comment needs updating if you are adding a new enum value.

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:124-125

-  /// May parse any prelude that precedes instructions after the start of a
-  /// symbol. Needed for some targets, e.g. WebAssembly.
+  /// Used to perform separate target specific disassembly for a
+  /// particular symbol. May parse any prelude that precedes instructions
+  /// after the start of a symbol symbol; or the entire symbol.
----------------
Looks like this comment might need reflowing - "particular" at least looks like it belongs on the previous line.

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:126
+  /// particular symbol. May parse any prelude that precedes instructions
+  /// after the start of a symbol symbol; or the entire symbol.
+  /// Right now WebAssembly implements this, and AMDGPU needs it to deccode
----------------
';' -> ','

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:127-128
+  /// after the start of a symbol symbol; or the entire symbol.
+  /// Right now WebAssembly implements this, and AMDGPU needs it to deccode
+  /// kernel descriptors.
+  ///
----------------
I don't think this last sentence is a good comment to have, as it implies that WebAssembly and AMDGPU are the only two cases that make use of this, which might be true now, but probably won't be true in the future. I think it's simplest to say "This is used for example by AMDGPU to decode kernel descriptors."

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:156
+  /// It should help move much of the target specific code from llvm-objdump to
+  /// respective target disassemblers

----------------
Nit: missing trailing full stop.

As it's a TODO, perhaps this whole comment should not use doxygen style comment markers?

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:167
 public:
+  // Helpers to read bytes
+  bool getWord(support::endianness E, ArrayRef<uint8_t> Bytes, size_t index,
----------------
Nit: missing trailing full stop (feel free to add to the comment below too).

================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:168-175
+  bool getWord(support::endianness E, ArrayRef<uint8_t> Bytes, size_t index,
+               uint16_t &word) const;
+
+  bool getDoubleWord(support::endianness E, ArrayRef<uint8_t> Bytes,
+                     size_t index, uint32_t &doubleWord) const;
+
+  bool getQuadWord(support::endianness E, ArrayRef<uint8_t> Bytes, size_t index,
----------------
Variable names should all be UpperCamelCase (index -> Index, doubleWord -> DoubleWord etc). Also applies to code body.

================
Comment at: llvm/lib/MC/MCDisassembler/MCDisassembler.cpp:27
+
+bool MCDisassembler::getWord(support::endianness E, ArrayRef<uint8_t> Bytes,
+                             size_t index, uint16_t &word) const {
----------------
There is already a class `DataExtractor` in Support for reading a series of bytes of various sizes. I think it would make more sense to use that than reinvent the wheel with these methods. It also has error handling, which can be used to detect failures to read and handles endianness.

================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1399
         if (Symbols[SI].Type == ELF::STT_AMDGPU_HSA_KERNEL) {
-          // skip amd_kernel_code_t at the begining of kernel symbol (256 bytes)
+          // skip amd_kernel_code_t at the begining of kernel symbol (256bytes)
           Start += 256;
----------------
If you're going to modify this line, please also add the missing full stop and "skip" -> "Skip". Seems like it's unrelated to this patch.

================
Comment at: llvm/tools/llvm-objdump/llvm-objdump.cpp:1408-1409
           const auto Limit = End - (std::min)(EndAlign, End - Start);
-          while (End > Limit &&
-            *reinterpret_cast<const support::ulittle32_t*>(&Bytes[End - 4]) == 0)
+          while (End > Limit && *reinterpret_cast<const support::ulittle32_t *>(
+                                    &Bytes[End - 4]) == 0)
             End -= 4;
----------------
This also appears to be unrelated to this patch.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80512/new/

https://reviews.llvm.org/D80512