[PATCH] D130357: [MC,llvm-objdump,ARM] Target-dependent disassembly resync policy.

David Spickett via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 25 03:27:55 PDT 2022


DavidSpickett added inline comments.


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:181
+  ///
+  /// \param Address  - The address, in the memory space of region, of the first
+  ///                   byte of the symbol.
----------------
"memory space of the region" perhaps?


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:182
+  /// \param Address  - The address, in the memory space of region, of the first
+  ///                   byte of the symbol.
+  /// \param Bytes    - A reference to the actual bytes at the symbol location.
----------------
What is "the symbol" here? Maybe from context it will be a symbol but it seems to me like you'd be in the middle of some function maybe. So symbol is "main" but I'm N bytes away from where it starts by this point.

Do I have the wrong end of the stick?


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:183
+  ///                   byte of the symbol.
+  /// \param Bytes    - A reference to the actual bytes at the symbol location.
+  /// \return         - A number of bytes to skip. Must always be greater than
----------------
I'd explain why the parameter exists. You'll probably end up repeating the thumb justification but hey why not.

Also is the size of this known to be in some range or does the target also decide that. E.g. if thumb needed 4 bytes of context to decide, the Arm backend would pass this 4 bytes. Some other target could choose less or more or none.


================
Comment at: llvm/include/llvm/MC/MCDisassembler/MCDisassembler.h:186
+  ///                   zero. May be greater than the size of Bytes.
+  virtual uint64_t suggestBytesToSkip(ArrayRef<uint8_t> Bytes,
+                                      uint64_t Address) const;
----------------
`const ArrayRef<uint8_t>`?


================
Comment at: llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp:353
 
+uint64_t AArch64Disassembler::suggestBytesToSkip(ArrayRef<uint8_t> Bytes,
+                                                 uint64_t Address) const {
----------------
We're assuming that Address is already aligned, I assume that's safe given that we're either:
* disassembling from the start of a fn, which would be aligned
* have just disassembled an instruction, which would have been aligned, and 4 bytes in size.

Correct?

I guess someone could give MC a misaligned function start but garbage in garbage out in that case? Or would you want to align up to the nearest 4 bytes.


================
Comment at: llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp:758
+      if (Bytes.size() >= 2) {
+        uint16_t Insn16 = (Bytes[1] << 8) | Bytes[0];
+        return Insn16 < 0xE800 ? 2 : 4;
----------------
Is the in memory layout of thumb instructions the same between endians?


================
Comment at: llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp:771
+  }
+}
+
----------------
I might have written this early return style, but that's my lldb bias talking.
```
if (!is_thumb)
   return 4;
if (bytes.size < 2)
   return 2;
Insn16 ....
return Insn16 < 0xE800 ? 2 : 4;
```

The logic is clear either way but if you want to have one less indent.


================
Comment at: llvm/test/tools/llvm-objdump/ELF/ARM/unknown-instr-resync.test:1
+# RUN: yaml2obj %s | llvm-objdump -d --mcpu=cortex-a8 - | FileCheck %s
+
----------------
I think this one is the thumb side of the testing, can you add that in a comment in the file if so. Judging by the skip 4 then skip 2 later.

(if this is the thumb test do you need another one for Arm only?)


================
Comment at: llvm/test/tools/llvm-objdump/ELF/ARM/unknown-instr-resync.test:12
+# CHECK-NEXT: 16: ee ff cc dd   <unknown>
+# CHECK-NEXT: 1a: 01 eb c2 00   add.w   r0, r1, r2, lsl #3
+
----------------
Is it worth adding a single byte on the end to cover the thumb path if bytes < 2? Proves you don't out of bounds the bytes you're given, at least in an asserts build (I hope).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130357/new/

https://reviews.llvm.org/D130357



More information about the llvm-commits mailing list