[PATCH] D24636: llvm-objdump Allow disassembly of ARM and thumb code mix in ELF object file.

khemant@codeaurora.org via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 16 09:50:09 PDT 2016


khemant added a comment.

> The first thing I would say is that it really needs a clang-format pass as well as a lot more tests, since the change in functionality is not small.


This was clang-formatted, may be my clang format python script is old.

> From a quick look, it seems that you went for the "whatever works" approach, which makes the whole thing extremely complicated, and harder to maintain. I'm not sure all other architectures will be happy with the added context, complexity and conditional blocks for the sake of one arch (ARM).




> I honestly think this needs a better design and it would be better to get that discussion to the llvm-dev list first, as an RFC.


This is a problem with existing design and is not easily compatible with ARM architecture. Take a look at MachO dumper, it employs similar method to try using both disassembler. But as suggested, I will send an RFC for this functionality on dev list and find out.


================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1092
@@ +1091,3 @@
+    size_t Offset = 1;
+    const uint8_t *Data = AttributeSection.data();
+    // Find the aeabi Vendor sub-section
----------------
rengolin wrote:
> You're converting a pointer to an array ref to another pointer, this is really confusing. Please, pick one style and stick with it.
> 
> even better, isn't there a way to decode the attributes into a nice structure?
> 
> @compnerd, you spent more time in the attributes code than I did, any ideas?
Got it. 

The only way to decode the attributes is to at the size a parse them (they are sorted in ULSB format).

================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1102
@@ +1101,3 @@
+
+      if (std::string("aeabi") !=
+          reinterpret_cast<const char *>(Data + SubOffset)) {
----------------
rengolin wrote:
> StringRef.startsWith("aeabi")?
Got it.

================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1119
@@ +1118,3 @@
+        SubOffset += 4;
+        while (SubOffset < Length) {
+          unsigned TagLen;
----------------
rengolin wrote:
> there has to be a better way to deal with attributes...
I am open for ideas. 

AFAIK there is no way to go past an attribute till you read the size, and since each can vary based on attribute, you have to go over them till you find the one you need or finish reading all of them.

================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1362
@@ -1219,3 +1361,3 @@
         if (Name.startswith("$x"))
-          TextMappingSymsAddr.push_back(Address - SectionAddr);
+          AArch64MappingSymsAddr.push_back(Address - SectionAddr);
         if (Name.startswith("$a"))
----------------
rengolin wrote:
> AArch64 code doesn't mix with AArch32, so this is conceptually wrong.
> 
> If this is just to reduce code duplication in this file, I understand, but there could be an error. It'd be clearer, though, if they were in separate blocks, one for ARM and one for AArch64.
This is not used for mix of ISA. The $d mapping can exist within a region of code that is treated as a single STT_FUNC symbol. The mapping from $x to $d and then back to $x is used to make sure you disassembler is not thrown off balance by trying to dissemble data. This change existed before this change. I made sure the text is separated into arm and thumb and 64 bit arm instructions. Line 1550 will give you an idea of what I am saying.


Repository:
  rL LLVM

https://reviews.llvm.org/D24636





More information about the llvm-commits mailing list