[PATCH] D60927: [WIP][llvm-objdump] Switch between ARM/Thumb based on mapping symbols.

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 23 06:15:38 PDT 2019


peter.smith added a comment.

I've not got any great ideas about how to move the target switching logic out of llvm-objdump. I think that we would have to either construct a map (Arm/Thumb/Data) in llvm-objdump that we can pass to the disassembler and it can tell from the index which SubTarget to use assuming the disassembler also has the index, or we'd have to make the disassembler aware of symbols so it could look them up.

I think Stephen's suggestion of passing in a parameter of the initial Primary/Secondary based on the triple could be useful. In general for completely stripped binaries with no symbols getting any sensible disassembly is difficult. The only heuristics I know there is to see if the Entry Point has bit 0 set (Thumb) so we say if entry Thumb state then assume a Thumb triple.



================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:576
 
+static bool isArm32Elf(const ObjectFile *Obj) {
+  return (Obj->isELF() &&
----------------
MaskRay wrote:
> efriedma wrote:
> > MaskRay wrote:
> > > Since you are adding `isArm32Elf`, can you add `isArm64Elf` and delete `isArmElf` if that orthogonal property is desired?
> > In some places the property we want is "can this object file have mapping symbols", which is currently equivalent to isArmElf.  I guess I can add a separate helper for that, and make it check "isArmElf32 || isArmElf64"?
> SG
Could we use isAArch64Elf() and isArmElf(), I think that this would be more consistent with the ABI documentation and the current implementation?

Rather than name all the subtargets it might be neater to check the e_machine field is EM_ARM or EM_AARCH64, although handling the elf class and endian may prove to be as complicated.


================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1098
     std::vector<uint64_t> DataMappingSymsAddr;
-    std::vector<uint64_t> TextMappingSymsAddr;
+    std::vector<uint64_t> AArch64TextMappingSymsAddr;
+    std::vector<uint64_t> ThumbMappingSymsAddr;
----------------
Rather than 4 separate vectors, would it be better to have a single one that stores a pair of Index, Letter. In a valid object there is only one mapping symbol at a particular address so each index would match a unique entry. The search code would need to pass in a query function to obtain a symbol of the right letter, but this might be a bit easier to read than 4 separate ones?


================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1268
       while (Index < End) {
         // AArch64 ELF binaries can interleave data and text in the same
         // section. We rely on the markers introduced to understand what we
----------------
Worth updating the comment to "Arm and AArch64 ELF binaries"?


================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1309
+          }
+        }
+
----------------
A way to extend this for Arm/Thumb in the case of an Object with no mapping symbols (Strip?) but other symbols are present then if there is a symbol at (Index  & 0xfffffffe) then:
- Symbol Type == STT_FUNC and bit 0 of symbol value is 1 then switch to Secondary STI
- Symbol Type == STT_FUNC and bit 0 of symbol value is 0 then switch to Primary STI

In most cases this won't add much over just mapping symbols though and could easily be added as a follow up patch. 


================
Comment at: tools/llvm-objdump/llvm-objdump.cpp:1466
                  "no disassembler for target " + TripleName);
+  std::unique_ptr<MCDisassembler> SecondaryDisAsm;
+  std::unique_ptr<const MCSubtargetInfo> SecondarySTI;
----------------
Would it be worth moving this up to line 1452 as the part that adds +thumb-mode is quite far from the part that does -thumb-mode. It took me longer than I'd like to work out why --triple=thumbv7a-linux-gnueabi didn't make the primary and secondary subtargets Thumb. Alternatively maybe worth a comment on 1432 saying something like Primary STI is always ARM secondary is always Thumb regardless of Triple. 


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60927/new/

https://reviews.llvm.org/D60927





More information about the llvm-commits mailing list