[LLVMdev] Disassembling arbitrary machine-code byte arrays

Aidan Steele llvm at aidans.org
Sun Dec 18 20:29:54 PST 2011


My apologies if this appears to be a very trivial question -- I have
tried to solve this on my own and I am stuck. Any assistance that
could be provided would be immensely appreciated.

What is the absolute bare minimum that I need to do to disassemble an
array of, say, ARM machine code bytes? Or an array of Thumb machine
code bytes? For example, I might have an array of unsigned chars --
how could I go about decoding these into MCInst objects? Does such a
decoding process take place in one fell swoop or do I parse the stream
one instruction at a time? Can I ask it to "decode the next 10 bytes"?
What follows is my (feeble) attempt at getting started. It probably
doesn't help that I am only familiar with C and Objective-C and find
C++ syntax absolutely bewildering.

Kind regards,
Aidan Steele

int main (int argc, const char *argv[])
{
 const llvm::Target Target;

 llvm::OwningPtr<const llvm::MCSubtargetInfo>
   STI(Target.createMCSubtargetInfo("", "", ""));
 llvm::OwningPtr<const llvm::MCDisassembler>
   disassembler(Target.createMCDisassembler(*STI));

 llvm::OwningPtr<llvm::MemoryBuffer> Buffer;
 llvm::MemoryBuffer::getFile(llvm::StringRef("/path/to/file.bin"), Buffer);
 llvm::MCInst Inst;
 uint64_t Size = 0;

 disassembler->getInstruction(Inst, Size, *Buffer.take(), 0,
                              llvm::nulls(), llvm::nulls());

//  llvm::StringRef TheArchString("arm-apple-darwin");
//  std::string normalized = llvm::Triple::normalize(TheArchString);
//  llvm::Triple TheTriple;
//  TheTriple.setArch(llvm::Triple::arm);
//  TheTriple.setOS(llvm::Triple::Darwin);
//  TheTriple.setVendor(llvm::Triple::Apple);
//  llvm::Target *TheTarget = NULL;

 return 0;
}
