[LLVMdev] Disassembly arbitrary machine-code byte arrays

Kevin Enderby enderby at apple.com
Mon Dec 19 12:14:52 PST 2011


Hi Aidan,

The C interface you could use is in llvm/include/llvm-c/Disassembler.h, which declares:

/**
 * Disassemble a single instruction using the disassembler context specified in
 * the parameter DC.  The bytes of the instruction are specified in the
 * parameter Bytes, and contains at least BytesSize number of bytes.  The
 * instruction is at the address specified by the PC parameter.  If a valid
 * instruction can be disassembled, its string is returned indirectly in
 * OutString whose size is specified in the parameter OutStringSize.  This
 * function returns the number of bytes in the instruction or zero if there was
 * no valid instruction.
 */
size_t LLVMDisasmInstruction(LLVMDisasmContextRef DC, uint8_t *Bytes,
                             uint64_t BytesSize, uint64_t PC,
                             char *OutString, size_t OutStringSize);

This is used by Darwin's otool(1), which is an objdump(1)-like tool.  The implementation ends up in the libLTO shared library.
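
To illustrate the "one instruction at a time" contract described in the comment above, here is a minimal sketch of a driving loop. It does not link against LLVM: decode_one_fn is a hypothetical callback type with the same shape as LLVMDisasmInstruction (the context is passed as void *), and stub_decode_one is a made-up stand-in that just prints each 4-byte little-endian word. With a real disassembler context you would presumably pass a thin wrapper that forwards to LLVMDisasmInstruction instead. The names disassemble_buffer and base_pc are likewise assumptions for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callback type mirroring the shape of LLVMDisasmInstruction:
 * returns the instruction length in bytes, or 0 if nothing valid decodes. */
typedef size_t (*decode_one_fn)(void *ctx, uint8_t *bytes, uint64_t bytes_size,
                                uint64_t pc, char *out, size_t out_size);

/* Stand-in decoder (NOT part of LLVM): treats every 4 bytes as one
 * little-endian word and renders it as ".word 0x...". */
static size_t stub_decode_one(void *ctx, uint8_t *bytes, uint64_t bytes_size,
                              uint64_t pc, char *out, size_t out_size)
{
    (void)ctx;
    (void)pc;
    if (bytes_size < 4)
        return 0; /* not enough bytes left for a whole word */
    uint32_t word = (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
                    ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
    snprintf(out, out_size, ".word 0x%08" PRIx32, word);
    return 4;
}

/* Decode a buffer one instruction at a time, advancing by the length the
 * decoder reports and stopping at the first undecodable position.
 * Returns the number of instructions printed. */
static size_t disassemble_buffer(decode_one_fn decode, void *ctx,
                                 uint8_t *bytes, uint64_t size,
                                 uint64_t base_pc, FILE *out)
{
    assert(decode != NULL);
    char text[128];
    uint64_t offset = 0;
    size_t count = 0;

    while (offset < size) {
        size_t len = decode(ctx, bytes + offset, size - offset,
                            base_pc + offset, text, sizeof text);
        if (len == 0)
            break; /* zero means "no valid instruction here" */
        fprintf(out, "0x%08" PRIx64 ": %s\n", base_pc + offset, text);
        offset += len;
        count++;
    }
    return count;
}
```

Calling disassemble_buffer(stub_decode_one, NULL, code, sizeof code, 0x1000, stdout) on a byte array walks it in 4-byte steps and stops when fewer than 4 bytes remain.  The loop itself does not care whether instructions are fixed-width (ARM) or variable-width (Thumb), since it only trusts the returned length.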

Kev

On Dec 19, 2011, at 1:23 AM, James Molloy wrote:

> Hi Aidan,
> 
> The easiest thing I can do is to point you to the source of the "llvm-mc" tool, which does exactly what you ask in its "-disassemble" mode. The code is rather small, so it should be easy to work out.
> 
> tools/llvm-mc
> 
> Cheers,
> 
> James
> 
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Aidan Steele
> Sent: 19 December 2011 04:30
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] Disassembly arbitrary machine-code byte arrays
> 
> Hi,
> 
> My apologies if this appears to be a very trivial question -- I have
> tried to solve this on my own and I am stuck. Any assistance that
> could be provided would be immensely appreciated.
> 
> What is the absolute bare minimum that I need to do to disassemble an
> array of, say, ARM machine code bytes? Or an array of Thumb machine
> code bytes? For example, I might have an array of unsigned chars --
> how could I go about decoding these into MCInst objects? Does such a
> decoding process take place in one fell swoop or do I parse the stream
> one instruction at a time? Can I ask it to "decode the next 10 bytes"?
> What follows is my (feeble) attempt at getting started. It probably
> doesn't help that I am only familiar with C and Objective-C and find
> C++ syntax absolutely bewildering.
> 
> Kind regards,
> Aidan Steele
> 
> int main (int argc, const char *argv[])
> {
> LLVMInitializeARMTargetInfo();
> LLVMInitializeARMTargetMC();
> LLVMInitializeARMAsmParser();
> LLVMInitializeARMDisassembler();
> 
> const llvm::Target Target;
> 
> llvm::OwningPtr<const llvm::MCSubtargetInfo>
> STI(Target.createMCSubtargetInfo("", "", ""));
> llvm::OwningPtr<const llvm::MCDisassembler>
> disassembler(Target.createMCDisassembler(*STI));
> 
> llvm::OwningPtr<llvm::MemoryBuffer> Buffer;
> llvm::MemoryBuffer::getFile(llvm::StringRef("/path/to/file.bin"), Buffer);
> llvm::MCInst Inst;
> uint64_t Size = 0;
> 
> disassembler->getInstruction(Inst, Size, *Buffer.take(), 0,
> llvm::nulls(), llvm::nulls());
> 
> //  llvm::StringRef TheArchString("arm-apple-darwin");
> //  std::string normalized = llvm::Triple::normalize(TheArchString);
> //
> //  llvm::Triple TheTriple;
> //  TheTriple.setArch(llvm::Triple::arm);
> //  TheTriple.setOS(llvm::Triple::Darwin);
> //  TheTriple.setVendor(llvm::Triple::Apple);
> //  llvm::Target *TheTarget = NULL;
> 
> return 0;
> }
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



