[PATCH] D52341: [mctoll] Initial changes for MC to LL raiser that takes a binary and raises it back to llvm bitcode

Aaron Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 20 22:35:10 PDT 2018


asmith created this revision.
asmith added a reviewer: llvm-commits.
Herald added subscribers: jfb, mgrang, aheejin, kristof.beyls, sbc100, mgorny.
Herald added a reviewer: javed.absar.

This is the initial set of changes for a new tool called llvm-mctoll that raises binaries back to LLVM bitcode. Currently there is support for raising Arm32 and x64 Linux ELF shared libraries and simple executables such as dhrystone.

Here is a summary of features in varying states of completion:

1. Function boundary identification. Analyzes the text section of an ELF input binary (executable or shared library) to identify function boundaries.

2. CFG construction. Builds the CFG for a function and the corresponding MachineFunction representation along with its constituent MachineBasicBlocks. The MachineFunction object is then used to materialize a Function object by raising the instructions of each MachineBasicBlock into the BasicBlocks of that Function.

3. Instruction raising. Stack accesses are abstracted to alloca instructions. Various abstract instruction classes are defined, such as memory referencing instructions, floating point instructions, register move instructions, binary operator instructions, etc.

4. Function prototype discovery. A MachineFunction is analyzed to create an abstract function prototype. The current implementation assumes that the binaries are 64-bit and are compiled from C sources. The function prototype discovery algorithm assumes C calling conventions and is limited to arguments passed in registers (functions with more than 6 arguments, which would be passed on the stack, are not implemented yet). Calls to variadic functions are discovered by analyzing the instructions. Linkage to external functions (such as those in glibc) is handled by maintaining a table of known function signatures.

5. Information from various sections of the ELF binary, such as the GOT, PLT, data sections and symbol table, is used to materialize machine-independent abstractions such as string constants, external call linkage, etc.

6. There are tests that try to cover much of the major functionality for both Arm32 and x64.
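
To illustrate the CFG construction step, here is a minimal Python sketch of the textbook "leaders" algorithm for splitting a linear instruction stream into basic blocks. It is not the code in this patch: the Instr record, the kind tags ("plain", "jump", "cond_jump", "ret") and build_cfg are hypothetical names made up for the example, and the input is assumed to be non-empty and sorted by address.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    addr: int                     # address of the instruction
    kind: str                     # hypothetical tag: plain, jump, cond_jump, ret
    target: Optional[int] = None  # branch target address, if any


def build_cfg(instrs):
    """Split a sorted, non-empty instruction list into basic blocks.

    Returns (blocks, succs): blocks maps a block's start address to its
    instructions, succs maps it to the start addresses of its successors.
    """
    addrs = [ins.addr for ins in instrs]
    known = set(addrs)

    # A leader is the first instruction, any in-function branch target,
    # and any instruction that follows a control transfer.
    leaders = {addrs[0]}
    for idx, ins in enumerate(instrs):
        if ins.kind in ("jump", "cond_jump", "ret"):
            if ins.target is not None and ins.target in known:
                leaders.add(ins.target)
            if idx + 1 < len(instrs):
                leaders.add(addrs[idx + 1])

    # Each block runs from one leader up to the next.
    blocks = {}
    current = None
    for ins in instrs:
        if ins.addr in leaders:
            current = ins.addr
            blocks[current] = []
        blocks[current].append(ins)

    # Edges: taken branch first, then fall-through to the next block.
    starts = sorted(blocks)
    succs = {}
    for n, start in enumerate(starts):
        last = blocks[start][-1]
        out = []
        if last.kind in ("jump", "cond_jump") and last.target in blocks:
            out.append(last.target)
        if last.kind in ("plain", "cond_jump") and n + 1 < len(starts):
            out.append(starts[n + 1])
        succs[start] = out
    return blocks, succs


if __name__ == "__main__":
    code = [
        Instr(0, "plain"),
        Instr(4, "cond_jump", 16),
        Instr(8, "plain"),
        Instr(12, "jump", 20),
        Instr(16, "plain"),
        Instr(20, "ret"),
    ]
    blocks, succs = build_cfg(code)
    for start in sorted(blocks):
        print(hex(start), "->", [hex(s) for s in succs[start]])
```

Branches that leave the function (targets outside the instruction list) are simply dropped here; a real raiser would have to treat them as calls or tail calls.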

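The abstraction of stack accesses to alloca instructions can be pictured roughly as follows. This is only a sketch under stated assumptions: accesses are given as (base register, offset, size) tuples, the frame register is assumed to be rbp, and allocate_stack_slots and the %stack.N naming are invented for the example rather than taken from the patch. The output lines are LLVM-IR-like text, not IR built through the LLVM API.

```python
def allocate_stack_slots(accesses, frame_reg="rbp"):
    """Map each distinct frame offset to one alloca-like stack slot.

    accesses: iterable of (base_register, offset, size_in_bytes) tuples.
    Returns (slots, allocas): slots maps a frame offset to a slot name,
    allocas is a list of LLVM-IR-like alloca lines in first-seen order.
    """
    slots = {}  # frame offset -> slot name
    sizes = {}  # frame offset -> widest access seen, in bytes
    for base, offset, size in accesses:
        if base != frame_reg:
            continue  # not a frame-relative access, so not a stack slot
        if offset not in slots:
            slots[offset] = f"%stack.{len(slots)}"
            sizes[offset] = size
        else:
            sizes[offset] = max(sizes[offset], size)

    allocas = [
        f"{slots[off]} = alloca i{sizes[off] * 8}, align {sizes[off]}"
        for off in slots
    ]
    return slots, allocas


if __name__ == "__main__":
    accesses = [("rbp", -4, 4), ("rbp", -8, 8), ("rax", 0, 8), ("rbp", -4, 4)]
    slots, allocas = allocate_stack_slots(accesses)
    for line in allocas:
        print(line)
```

Every later load or store at the same frame offset would then refer to the slot name instead of the raw rbp-relative address. Overlapping accesses of different sizes and stack-pointer-relative addressing are ignored in this sketch.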

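The table of known function signatures used for external linkage could look something like the sketch below. The table contents, the KNOWN_SIGNATURES name and declare_external are assumptions made for illustration and do not reflect the layout of ExternalFunctions.cpp; the signatures assume a 64-bit target and are rendered as typed-pointer LLVM IR declarations.

```python
# Hypothetical table: name -> (return type, fixed parameter types, is_variadic).
KNOWN_SIGNATURES = {
    "puts": ("i32", ["i8*"], False),
    "printf": ("i32", ["i8*"], True),
    "strlen": ("i64", ["i8*"], False),
}


def declare_external(name, table=KNOWN_SIGNATURES):
    """Return an LLVM-IR-like declaration for a known external function.

    Raises KeyError if the function is not in the table, since its
    prototype cannot be recovered from the call site alone in this sketch.
    """
    if name not in table:
        raise KeyError(f"no known signature for external function {name!r}")
    ret, params, variadic = table[name]
    args = list(params) + (["..."] if variadic else [])
    return f"declare {ret} @{name}({', '.join(args)})"


if __name__ == "__main__":
    for fn in ("puts", "printf", "strlen"):
        print(declare_external(fn))
```

A call through the PLT to a symbol such as printf would then be raised to a call against the matching declaration.
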
Repository:
  rL LLVM

https://reviews.llvm.org/D52341

Files:
  tools/llvm-mctoll/ARM/ARMEliminatePrologEpilog.cpp
  tools/llvm-mctoll/ARM/ARMEliminatePrologEpilog.h
  tools/llvm-mctoll/ARM/ARMFunctionPrototype.cpp
  tools/llvm-mctoll/ARM/ARMFunctionPrototype.h
  tools/llvm-mctoll/ARM/ARMMachineInstructionRaiser.cpp
  tools/llvm-mctoll/ARM/ARMMachineInstructionRaiser.h
  tools/llvm-mctoll/ARM/ARMRaiserBase.h
  tools/llvm-mctoll/ARM/CMakeLists.txt
  tools/llvm-mctoll/CMakeLists.txt
  tools/llvm-mctoll/COFFDump.cpp
  tools/llvm-mctoll/ELFDump.cpp
  tools/llvm-mctoll/EmitRaisedOutputPass.cpp
  tools/llvm-mctoll/EmitRaisedOutputPass.h
  tools/llvm-mctoll/ExternalFunctions.cpp
  tools/llvm-mctoll/ExternalFunctions.h
  tools/llvm-mctoll/LICENSE.TXT
  tools/llvm-mctoll/LLVMBuild.txt
  tools/llvm-mctoll/LLVMVersion.txt
  tools/llvm-mctoll/MCInstOrData.cpp
  tools/llvm-mctoll/MCInstOrData.h
  tools/llvm-mctoll/MCInstRaiser.cpp
  tools/llvm-mctoll/MCInstRaiser.h
  tools/llvm-mctoll/MachODump.cpp
  tools/llvm-mctoll/MachineFunctionRaiser.cpp
  tools/llvm-mctoll/MachineFunctionRaiser.h
  tools/llvm-mctoll/MachineInstructionRaiser.h
  tools/llvm-mctoll/ModuleRaiser.h
  tools/llvm-mctoll/README.md
  tools/llvm-mctoll/WasmDump.cpp
  tools/llvm-mctoll/X86/CMakeLists.txt
  tools/llvm-mctoll/X86/X86AdditionalInstrInfo.h
  tools/llvm-mctoll/X86/X86MachineInstructionRaiser.cpp
  tools/llvm-mctoll/X86/X86MachineInstructionRaiser.h
  tools/llvm-mctoll/llvm-mctoll.cpp
  tools/llvm-mctoll/llvm-mctoll.h
  tools/llvm-mctoll/test/CMakeLists.txt
  tools/llvm-mctoll/test/dhrystone/README
  tools/llvm-mctoll/test/dhrystone/dhry.h
  tools/llvm-mctoll/test/dhrystone/dhry.test
  tools/llvm-mctoll/test/dhrystone/dhry_funcs_mod.c
  tools/llvm-mctoll/test/dhrystone/dhry_main.c
  tools/llvm-mctoll/test/dhrystone/lit.local.cfg
  tools/llvm-mctoll/test/lit.cfg
  tools/llvm-mctoll/test/lit.site.cfg.in
  tools/llvm-mctoll/test/smoke_test/Inputs/factorial.c
  tools/llvm-mctoll/test/smoke_test/Inputs/fibfunc.c
  tools/llvm-mctoll/test/smoke_test/Inputs/globalvar.c
  tools/llvm-mctoll/test/smoke_test/Inputs/simple-phi.c
  tools/llvm-mctoll/test/smoke_test/Inputs/strcmp.c
  tools/llvm-mctoll/test/smoke_test/Inputs/test-1.c
  tools/llvm-mctoll/test/smoke_test/Inputs/test-2.c
  tools/llvm-mctoll/test/smoke_test/Inputs/test-3.c
  tools/llvm-mctoll/test/smoke_test/factorial-test.c
  tools/llvm-mctoll/test/smoke_test/fibfunc-test.c
  tools/llvm-mctoll/test/smoke_test/globalvar-test.c
  tools/llvm-mctoll/test/smoke_test/hello.c
  tools/llvm-mctoll/test/smoke_test/lit.local.cfg
  tools/llvm-mctoll/test/smoke_test/simple-phi-test.c
  tools/llvm-mctoll/test/smoke_test/strcmp-test.c
  tools/llvm-mctoll/test/smoke_test/test-1-test.c
  tools/llvm-mctoll/test/smoke_test/test-2-test.c
  tools/llvm-mctoll/test/smoke_test/test-3-test.c
  tools/llvm-mctoll/test/smoke_test/variadic-call-test.c
  tools/llvm-mctoll/test/unittests/ARM/FunctionPrototype/lit.local.cfg
  tools/llvm-mctoll/test/unittests/ARM/FunctionPrototype/six_args.c
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/lit.local.cfg
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/push_pop.s
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/push_pop_branch.s
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/stm_ldm_no_frame.s
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/stm_ldm_with_frame.s
  tools/llvm-mctoll/test/unittests/ARM/PrologEpilog/str_ldr.s
  tools/llvm-mctoll/test/unittests/ARMInstrs/add-dis.ll
  tools/llvm-mctoll/test/unittests/ARMInstrs/add.s
  tools/llvm-mctoll/test/unittests/ARMInstrs/harness.c
  tools/llvm-mctoll/test/unittests/ARMInstrs/inst-test.py




