[llvm-dev] Possible stack corruption during call to JITSymbol::getAddress()

David Lurton via llvm-dev llvm-dev at lists.llvm.org
Sun Apr 9 14:02:14 PDT 2017


Firstly, apologies if this is not the right place to be asking this
question--feel free to point me in the correct direction.  I could be doing
something wrong here but stackoverflow didn't feel like the correct place
for this since there's so little there about LLVM ORC.

Basically, I have a reproduction case (below) where if I throw an exception
before I call JITSymbol::getAddress() everything works properly but
throwing the same exception afterward will result in a SIGSEGV during stack
unwinding.  This suggests to me that somehow the stack is getting corrupted
during the JITSymbol::getAddress() call.
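
To make the "throw before vs. throw after" comparison concrete, here is a minimal sketch of the probe I'm effectively running by hand.  The names unwindsCleanlyAfter and throwProbe are hypothetical helpers made up for illustration; they are not part of LLVM or of the test case below.  Note that if unwinding really is broken the process will crash inside the probe rather than return false.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical helper: always throws, so the caller's catch block can only
// be reached through the unwinder.
static void throwProbe() { throw std::runtime_error("probe"); }

// Hypothetical helper: runs `suspect`, then throws and catches a probe
// exception.  Returns true if the exception was unwound and caught normally.
bool unwindsCleanlyAfter(const std::function<void()> &suspect) {
    suspect();
    try {
        throwProbe();
    } catch (const std::runtime_error &) {
        return true;  // reached only if unwinding succeeded
    }
    return false;  // not reached: throwProbe() always throws
}
```

Calling this once with an empty lambda and once with a lambda that calls symbol.getAddress() would correspond to the two throw positions in runTest(); a crash only in the second case points at that call.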

This problem was initially discovered while working on my own project.
While troubleshooting this I've discovered that when LLVM is built with
-DLLVM_USE_SANITIZER:STRING=Address the problem happens at a different point
during execution, perhaps having something to do with the padding around
the stack variables added by the sanitizer?  See the note after the call to
runTest() in main().

I'm running this on an up-to-date Antergos Linux with clang version 3.9.1
(I tried compiling LLVM and the example program below with gcc 6.3.1 and the
result is the same).  clang is set as the default compiler through the
following environment variables:

    CC=/usr/bin/clang
    CXX=/usr/bin/clang++

Commands used to build LLVM:

    git clone https://github.com/llvm-mirror/llvm.git
    cd llvm
    git checkout release_40
    mkdir build
    cd build
    cmake .. -DLLVM_BUILD_LLVM_DYLIB:BOOL=ON -DLLVM_ENABLE_RTTI:BOOL=ON \
        -DLLVM_ENABLE_EH:BOOL=ON -DLLVM_USE_SANITIZER:STRING=Address \
        -DLLVM_PARALLEL_COMPILE_JOBS:STRING=8 -DLLVM_ENABLE_ASSERTIONS:BOOL=ON
    cmake --build . -- -j 8
    sudo cmake --build . --target install

Command used to build test case executable:

    clang test.cpp -std=c++14 -lstdc++ -lLLVM-4.0 -Wall -pedantic -Wextra \
        -fstack-protector-all -fsanitize=address -fexceptions

Then of course:

    ./a.out

Output from a.out:

ASAN:DEADLYSIGNAL
=================================================================
==6582==ERROR: AddressSanitizer: SEGV on unknown address 0x7f59eeb06020 (pc
0x7f59f1b20930 bp 0x000000000001 sp 0x7ffc5e546218 T0)
==6582==The signal is caused by a READ memory access.


The result of running `backtrace` in GDB while execution is paused after
the SIGSEGV occurs:

#0  read_encoded_value_with_base (encoding=encoding@entry=28 '\034',
    base=base@entry=0, p=p@entry=0x7fffe8a06020 <error: Cannot access memory
    at address 0x7fffe8a06020>, val=val@entry=0x7fffffffd6d8)
    at /build/gcc/src/gcc/libgcc/unwind-pe.h:252
#1  0x00007fffeba05a61 in binary_search_single_encoding_fdes
    (pc=0x7fffeba04426 <_Unwind_Resume+54>, ob=0x0)
    at /build/gcc/src/gcc/libgcc/unwind-dw2-fde.c:908
#2  search_object (ob=ob@entry=0x60400001d9d0,
    pc=pc@entry=0x7fffeba04426 <_Unwind_Resume+54>)
    at /build/gcc/src/gcc/libgcc/unwind-dw2-fde.c:977
#3  0x00007fffeba05fdd in _Unwind_Find_registered_FDE
    (bases=0x7fffffffda78, pc=0x7fffeba04426 <_Unwind_Resume+54>)
    at /build/gcc/src/gcc/libgcc/unwind-dw2-fde.c:1013
#4  _Unwind_Find_FDE (pc=0x7fffeba04426 <_Unwind_Resume+54>,
    bases=bases@entry=0x7fffffffda78)
    at /build/gcc/src/gcc/libgcc/unwind-dw2-fde-dip.c:454
#5  0x00007fffeba02b23 in uw_frame_state_for
    (context=context@entry=0x7fffffffd9d0, fs=fs@entry=0x7fffffffd820)
    at /build/gcc/src/gcc/libgcc/unwind-dw2.c:1241
#6  0x00007fffeba03d40 in uw_init_context_1
    (context=context@entry=0x7fffffffd9d0,
    outer_cfa=outer_cfa@entry=0x7fffffffdc00, outer_ra=0x5110fc)
    at /build/gcc/src/gcc/libgcc/unwind-dw2.c:1562
#7  0x00007fffeba04427 in _Unwind_Resume (exc=0x60d00000c7b0)
    at /build/gcc/src/gcc/libgcc/unwind.inc:224
#8  0x00000000005110fc in runTest ()
    at /home/dave/projects/untitled/test.cpp:124
#9  0x0000000000511138 in main (argc=1, argv=0x7fffffffe698)
    at /home/dave/projects/untitled/test.cpp:132

My test-case is below.  In runTest(), note the commented out throw
statement before symbol.getAddress() and the uncommented one after it.
Also note the comments after the call to runTest() in main().

Thanks.


#include "llvm/ADT/STLExtras.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
#include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
#include "llvm/ExecutionEngine/Orc/ObjectLinkingLayer.h"
#include "llvm/IR/Mangler.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/TargetSelect.h"
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

using namespace llvm;
using namespace llvm::orc;

/** This class is taken verbatim from
 * https://github.com/llvm-mirror/llvm/blob/release_40/examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
 * This is from the same revision of LLVM I am using (the release_40 branch
 * as of 4/8/2017).
 */
class KaleidoscopeJIT {
private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    ObjectLinkingLayer<> ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer)> CompileLayer;

public:
    typedef decltype(CompileLayer)::ModuleSetHandleT ModuleHandle;

    KaleidoscopeJIT()
            : TM(EngineBuilder().selectTarget()),
              DL(TM->createDataLayout()),
              CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
        llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
    }

    TargetMachine &getTargetMachine() { return *TM; }

    ModuleHandle addModule(std::unique_ptr<Module> M) {
        // Build our symbol resolver:
        // Lambda 1: Look back into the JIT itself to find symbols that are
        //           part of the same "logical dylib".
        // Lambda 2: Search for external symbols in the host process.
        auto Resolver = createLambdaResolver(
                [&](const std::string &Name) {
                    if (auto Sym = CompileLayer.findSymbol(Name, false))
                        return Sym;
                    return JITSymbol(nullptr);
                },
                [](const std::string &Name) {
                    if (auto SymAddr =
                            RTDyldMemoryManager::getSymbolAddressInProcess(Name))
                        return JITSymbol(SymAddr, JITSymbolFlags::Exported);
                    return JITSymbol(nullptr);
                });

        // Build a singleton module set to hold our module.
        std::vector<std::unique_ptr<Module>> Ms;
        Ms.push_back(std::move(M));

        // Add the set to the JIT with the resolver we created above and a
        // newly created SectionMemoryManager.
        return CompileLayer.addModuleSet(std::move(Ms),
                                         make_unique<SectionMemoryManager>(),
                                         std::move(Resolver));
    }

    JITSymbol findSymbol(const std::string Name) {
        std::string MangledName;
        raw_string_ostream MangledNameStream(MangledName);
        Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
        return CompileLayer.findSymbol(MangledNameStream.str(), true);
    }

    void removeModule(ModuleHandle H) {
        CompileLayer.removeModuleSet(H);
    }
};

const std::string FUNC_NAME = "someFunction";

void runTest() {
    llvm::LLVMContext context;
    llvm::IRBuilder<> irBuilder{context};
    KaleidoscopeJIT jit;

    auto module = std::make_unique<llvm::Module>("help", context);
    module->setDataLayout(jit.getTargetMachine().createDataLayout());

    auto function = llvm::cast<llvm::Function>(
            module->getOrInsertFunction(FUNC_NAME,
                                        llvm::Type::getInt32Ty(context),
                                        nullptr));

    auto block = llvm::BasicBlock::Create(context, "functionBody", function);
    irBuilder.SetInsertPoint(block);

    irBuilder.CreateRet(
            llvm::ConstantInt::get(context, llvm::APInt(32, 1, true)));
    jit.addModule(std::move(module));

    llvm::JITSymbol symbol = jit.findSymbol(FUNC_NAME);

    //Just to ensure that the symbol is in fact valid (symbol evaluates to
    //true during execution)
    if(!symbol) {
        throw std::runtime_error("Symbol not found");
    }

    //when uncommented, the throw statement does NOT cause a SIGSEGV.
    //throw std::runtime_error("This should not crash.");
    uint64_t ptr = symbol.getAddress();
    //HOWEVER... a SIGSEGV occurs during stack-unwinding while throwing the
    //exception below.
    //Hence, the call to symbol.getAddress() must be causing some kind of
    //memory corruption.
    //My guess is that it's clobbering the stack.
    throw std::runtime_error("This should not crash but does anyway.");

    std::cout << "Ptr is " << ptr << "\n";

    int (*someFuncPtr)() = reinterpret_cast<int (*)()>(ptr);
    //int (*someFuncPtr)() = (int (*)())ptr;

    int returnValue = someFuncPtr();

    std::cout << "Return value is: " << returnValue << "\n";

}

int main(int argc, char **argv) {

    llvm::InitializeNativeTarget();
    llvm::InitializeAllAsmPrinters();

    try {
        runTest();

        //NOTE:  if LLVM is compiled without
        //-DLLVM_USE_SANITIZER:STRING=Address, the last throw in runTest()
        //does not cause a SIGSEGV, however this throw will.

        //throw std::runtime_error("This should not crash but does anyway.");
    } catch(std::runtime_error &e) {
        std::cout << "Exception caught: " << e.what() << "\n";
    }

    llvm::llvm_shutdown();
    return 0;
}