[LLVMdev] load bytecode from string for jiting problem

Vikas Bhargava vikasbhargava at gmail.com
Wed Mar 19 16:37:55 PDT 2014


A segmentation fault usually indicates memory corruption, and it is hard to
tell where without seeing exactly how the APIs are used. If possible, please
post a complete program and a gdb stack trace from the core file. If multiple
threads use the global variables, please let us know.

FWIW, I have some tests that write an llvm::Module to bitcode files and read
them back into an llvm::Module, and they work just fine with 3.4 (I never
tried with tip).

thx
vikas.
========


On Wed, Mar 19, 2014 at 2:58 PM, Willy WOLFF <willy.wolff at etu.unistra.fr> wrote:

> All of:
> ----
> // cout << "lsr: " << lsr << "\n";
> llvm::MemoryBuffer* mbjit =
>     llvm::MemoryBuffer::getMemBufferCopy (sr);
> ------
> string lsr = sr.str();
> // cout << "lsr: " << lsr << "\n";
> llvm::MemoryBuffer* mbjit =
>     llvm::MemoryBuffer::getMemBuffer (lsr);
> -------
> string lsr = sr.str();
> // cout << "lsr: " << lsr << "\n";
> llvm::MemoryBuffer* mbjit =
>     llvm::MemoryBuffer::getMemBufferCopy (lsr);
>
>
> give the same result: invalid bitcode.
> Valgrind does indeed report invalid reads under parseBitcodeFile:
>
> ==536== Conditional jump or move depends on uninitialised value(s)
> ==536==    at 0x501FE3: llvm::BitstreamCursor::Read(unsigned int) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x501A19:
> llvm::BitcodeReader::ParseBitcodeInto(llvm::Module*) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50AEC8: llvm::getLazyBitcodeModule(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50B295: llvm::parseBitcodeFile(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F1231: blah_runtime_hook (runtime.cpp:348)
> ==536==    by 0x4F46C2: ??? (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F2B60: main (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==
> ==536== Invalid read of size 8
> ==536==    at 0x501FE8: llvm::BitstreamCursor::Read(unsigned int) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x501A19:
> llvm::BitcodeReader::ParseBitcodeInto(llvm::Module*) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50AEC8: llvm::getLazyBitcodeModule(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50B295: llvm::parseBitcodeFile(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F1231: blah_runtime_hook (runtime.cpp:348)
> ==536==    by 0x4F46C2: ??? (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F2B60: main (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==536==
> ==536==
> ==536== Process terminating with default action of signal 11 (SIGSEGV)
> ==536==  Access not within mapped region at address 0x0
> ==536==    at 0x501FE8: llvm::BitstreamCursor::Read(unsigned int) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x501A19:
> llvm::BitcodeReader::ParseBitcodeInto(llvm::Module*) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50AEC8: llvm::getLazyBitcodeModule(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x50B295: llvm::parseBitcodeFile(llvm::MemoryBuffer*,
> llvm::LLVMContext&) (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F1231: blah_runtime_hook (runtime.cpp:348)
> ==536==    by 0x4F46C2: ??? (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
> ==536==    by 0x4F2B60: main (in
> /home/willy/blah_test_script/new_blah/simple_scev_dynamic_array)
>
>
>
> --
> Willy WOLFF
>
> On 19 Mar 2014, at 22:11, Vikas Bhargava wrote:
>
> Hi Willy,
> If the disassembly of the module works fine, then there is nothing wrong
> with the module.
>
> Stream uses the MemoryBuffer that you pass to parseBitcodeFile. If what
> Will is saying is true, there is something wrong with your code in "3:",
> i.e.:
>
>   MemoryBuffer* mbjit = MemoryBuffer::getMemBuffer (sr.str());
>   LLVMContext& context = getGlobalContext();
>   ErrorOr<Module*> ModuleOrErr = parseBitcodeFile (mbjit, context);
>   if (error_code EC = ModuleOrErr.getError())
>   {
>     std::cout << ModuleOrErr.getError().message() << "\n";
>     assert(false);
>   }
>
> Can you post how you modified it in your second reply? For debugging
> purposes, you can simply use MemoryBuffer::getMemBufferCopy() and not worry
> about the validity of the StringRef or null-termination. Also, you can run
> your program through valgrind and check for any invalid reads.
>
> HTH
> Vikas.
> =======
>
>
>
> On Wed, Mar 19, 2014 at 10:32 AM, Willy WOLFF <willy.wolff at etu.unistra.fr> wrote:
>
>> I made the change, and still have the problem.
>>
>> I investigated the LLVM source code further.
>>
>> First, I changed the isRawBitcode function to print the content of its
>> parameters, like this:
>> original: http://llvm.org/docs/doxygen/html/ReaderWriter_8h_source.html#l00081
>>
>>   inline bool isRawBitcode(const unsigned char *BufPtr,
>>                            const unsigned char *BufEnd) {
>>     // These bytes sort of have a hidden message, but it's not in
>>     // little-endian this time, and it's a little redundant.
>>     errs() << "isRawBitcode output:\n";
>>     for (int i = 0; i < 4; i++)
>>       errs() << BufPtr[i] << "\n";
>>     if (BufPtr != BufEnd)
>>       errs() << "BP != BE ok\n";
>>     if (BufPtr[0] == 'B')
>>       errs() << "B ok\n";
>>     if (BufPtr[1] == 'C')
>>       errs() << "C ok\n";
>>     if (BufPtr[2] == 0xc0)
>>       errs() << "0xc0 ok\n";
>>     if (BufPtr[3] == 0xde)
>>       errs() << "0xde ok\n";
>>
>>     return BufPtr != BufEnd &&
>>            BufPtr[0] == 'B' &&
>>            BufPtr[1] == 'C' &&
>>            BufPtr[2] == 0xc0 &&
>>            BufPtr[3] == 0xde;
>>   }
>>
>>
>> Second, I changed ParseBitcodeInto like this:
>> original: http://llvm.org/docs/doxygen/html/BitcodeReader_8cpp_source.html#l01971
>> ...
>>         errs() << "parsebitcodeinto sniff the signature\n";
>>         uint32_t bvar = Stream.Read(8);
>>         errs() << "B :" << bvar << "\n";
>>         if (bvar != 'B') {
>>                 errs() << "B :" << bvar << "\n";
>>                 return Error(InvalidBitcodeSignature);
>>         }
>>
>>         if (Stream.Read(8) != 'C') {
>>                 errs() << "C\n";
>>                 return Error(InvalidBitcodeSignature);
>>         }
>>         if (Stream.Read(8) != 0xc0) {
>>                 errs() << "0xc0\n";
>>                 return Error(InvalidBitcodeSignature);
>>         }
>>         if (Stream.Read(8) != 0xde) {
>>                 errs() << "0xde\n";
>>                 return Error(InvalidBitcodeSignature);
>>         }
>>         // if (Stream.Read(8) != 'B' ||
>>         //     Stream.Read(8) != 'C' ||
>>         //     Stream.Read(4) != 0x0 ||
>>         //     Stream.Read(4) != 0xC ||
>>         //     Stream.Read(4) != 0xE ||
>>         //     Stream.Read(4) != 0xD
>>         //      ) {
>> ...
>>
>>
>>
>> The output of the code is:
>>
>>
>> isRawBitcode output:
>> B
>> C
>>
>>
>> BP != BE ok
>>
>> B ok
>> C ok
>> 0xc0 ok
>> 0xde ok
>>
>> parsebitcodeinto sniff the signature
>> B :37
>> B :37
>>
>>
>>
>>
>> Is it possible that the Stream object is not correctly initialized?
>>
>>
>> On 03/13/2014 06:37 PM, Will Dietz wrote:
>>
>>> On Thu, Mar 13, 2014 at 9:02 AM, Willy WOLFF <willy.wolff at etu.unistra.fr>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am having a weird problem while writing a bytecode module to a string
>>>> and then reading/parsing it back for use with a JIT.
>>>>
>>>> I wrote a pass that exports a function to a module and puts this module
>>>> inside a global variable.
>>>> I use WriteBitcodeToFile for this.
>>>> For debugging, after this write, I try to load the exported module with
>>>> parseBitcodeFile.
>>>> These two steps work.
>>>>
>>>>
>>>>
>>>> Afterwards, while the compiled program is running, I try to read and
>>>> parse this global variable in order to JIT the function.
>>>>
>>>> 1) I read the global variable with
>>>>    StringRef sr (gv, gv_length);
>>>>
>>>> 2) I manually test this bytecode by
>>>> (inspired by  inline bool isRawBitcode(const unsigned char *BufPtr,
>>>> const unsigned char *BufEnd) at
>>>> http://llvm.org/docs/doxygen/html/ReaderWriter_8h_source.html#l00067)
>>>>    if (sr.str()[0] == 'B')
>>>>      std::cout << "B ok\n";
>>>>    if (sr.str()[1] == 'C')
>>>>      std::cout << "C ok\n";
>>>>    if (sr.str()[2] == (char) 0xc0)
>>>>      std::cout << "0xc0 ok\n";
>>>>    if (sr.str()[3] == (char) 0xde)
>>>>      std::cout << "0xde ok\n";
>>>>
>>>> 3) I try to parse the gv by
>>>>    MemoryBuffer* mbjit = MemoryBuffer::getMemBuffer (sr.str());
>>>>
>>>
>>> Not sure if this is your issue, but it should be fixed anyway:
>>>
>>> The std::string created by "sr.str()" ends its lifetime at the end of
>>> this statement, and MemoryBuffer, for efficiency reasons, avoids copying
>>> data it doesn't have to (like StringRef), so it will be referencing the
>>> freed memory.
>>>
>>> To resolve this, do one of the following:
>>> * Pass MemoryBuffer your StringRef directly
>>> * Use getMemBufferCopy()
>>> * Preserve the result of sr.str() in a stack variable and pass that
>>> to getMemBuffer() instead.
>>>
>>> As a final note, check whether your bitcode buffer "string" is
>>> null-terminated or not.  If it is not, take care to
>>> do things like informing MemoryBuffer that this is the case.
>>>
>>> Hope this helps,
>>> ~Will
>>>
>>>>    LLVMContext& context = getGlobalContext();
>>>>    ErrorOr<Module*> ModuleOrErr = parseBitcodeFile (mbjit, context);
>>>>    if (error_code EC = ModuleOrErr.getError())
>>>>    {
>>>>      std::cout << ModuleOrErr.getError().message() << "\n";
>>>>      assert(false);
>>>>    }
>>>>
>>>>
>>>>
>>>>
>>>> This is the execution result:
>>>> B ok
>>>> C ok
>>>> 0xc0 ok
>>>> 0xde ok
>>>> Invalid bitcode signature
>>>>
>>>>
>>>>
>>>> OK, it is not working :/
>>>> But why?
>>>>
>>>>
>>>>
>>>> For debugging, between 2) and 3), I export the module I read and write
>>>> it to a file on my hard drive,
>>>> and try llvm-dis, and the disassembly of the module works.
>>>>
>>>> What's wrong? Any idea how to solve this problem?
>>>>
>>>> Thank you very much.
>>>>
>>>> Regards,
>>>> Willy
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>
>
>