[LLVMdev] llvm-ranlib: Bus Error in regressions + fix

Reid Spencer reid at x10sys.com
Tue Nov 22 14:18:47 PST 2005


Evan,

Your patch uses an operating system call that is not portable. All non-portable 
code needs to be located in the lib/System library. I'm not sure why this 
problem appears on an old Red Hat system. Perhaps the C++ io library is not up 
to snuff on that platform? What compiler are you using?

Reid.

Evan Jones wrote:

> I ran the LLVM regression tests today (via make check) and noticed that 
> llvm-ranlib crashes with a Bus Error on my test system (a fairly old 
> RedHat 9 system), using the latest CVS version. I did some digging and I 
> think I know what the problem is, and I have attached a quick and dirty 
> patch that fixes the problem for me, but I need a suggestion about how 
> it should be integrated properly. Here are the details:
> 
> To reproduce the crash, run llvm-ranlib on the "GNU.a" file in the 
> llvm/test/Regression/Archive directory (make a copy first, because the 
> run corrupts the file). llvm-ranlib then crashes with a Bus Error.
> 
> The stack trace is:
> 
> #0  0x4207c1aa in memcpy () from /lib/tls/libc.so.6
> #1  0x400d55e8 in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, int) () from /usr/lib/libstdc++.so.5
> #2  0x4009c818 in std::basic_filebuf<char, std::char_traits<char> >::xsputn(char const*, int) () from /usr/lib/libstdc++.so.5
> #3  0x400cbed1 in std::ostream::write(char const*, int) ()
>    from /usr/lib/libstdc++.so.5
> #4  0x0829c9d0 in llvm::Archive::writeMember(llvm::ArchiveMember const&, std::basic_ofstream<char, std::char_traits<char> >&, bool, bool, bool) (
>     this=0x8356088, member=@0x8356180, ARFile=@0xbfffd630,
>     CreateSymbolTable=false, TruncateNames=false, ShouldCompress=false)
>     at ArchiveWriter.cpp:294
> #5  0x0829d297 in llvm::Archive::writeToDisk(bool, bool, bool) (
>     this=0x8356088, CreateSymbolTable=true, TruncateNames=false,
>     Compress=false) at ArchiveWriter.cpp:439
> #6  0x081a5618 in main (argc=2, argv=0xbfffd9b4) at llvm-ranlib.cpp:76
> #7  0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
> 
> 
> Frame #4 (Archive::writeMember) looks like this:
> 
>>   // Write the (possibly compressed) member's content to the file.
>>   ARFile.write(data,fSize);
> 
> 
> If I examine the backtrace, fSize equals 46, and "data" points to 46 
> null bytes. However, the "data" pointer is invalid, since if I inspect 
> it *before* the crash, the crash does not occur.
> 
> Frame #5 (Archive::writeToDisk) looks like this:
> 
>>       // If there is a foreign symbol table, put it into the file now. Most
>>       // ar(1) implementations require the symbol table to be first but llvm-ar
>>       // can deal with it being after a foreign symbol table. This ensures
>>       // compatibility with other ar(1) implementations as well as allowing the
>>       // archive to store both native .o and LLVM .bc files, both indexed.
>>       if (foreignST) {
>>         writeMember(*foreignST, FinalFile, false, false, false);
>>       }
> 
> 
> So I traced the foreignST pointer back, and at the point where it is set 
> the "data" pointer does *not* point to 46 null bytes. It points to valid 
> data mmap-ed from the archive file. But by the time writeMember is 
> called, that data pointer is no longer valid. Running "strace" on 
> llvm-ranlib solved the mystery. Here are the relevant calls:
> 
> open("temp.GNU.a", O_RDONLY)            = 13
> fstat64(13, {st_mode=S_IFREG|0600, st_size=4210, ...}) = 0
> mmap2(NULL, 8192, PROT_READ, MAP_PRIVATE, 13, 0) = 0x40017000
> 
> ** The source file is mapped, and a lot of stuff happens **
> 
> open("temp.GNU.a", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 15
> fstat64(15, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
> 
> ** Here the source file is TRUNCATED. Essentially, this invalidates the 
> data pointer. Two lines follow in the trace: **
> 
> _llseek(15, 0, [0], SEEK_CUR)           = 0
> --- SIGBUS (Bus error) @ 0 (0) ---
> 
> 
> 
> So the fix is pretty simple: before opening the file again, unlink it. 
> This has the effect of creating a *new* file, instead of overwriting the 
> old data. I've attached my quick-and-dirty patch, which will only work 
> on Unix. I'm not sure how this should be solved correctly. The other 
> strange part is that nobody else seems to have seen this problem. I 
> would think that it would occur pretty reliably on all systems. Any ideas?
> 
> Evan Jones
> 
> 
> -- 
> Evan Jones
> http://evanjones.ca/
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
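
The unlink-before-reopen idea described in the quoted message can be
sketched roughly as follows. This is not the attached patch (which is not
reproduced in this message) and it is not LLVM code. It is a minimal
POSIX-only illustration, and the helper names writeFile and
mappingSurvivesUnlinkAndRewrite are hypothetical. It maps a file, unlinks
the name, re-creates the file with O_TRUNC, and then checks that the old
mapping still holds the original bytes. Because it calls unlink() directly,
it has the same portability limitation raised in the reply above.

```cpp
// Sketch only: POSIX-specific, and not the actual LLVM patch.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstring>
#include <string>

// Hypothetical helper: writes `contents` to `path`, creating the file or
// truncating an existing one (the same open() flags seen in the strace).
static bool writeFile(const std::string &path, const std::string &contents) {
  int fd = open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if (fd < 0) return false;
  ssize_t n = write(fd, contents.data(), contents.size());
  close(fd);
  return n == static_cast<ssize_t>(contents.size());
}

// Hypothetical helper: maps `path`, unlinks it, re-creates it with
// `replacement`, and reports whether the old mapping still holds `original`.
bool mappingSurvivesUnlinkAndRewrite(const std::string &path,
                                     const std::string &original,
                                     const std::string &replacement) {
  if (original.empty() || !writeFile(path, original)) return false;

  int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0) return false;
  void *map = mmap(nullptr, original.size(), PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // The mapping stays valid after the descriptor is closed.
  if (map == MAP_FAILED) return false;

  // Remove the name first, so that the open() in writeFile creates a new
  // file instead of truncating the one that backs the mapping.
  if (unlink(path.c_str()) != 0) {
    munmap(map, original.size());
    return false;
  }
  bool ok = writeFile(path, replacement);

  // The old file contents are kept alive by the mapping, so this read is safe.
  ok = ok && std::memcmp(map, original.data(), original.size()) == 0;
  munmap(map, original.size());
  return ok;
}
```

Without the unlink() call, the second open() would truncate the file that
backs the mapping, and reading the mapped pages afterwards could raise
SIGBUS, which matches the strace output in the quoted message.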



