[LLVMdev] llvm-ranlib: Bus Error in regressions + fix
Reid Spencer
reid at x10sys.com
Tue Nov 22 14:18:47 PST 2005
Evan,
Your patch uses an operating system call that is not portable. All non-portable
code needs to be located in the lib/System library. I'm not sure why this
problem appears on an old Red Hat system. Perhaps the C++ io library is not up
to snuff on that platform? What compiler are you using?
Reid.
Evan Jones wrote:
> I ran the LLVM regression tests today (via make check) and noticed that
> llvm-ranlib crashes with a Bus Error on my test system (a fairly old
> RedHat 9 system), using the latest CVS version. I did some digging and I
> think I know what the problem is, and I have attached a quick and dirty
> patch that fixes the problem for me, but I need a suggestion about how
> it should be integrated properly. Here are the details:
>
> To reproduce the crash, run llvm-ranlib on the "GNU.a" file in the
> llvm/test/Regression/Archive directory (make a copy first, because
> llvm-ranlib corrupts the file). It then crashes with a Bus Error.
>
> The stack trace is:
>
> #0 0x4207c1aa in memcpy () from /lib/tls/libc.so.6
> #1 0x400d55e8 in std::basic_streambuf<char, std::char_traits<char>
> >::xsputn(char const*, int) () from /usr/lib/libstdc++.so.5
> #2 0x4009c818 in std::basic_filebuf<char, std::char_traits<char>
> >::xsputn(char const*, int) () from /usr/lib/libstdc++.so.5
> #3 0x400cbed1 in std::ostream::write(char const*, int) ()
> from /usr/lib/libstdc++.so.5
> #4 0x0829c9d0 in llvm::Archive::writeMember(llvm::ArchiveMember const&,
> std::basic_ofstream<char, std::char_traits<char> >&, bool, bool, bool) (
> this=0x8356088, member=@0x8356180, ARFile=@0xbfffd630,
> CreateSymbolTable=false, TruncateNames=false, ShouldCompress=false)
> at ArchiveWriter.cpp:294
> #5 0x0829d297 in llvm::Archive::writeToDisk(bool, bool, bool) (
> this=0x8356088, CreateSymbolTable=true, TruncateNames=false,
> Compress=false) at ArchiveWriter.cpp:439
> #6 0x081a5618 in main (argc=2, argv=0xbfffd9b4) at llvm-ranlib.cpp:76
> #7 0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
>
>
> Frame #4 (Archive::writeMember) looks like this:
>
>> // Write the (possibly compressed) member's content to the file.
>> ARFile.write(data,fSize);
>
>
> If I examine the backtrace, fSize equals 46, and "data" points to 46
> null bytes. However, the "data" pointer is invalid, since if I inspect
> it *before* the crash, the crash does not occur.
>
> Frame #5 (Archive::writeToDisk) looks like this:
>
>> // If there is a foreign symbol table, put it into the file now. Most
>> // ar(1) implementations require the symbol table to be first but llvm-ar
>> // can deal with it being after a foreign symbol table. This ensures
>> // compatibility with other ar(1) implementations as well as allowing the
>> // archive to store both native .o and LLVM .bc files, both indexed.
>> if (foreignST) {
>> writeMember(*foreignST, FinalFile, false, false, false);
>> }
>
>
> So I tracked back the foreignST pointer, and when it is set the "data"
> pointer is *not* 46 null bytes. It is valid data mmap-ed from the
> archive file. But when it gets to the call to writeMember, that data
> pointer is no longer valid. Running "strace" on llvm-ranlib solved the
> mystery. Here are the relevant calls:
>
> open("temp.GNU.a", O_RDONLY) = 13
> fstat64(13, {st_mode=S_IFREG|0600, st_size=4210, ...}) = 0
> mmap2(NULL, 8192, PROT_READ, MAP_PRIVATE, 13, 0) = 0x40017000
>
> ** The source file is mapped, and a lot of stuff happens **
>
> open("temp.GNU.a", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 15
> fstat64(15, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
>
> ** Here the source file is TRUNCATED. Essentially, this invalidates the
> data pointer. Two lines follow in the trace: **
>
> _llseek(15, 0, [0], SEEK_CUR) = 0
> --- SIGBUS (Bus error) @ 0 (0) ---
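
The effect of that second open() can be observed without triggering the crash. The sketch below is illustrative only and is not taken from ArchiveWriter.cpp; the helper name size_after_truncating_reopen is hypothetical, and POSIX semantics are assumed. It reopens a path with O_TRUNC while an earlier read-only descriptor is still held, then reports the file size as seen through the earlier descriptor. Both descriptors refer to the same inode, so the size drops to 0. Pages of an existing mapping that lie beyond the new end of the file are then no longer backed by anything, and touching them raises SIGBUS.

```cpp
// Sketch only: POSIX-specific, hypothetical helper name.
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <string>

// Returns the size of the file as seen through a descriptor that was opened
// *before* the same path is reopened with O_TRUNC, or -1 on error.
long size_after_truncating_reopen(const std::string &path) {
  assert(!path.empty());

  // Stands in for the descriptor that llvm-ranlib maps the archive from.
  int reader = open(path.c_str(), O_RDONLY);
  if (reader < 0) return -1;

  // Same flags as the second open() in the strace output above.
  int writer = open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if (writer < 0) {
    close(reader);
    return -1;
  }

  // The old descriptor now sees the truncated file, because both
  // descriptors name the same inode.
  struct stat st;
  long result = (fstat(reader, &st) == 0) ? static_cast<long>(st.st_size) : -1;

  close(writer);
  close(reader);
  return result;
}
```

Calling the helper on a non-empty scratch file should report a size of 0, even though the read-only descriptor was opened while the file still had its contents.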
>
>
>
> So the fix is pretty simple: before opening the file again, unlink it.
> This has the effect of creating a *new* file, instead of overwriting the
> old data. I've attached my quick-and-dirty patch, which will only work on
> Unix. I'm not sure how this should be solved correctly. The other
> strange part is that nobody else seems to have seen this problem. I would
> think that this would occur pretty reliably on all systems. Any ideas?
>
> Evan Jones
>
>
> --
> Evan Jones
> http://evanjones.ca/
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev