[LLVMbugs] [Bug 12891] New: Mach-O/i386: Indirect symbol incorrectly used/defined for symbol defined with .zerofill

bugzilla-daemon at llvm.org
Sat May 19 15:27:11 PDT 2012


http://llvm.org/bugs/show_bug.cgi?id=12891

             Bug #: 12891
           Summary: Mach-O/i386: Indirect symbol incorrectly used/defined
                    for symbol defined with .zerofill
           Product: libraries
           Version: trunk
          Platform: Macintosh
        OS/Version: MacOS X
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: cdavis at mymail.mines.edu
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


Created attachment 8596
  --> http://llvm.org/bugs/attachment.cgi?id=8596
IR demonstrating indirect symbol issue

On Darwin, if you define a symbol with .zerofill in module-level assembly:

asm(".zerofill SEG, SECT, _bigzerofill, 0x40000000\n");

and then refer to that symbol with an extern declaration:

extern char bigzerofill[0x40000000];
void do_something(void *pointer);

...

do_something(bigzerofill);

LLVM generates this code (i386) for it:

    movl  L_bigzerofill$non_lazy_ptr, %edi
    movl  %edi, (%esp)
    calll _do_something

...

    .section __IMPORT,__pointers,non_lazy_symbol_pointers
L_bigzerofill$non_lazy_ptr:
    .indirect_symbol _bigzerofill

When the assembler (either the system assembler or LLVM's integrated
assembler) resolves it, the indirect symbol resolves to the beginning of the
text section rather than to the symbol defined with the .zerofill directive.
As a result, when the app loads the non-lazy pointer at runtime, it gets the
wrong address, and anything that uses that address misbehaves. This is
particularly bad if the app, say, tries to mmap(2) a block of anonymous memory
at that address: the app will actually *overwrite its own text and data
segments*. In fact, this is why Wine, compiled at -O2 on Mac OS X, doesn't
work: it does exactly that, overwriting the environ pointer and crashing
when it later tries to read the environment.
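
To make the failure mode concrete, here is a minimal, portable C model of the
indirection. It is only a sketch: it does not use Mach-O, .zerofill, or a real
non-lazy pointer section. The names bigzerofill_model, text_start_model,
non_lazy_ptr_model, last_pointer_seen, call_through_slot, and run_model are
hypothetical stand-ins introduced for illustration; only do_something comes
from the report. The model assumes the slot is simply a pointer-sized cell that
the generated code loads before the call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Mach-O objects. */
char bigzerofill_model[64];  /* plays the role of _bigzerofill */
char text_start_model[64];   /* plays the role of the start of the text section */

/* Plays the role of L_bigzerofill$non_lazy_ptr: one pointer-sized slot that
   is expected to hold the address of the symbol. */
void *non_lazy_ptr_model = bigzerofill_model;

/* Records what the callee was handed, so the effect can be observed. */
void *last_pointer_seen = NULL;

void do_something(void *pointer) { last_pointer_seen = pointer; }

/* Same shape as the generated code: load the slot, then pass what was loaded.
   The caller never looks at the symbol itself, only at the slot. */
void call_through_slot(void) {
    void *loaded = non_lazy_ptr_model;
    do_something(loaded);
}

/* Returns 1 if a mis-resolved slot made the callee see the wrong address. */
int run_model(void) {
    call_through_slot();
    assert(last_pointer_seen == (void *)bigzerofill_model);

    /* Simulate the mis-resolution described above (an assumption of the
       model): the slot ends up pointing at the start of the text section. */
    non_lazy_ptr_model = text_start_model;
    call_through_slot();
    assert(last_pointer_seen == (void *)text_start_model);

    return last_pointer_seen != (void *)bigzerofill_model;
}
```

Because the caller trusts whatever the slot contains, a slot resolved to the
wrong address silently redirects every use of the symbol, which matches the
behavior seen with the mmap(2) call in Wine.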

I don't know exactly where the bug is. Is it that we're generating an indirect
symbol for this at all? Or is it that the assembler is resolving the indirect
symbol incorrectly?

I've attached a file containing some LLVM IR that I reduced from Wine. You can
reproduce this like so:

  $ llc -filetype=asm mach-o-i386-bad-indirect-symbol.ll \
        -o mach-o-i386-bad-indirect-symbol.s
  $ as -arch i386 mach-o-i386-bad-indirect-symbol.s \
        -o mach-o-i386-bad-indirect-symbol.o

or:

  $ llc -filetype=obj mach-o-i386-bad-indirect-symbol.ll \
        -o mach-o-i386-bad-indirect-symbol.o
