[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split
David Fang
fang at csl.cornell.edu
Wed Mar 27 17:33:12 PDT 2013
Hi Daniel,
ping. Have any time to review patch.2?
Fang
> Daniel and all,
> Attached is take 2 of the patch, just moved the edits from
> MCAssembler to MachObjectWriter. The two indirect symbol tests still get
> their indirect symbols re-ordered, this patch includes those diffs. Is this
> closer to what you have in mind?
> On my handful of test cases, this still produces ordering that is
> consistent with /usr/bin/as (vs. integrated-as).
>
> Fang
>
>> On Tue, Mar 19, 2013 at 2:27 AM, David Fang <fang at csl.cornell.edu> wrote:
>>
>> > Hi Daniel,
>> > Thanks for reviewing the patch. I thought I tried suggestion #1
>> > before this patch in
>> > http://llvm.org/bugs/show_bug.**cgi?id=14636#c33<http://llvm.org/bugs/show_bug.cgi?id=14636#c33>,
>> > which only reordeed IndirectSymBase without reordering WriteObject, and
>> > didn't quite produce the desired result. I may revisit it again.
>> > If I reorder IndirectSymBase, I need to also reorder in
>> > WriteObject, so am I allowed to move the IndirectSymbol (mapped by
>> > section)
>> > and IndirectSymbolSection (order of first appearance) structures out of
>> > MCAssembler into MachObjectWriter? I need the ordering information from
>> > the first pass of BindIndirectSymbols to persist until WriteObject,
>> > without
>> > having to sort twice. This transformation is intended for X86,PPC
>> > (probably ARM as well) +mach-o back-ends after all.
>> >
>>
>> I don't understand what the issue is here, you can of course add extra
>> data
>> structures into the MachObjectWriter class and compute them during
>> BindIndirectSymbols() and reuse them during WriteObject(). You shouldn't
>> need to change anything in the MCAssembler.
>>
>> Basically I am just suggesting moving the building up of the Mach-O
>> specific data structures into BindIndirectSymbols() and the
>> MachObjectWriter class.
>>
>> - Daniel
>>
>>
>> >
>> > Fang
>> >
>> > Hi David,
>> > >
>> > > Ok, here are my main high level comments on the patch. I have some
>> > > specific
>> > > nits too but I'll save them for later:
>> > >
>> > > #1. I would rather this patch not require modifying the MCAssembler
>> > > level
>> > > interface. Instead, what I would like is for the new data structures
>> > > (IndirectSymbols and IndirectSymbolSections) to get built up inside
>> > > BindIndirectSymbols().
>> > >
>> > > #2. I don't think this patch should actually cause test cases to
>> > > change.
>> > > Most of those test cases were generated based on matching existing
>> > > 'as'
>> > > output for x86, and the way I envisioned this patch working I wouldn't
>> > > have
>> > > expected it to change the order of symbols except in the case where
>> > > the
>> > > indirect symbols would be out of order.
>> > >
>> > > I believe that if you fix #1 then #2 will fall out, because the
>> > > indirect
>> > > symbol lists will get built up in the same order as they are now
>> > > (based on
>> > > iterating the indirect symbol list twice).
>> > >
>> > > - Daniel
>> > >
>> > >
>> > > On Thu, Mar 14, 2013 at 10:26 AM, David Fang <fang at csl.cornell.edu>
>> > > wrote:
>> > >
>> > > Sure, Daniel, you seem pretty active, I can ping again in a week.
>> > > >
>> > > > patch notes: The patch is written so that changing a single #if in
>> > > > MCAssembler.h at the top will effectively revert it (for ease of
>> > > > testing).
>> > > > A few typedefs for the new structures in MCAssembler are made public
>> > > > just
>> > > > for convenience. It looks like I'm deep in FIXME territory, judging
>> > > > by
>> > > > the
>> > > > comments in the touched files. Feel free to pass a revised patch
>> > > > back to
>> > > > me as you see fit.
>> > > >
>> > > > Fang
>> > > >
>> > > > Hi David,
>> > > >
>> > > > >
>> > > > > Thanks for working on a patch, I'll take a look but it may take me
>> > > > > a
>> > > > > couple
>> > > > > days to get to it.
>> > > > >
>> > > > > - Daniel
>> > > > >
>> > > > >
>> > > > > On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu>
>> > > > > wrote:
>> > > > >
>> > > > > Daniel,
>> > > > >
>> > > > > > The resulting LLVM test diffs with my patch are shown
>> > > > > > here:
>> > > > > > http://llvm.org/bugs/show_bug.******cgi?id=14636#c38<http://llvm.org/bugs/show_bug.****cgi?id=14636#c38>
>> > > > > > <http://**llvm.org/bugs/show_bug.**cgi?**id=14636#c38<http://llvm.org/bugs/show_bug.**cgi?id=14636#c38>
>> > > > > > >
>> > > > > > <http://**llvm.org/bugs/show_**bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=**14636#c38>
>> > > > > > <http://**llvm.org/bugs/show_bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > >
>> > > > > > 2 tests "failed" in MC/MachO: indirect-symbols.s and
>> > > > > > symbol-indirect.s
>> > > > > > No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>> > > > > >
>> > > > > > The differences are that symbols are re-ordered, which
>> > > > > > accomplishes the
>> > > > > > objective of the patch. There are FIXME comments that require a
>> > > > > > maintainer's scrutiny.
>> > > > > > Could you review these differences that result from my patch?
>> > > > > > The patch as-is now is in proof-of-concept form. I'd be happy
>> > > > > > to
>> > > > > > revise
>> > > > > > as you suggest.
>> > > > > > If this patch benefits all Mach-O architectures (getting closer
>> > > > > > to
>> > > > > > usable
>> > > > > > integrated assembly), I'd like to get this into trunk.
>> > > > > >
>> > > > > > Fang
>> > > > > >
>> > > > > >
>> > > > > > Michael, Daniel,
>> > > > > >
>> > > > > > The attached patch I wrote works for me so far on my
>> > > > > > small
>> > > > > > > test
>> > > > > > > cases: llvm/clang's object file matches the indirect symbol
>> > > > > > > ordering
>> > > > > > > of
>> > > > > > > the
>> > > > > > > /usr/bin/as-assembled object's (tested on my powerpc-darwin8
>> > > > > > > branch).
>> > > > > > > Michael, could you try it out on your own test cases?
>> > > > > > > Daniel, could you kindly comment and review? Just the
>> > > > > > > technical
>> > > > > > > merits,
>> > > > > > > not style.
>> > > > > > > I've also attached the same patch to bug 14636.
>> > > > > > >
>> > > > > > > Thanks for taking a look.
>> > > > > > >
>> > > > > > > Fang
>> > > > > > >
>> > > > > > > Daniel,
>> > > > > > >
>> > > > > > > Some quick questions about your proposed recipe:
>> > > > > > > >
>> > > > > > > > Ok, here is what I think the right fix is. Instead of
>> > > > > > > > creating the
>> > > > > > > >
>> > > > > > > > > IndirectSymBase mapping we use to associate sections
>> > > > > > > > > with their
>> > > > > > > > > > indirect offset start, in BindIndirectSymbols() we
>> > > > > > > > > > should:
>> > > > > > > > > >
>> > > > > > > > > > 1. Add a simple container struct (lets say > >
>> > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > MachSectionIndirectSymbols)
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > for tracking the per-section list of indirect symbols. It
>> > > > > > > > will
>> > > > > > > > > keep
>> > > > > > > > >
>> > > > > > > > > > the list of symbols in the section and the index of
>> > > > > > > > > > the first > >
>> > > > > > > > > >
>> > > > > > > > > > indirect
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > symbol in the section.
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > Like a map<section*, vector<indirect_symbol> >?
>> > > > > > > >
>> > > > > > > > Instead of keeping track of the first indirect symbol index,
>> > > > > > > > could
>> > > > > > > > we
>> > > > > > > > just
>> > > > > > > > as easily compute the index offsets later during the second
>> > > > > > > > pass:
>> > > > > > > >
>> > > > > > > > offset = 0;
>> > > > > > > > for each section S (in order)
>> > > > > > > > record offset for section S
>> > > > > > > > [optional] write vector of indirect symbols for S
>> > > > > > > > offset += S.num_indirect_symbols();
>> > > > > > > > end for
>> > > > > > > >
>> > > > > > > > I was able to fix the reserve1 field by doing something like
>> > > > > > > > this
>> > > > > > > > (bug
>> > > > > > > > 14636, comments 30-33), but I only later discovered that the
>> > > > > > > > indirect
>> > > > > > > > symbols needed to be reordered.
>> > > > > > > >
>> > > > > > > > 2. Keep a mapping from sections to the above type.
>> > > > > > > >
>> > > > > > > > > 3. Add a SetVector to record the order of the sections
>> > > > > > > > > that have
>> > > > > > > > > > indirect symbols.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > This I understand, I found SetVector and SmallSetVector
>> > > > > > > > > in ADT.
>> > > > > > > >
>> > > > > > > > 4. During BindIndirectSymbols() we maintain the above
>> > > > > > > > information
>> > > > > > > >
>> > > > > > > > > (populating the MachSectionIndirectSymbols per-section
>> > > > > > > > > symbol > >
>> > > > > > > > > >
>> > > > > > > > > > arrays).
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > BindIndirectSymbols looks like it need not change, because
>> > > > > > > > it's
>> > > > > > > > already
>> > > > > > > > operating on the assumption of ordered indirect symbols,
>> > > > > > > > right?
>> > > > > > > >
>> > > > > > > > 5. Once we have scanned all the symbols we make another
>> > > > > > > > pass over
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > the
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > sections (in the order seen via indirect symbols) and
>> > > > > > > > assign the >
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > start
>> > > > > > > > >
>> > > > > > > > > indices.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > as I outlined above (offset = start indices).
>> > > > > > > >
>> > > > > > > > 6. Update writing of the indirect symbol table to write
>> > > > > > > > in the
>> > > > > > > > same
>> > > > > > > >
>> > > > > > > > > order as traversed in #5.
>> > > > > > > > > >
>> > > > > > > > > > Does that make sense? It's more work than your patch
>> > > > > > > > > > but it
>> > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > (a) > > should
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > preserve binary compatibility with 'as' in situations
>> > > > > > > > where the
>> > > > > > > > >
>> > > > > > > > > > indirect symbols don't appear out of order w.r.t. the
>> > > > > > > > > > sections,
>> > > > > > > > > >
>> > > > > > > > > > (b) > > it
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > > makes somewhere more explicit the relationship between
>> > > > > > > > sections
>> > > > > > > > > and
>> > > > > > > > >
>> > > > > > > > > > their list of symbols being in contiguous order.
>> > > > > > > > > >
>> > > > > > > > > > - Daniel
>> > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > Will this re-ordering be acceptable to all Mach-O
>> > > > > > > > > > > back-ends?
>> > > > > > > > x86,
>> > > > > > > > ARM,
>> > > > > > > > PPC?
>> > > > > > > >
>> > > > > > > > Fang
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > --
>> > > > > > >
>> > > > > > David Fang
>> > > > > > http://www.csl.cornell.edu/~******fang/<http://www.csl.cornell.edu/~****fang/>
>> > > > > > <http://www.csl.**cornell.edu/~**fang/<http://www.csl.cornell.edu/~**fang/>
>> > > > > > > <
>> > > > > > http: //www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>> > > > > > http: //www.csl.cornell.edu/~**fang/<http://www.csl.cornell.edu/~fang/>
>> > > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > --
>> > > > David Fang
>> > > > http: //www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>> > > > http: //www.csl.cornell.edu/~**fang/
>> > > > http: <http://www.csl.cornell.edu/~fang/>>
>> > > >
>> > > >
>> > > >
>> > >
>> > --
>> > David Fang
>> > http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>> >
>> >
>>
>
>
--
David Fang
http://www.csl.cornell.edu/~fang/
More information about the llvm-commits
mailing list