[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split

David Fang fang at csl.cornell.edu
Wed Mar 27 17:33:12 PDT 2013


Hi Daniel,
 	ping.  Have any time to review patch.2?

Fang

> Daniel and all,
> 	Attached is take 2 of the patch, just moved the edits from 
> MCAssembler to MachObjectWriter.  The two indirect symbol tests still get 
> their indirect symbols re-ordered, this patch includes those diffs.  Is this 
> closer to what you have in mind?
> 	On my handful of test cases, this still produces ordering that is 
> consistent with /usr/bin/as (vs. integrated-as).
>
> Fang
>
>>  On Tue, Mar 19, 2013 at 2:27 AM, David Fang <fang at csl.cornell.edu> wrote:
>> 
>> >  Hi Daniel,
>> >          Thanks for reviewing the patch.  I thought I tried suggestion #1
>> >  before this patch in 
>> >  http://llvm.org/bugs/show_bug.**cgi?id=14636#c33<http://llvm.org/bugs/show_bug.cgi?id=14636#c33>,
>> >  which only reordeed IndirectSymBase without reordering WriteObject, and
>> >  didn't quite produce the desired result.  I may revisit it again.
>> >          If I reorder IndirectSymBase, I need to also reorder in
>> >  WriteObject, so am I allowed to move the IndirectSymbol (mapped by 
>> >  section)
>> >  and IndirectSymbolSection (order of first appearance) structures out of
>> >  MCAssembler into MachObjectWriter?  I need the ordering information from
>> >  the first pass of BindIndirectSymbols to persist until WriteObject, 
>> >  without
>> >  having to sort twice.  This transformation is intended for X86,PPC
>> >  (probably ARM as well) +mach-o back-ends after all.
>> > 
>>
>>  I don't understand what the issue is here, you can of course add extra
>>  data
>>  structures into the MachObjectWriter class and compute them during
>>  BindIndirectSymbols() and reuse them during WriteObject(). You shouldn't
>>  need to change anything in the MCAssembler.
>>
>>  Basically I am just suggesting moving the building up of the Mach-O
>>  specific data structures into BindIndirectSymbols() and the
>>  MachObjectWriter class.
>>
>>  - Daniel
>> 
>> 
>> > 
>> >  Fang
>> > 
>> >   Hi David,
>> > > 
>> > >  Ok, here are my main high level comments on the patch. I have some
>> > >  specific
>> > >  nits too but I'll save them for later:
>> > > 
>> > >  #1. I would rather this patch not require modifying the MCAssembler 
>> > >  level
>> > >  interface. Instead, what I would like is for the new data structures
>> > >  (IndirectSymbols and IndirectSymbolSections) to get built up inside
>> > >  BindIndirectSymbols().
>> > > 
>> > >  #2. I don't think this patch should actually cause test cases to 
>> > >  change.
>> > >  Most of those test cases were generated based on matching existing 
>> > >  'as'
>> > >  output for x86, and the way I envisioned this patch working I wouldn't
>> > >  have
>> > >  expected it to change the order of symbols except in the case where 
>> > >  the
>> > >  indirect symbols would be out of order.
>> > > 
>> > >  I believe that if you fix #1 then #2 will fall out, because the 
>> > >  indirect
>> > >  symbol lists will get built up in the same order as they are now 
>> > >  (based on
>> > >  iterating the indirect symbol list twice).
>> > > 
>> > >  - Daniel
>> > > 
>> > > 
>> > >  On Thu, Mar 14, 2013 at 10:26 AM, David Fang <fang at csl.cornell.edu>
>> > >  wrote:
>> > > 
>> > >   Sure, Daniel, you seem pretty active, I can ping again in a week.
>> > > > 
>> > > >  patch notes: The patch is written so that changing a single #if in
>> > > >  MCAssembler.h at the top will effectively revert it (for ease of
>> > > >  testing).
>> > > >  A few typedefs for the new structures in MCAssembler are made public 
>> > > >  just
>> > > >  for convenience.  It looks like I'm deep in FIXME territory, judging 
>> > > >  by
>> > > >  the
>> > > >  comments in the touched files.  Feel free to pass a revised patch 
>> > > >  back to
>> > > >  me as you see fit.
>> > > > 
>> > > >  Fang
>> > > > 
>> > > >   Hi David,
>> > > > 
>> > > > > 
>> > > > >  Thanks for working on a patch, I'll take a look but it may take me 
>> > > > >  a
>> > > > >  couple
>> > > > >  days to get to it.
>> > > > > 
>> > > > >  - Daniel
>> > > > > 
>> > > > > 
>> > > > >  On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu>
>> > > > >  wrote:
>> > > > > 
>> > > > >   Daniel,
>> > > > > 
>> > > > > >          The resulting LLVM test diffs with my patch are shown 
>> > > > > >  here:
>> > > > > >  http://llvm.org/bugs/show_bug.******cgi?id=14636#c38<http://llvm.org/bugs/show_bug.****cgi?id=14636#c38>
>> > > > > >  <http://**llvm.org/bugs/show_bug.**cgi?**id=14636#c38<http://llvm.org/bugs/show_bug.**cgi?id=14636#c38>
>> > > > > > > 
>> > > > > >  <http://**llvm.org/bugs/show_**bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=**14636#c38>
>> > > > > >  <http://**llvm.org/bugs/show_bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>> > > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > > 
>> > > > > > 
>> > > > > >  2 tests "failed" in MC/MachO: indirect-symbols.s and 
>> > > > > >  symbol-indirect.s
>> > > > > >  No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>> > > > > > 
>> > > > > >  The differences are that symbols are re-ordered, which 
>> > > > > >  accomplishes the
>> > > > > >  objective of the patch. There are FIXME comments that require a
>> > > > > >  maintainer's scrutiny.
>> > > > > >  Could you review these differences that result from my patch?
>> > > > > >  The patch as-is now is in proof-of-concept form.  I'd be happy 
>> > > > > >  to
>> > > > > >  revise
>> > > > > >  as you suggest.
>> > > > > >  If this patch benefits all Mach-O architectures (getting closer 
>> > > > > >  to
>> > > > > >  usable
>> > > > > >  integrated assembly), I'd like to get this into trunk.
>> > > > > > 
>> > > > > >  Fang
>> > > > > > 
>> > > > > > 
>> > > > > >   Michael, Daniel,
>> > > > > > 
>> > > > > >           The attached patch I wrote works for me so far on my 
>> > > > > >           small
>> > > > > > >  test
>> > > > > > >  cases: llvm/clang's object file matches the indirect symbol 
>> > > > > > >  ordering
>> > > > > > >  of
>> > > > > > >  the
>> > > > > > >  /usr/bin/as-assembled object's (tested on my powerpc-darwin8 
>> > > > > > >  branch).
>> > > > > > >  Michael, could you try it out on your own test cases?
>> > > > > > >  Daniel, could you kindly comment and review?  Just the 
>> > > > > > >  technical
>> > > > > > >  merits,
>> > > > > > >  not style.
>> > > > > > >           I've also attached the same patch to bug 14636.
>> > > > > > > 
>> > > > > > >  Thanks for taking a look.
>> > > > > > > 
>> > > > > > >  Fang
>> > > > > > > 
>> > > > > > >    Daniel,
>> > > > > > > 
>> > > > > > >      Some quick questions about your proposed recipe:
>> > > > > > > > 
>> > > > > > > >     Ok, here is what I think the right fix is. Instead of 
>> > > > > > > >     creating the
>> > > > > > > > 
>> > > > > > > > >    IndirectSymBase mapping we use to associate sections 
>> > > > > > > > >    with their
>> > > > > > > > > >    indirect offset start, in BindIndirectSymbols() we 
>> > > > > > > > > >    should:
>> > > > > > > > > > 
>> > > > > > > > > >     1. Add a simple container struct (lets say > >
>> > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > >   MachSectionIndirectSymbols)
>> > > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > >     for tracking the per-section list of indirect symbols. It 
>> > > > > > > >     will
>> > > > > > > > >  keep
>> > > > > > > > > 
>> > > > > > > > > >    the list of symbols in the section and the index of 
>> > > > > > > > > >    the first > >
>> > > > > > > > > > 
>> > > > > > > > > >     indirect
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > >     symbol in the section.
>> > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > >    Like a map<section*, vector<indirect_symbol> >?
>> > > > > > > > 
>> > > > > > > >  Instead of keeping track of the first indirect symbol index, 
>> > > > > > > >  could
>> > > > > > > >  we
>> > > > > > > >  just
>> > > > > > > >   as easily compute the index offsets later during the second 
>> > > > > > > >   pass:
>> > > > > > > > 
>> > > > > > > >   offset = 0;
>> > > > > > > >   for each section S (in order)
>> > > > > > > >     record offset for section S
>> > > > > > > >     [optional] write vector of indirect symbols for S
>> > > > > > > >     offset += S.num_indirect_symbols();
>> > > > > > > >   end for
>> > > > > > > > 
>> > > > > > > >  I was able to fix the reserve1 field by doing something like 
>> > > > > > > >  this
>> > > > > > > >  (bug
>> > > > > > > >  14636, comments 30-33), but I only later discovered that the
>> > > > > > > >  indirect
>> > > > > > > >   symbols needed to be reordered.
>> > > > > > > > 
>> > > > > > > >     2. Keep a mapping from sections to the above type.
>> > > > > > > > 
>> > > > > > > > >    3. Add a SetVector to record the order of the sections 
>> > > > > > > > >    that have
>> > > > > > > > > >    indirect symbols.
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > >    This I understand, I found SetVector and SmallSetVector 
>> > > > > > > > >    in ADT.
>> > > > > > > > 
>> > > > > > > >     4. During BindIndirectSymbols() we maintain the above 
>> > > > > > > >     information
>> > > > > > > > 
>> > > > > > > > >    (populating the MachSectionIndirectSymbols per-section 
>> > > > > > > > >    symbol > >
>> > > > > > > > > > 
>> > > > > > > > > >     arrays).
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > >  BindIndirectSymbols looks like it need not change, because 
>> > > > > > > >  it's
>> > > > > > > >  already
>> > > > > > > >   operating on the assumption of ordered indirect symbols, 
>> > > > > > > >   right?
>> > > > > > > > 
>> > > > > > > >     5. Once we have scanned all the symbols we make another 
>> > > > > > > >     pass over
>> > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > >   the
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > >     sections (in the order seen via indirect symbols) and 
>> > > > > > > >     assign the >
>> > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > >     start
>> > > > > > > > > 
>> > > > > > > > >     indices.
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > >    as I outlined above (offset = start indices).
>> > > > > > > > 
>> > > > > > > >     6. Update writing of the indirect symbol table to write 
>> > > > > > > >  in the
>> > > > > > > >  same
>> > > > > > > > 
>> > > > > > > > >    order as traversed in #5.
>> > > > > > > > > > 
>> > > > > > > > > >     Does that make sense? It's more work than your patch 
>> > > > > > > > > >     but it
>> > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > >   (a) > >   should
>> > > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > >     preserve binary compatibility with 'as' in situations 
>> > > > > > > >     where the
>> > > > > > > > > 
>> > > > > > > > > >    indirect symbols don't appear out of order w.r.t. the 
>> > > > > > > > > >    sections,
>> > > > > > > > > > 
>> > > > > > > > > >   (b) > >   it
>> > > > > > > > > 
>> > > > > > > > 
>> > > > > > > >     makes somewhere more explicit the relationship between 
>> > > > > > > >     sections
>> > > > > > > > >  and
>> > > > > > > > > 
>> > > > > > > > > >    their list of symbols being in contiguous order.
>> > > > > > > > > > 
>> > > > > > > > > >     - Daniel
>> > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > > > 
>> > > > > > > > > > >    Will this re-ordering be acceptable to all Mach-O 
>> > > > > > > > > > >    back-ends?
>> > > > > > > >  x86,
>> > > > > > > >  ARM,
>> > > > > > > >   PPC?
>> > > > > > > > 
>> > > > > > > >   Fang
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > >   --
>> > > > > > > 
>> > > > > >  David Fang
>> > > > > >  http://www.csl.cornell.edu/~******fang/<http://www.csl.cornell.edu/~****fang/>
>> > > > > >  <http://www.csl.**cornell.edu/~**fang/<http://www.csl.cornell.edu/~**fang/>
>> > > > > > >  <
>> > > > > > http: //www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>> > > > > > http: //www.csl.cornell.edu/~**fang/<http://www.csl.cornell.edu/~fang/>
>> > > > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > > > 
>> > > > >   --
>> > > >  David Fang
>> > > > http: //www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>> > > > http: //www.csl.cornell.edu/~**fang/ 
>> > > > http: <http://www.csl.cornell.edu/~fang/>>
>> > > > 
>> > > > 
>> > > > 
>> > > 
>> >  --
>> >  David Fang
>> >  http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>> > 
>> > 
>> 
>
>

-- 
David Fang
http://www.csl.cornell.edu/~fang/




More information about the llvm-commits mailing list