[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split
David Fang
fang at csl.cornell.edu
Tue Mar 19 02:27:42 PDT 2013
Hi Daniel,
Thanks for reviewing the patch. I thought I tried suggestion #1
before this patch in http://llvm.org/bugs/show_bug.cgi?id=14636#c33, which
only reordeed IndirectSymBase without reordering WriteObject, and didn't
quite produce the desired result. I may revisit it again.
If I reorder IndirectSymBase, I need to also reorder in
WriteObject, so am I allowed to move the IndirectSymbol (mapped by
section) and IndirectSymbolSection (order of first appearance) structures
out of MCAssembler into MachObjectWriter? I need the ordering information
from the first pass of BindIndirectSymbols to persist until WriteObject,
without having to sort twice. This transformation is intended for X86,PPC
(probably ARM as well) +mach-o back-ends after all.
Fang
> Hi David,
>
> Ok, here are my main high level comments on the patch. I have some specific
> nits too but I'll save them for later:
>
> #1. I would rather this patch not require modifying the MCAssembler level
> interface. Instead, what I would like is for the new data structures
> (IndirectSymbols and IndirectSymbolSections) to get built up inside
> BindIndirectSymbols().
>
> #2. I don't think this patch should actually cause test cases to change.
> Most of those test cases were generated based on matching existing 'as'
> output for x86, and the way I envisioned this patch working I wouldn't have
> expected it to change the order of symbols except in the case where the
> indirect symbols would be out of order.
>
> I believe that if you fix #1 then #2 will fall out, because the indirect
> symbol lists will get built up in the same order as they are now (based on
> iterating the indirect symbol list twice).
>
> - Daniel
>
>
> On Thu, Mar 14, 2013 at 10:26 AM, David Fang <fang at csl.cornell.edu> wrote:
>
>> Sure, Daniel, you seem pretty active, I can ping again in a week.
>>
>> patch notes: The patch is written so that changing a single #if in
>> MCAssembler.h at the top will effectively revert it (for ease of testing).
>> A few typedefs for the new structures in MCAssembler are made public just
>> for convenience. It looks like I'm deep in FIXME territory, judging by the
>> comments in the touched files. Feel free to pass a revised patch back to
>> me as you see fit.
>>
>> Fang
>>
>> Hi David,
>>>
>>> Thanks for working on a patch, I'll take a look but it may take me a
>>> couple
>>> days to get to it.
>>>
>>> - Daniel
>>>
>>>
>>> On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu> wrote:
>>>
>>> Daniel,
>>>> The resulting LLVM test diffs with my patch are shown here:
>>>> http://llvm.org/bugs/show_bug.****cgi?id=14636#c38<http://llvm.org/bugs/show_bug.**cgi?id=14636#c38>
>>>> <http://**llvm.org/bugs/show_bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>>>>>
>>>>
>>>>
>>>> 2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
>>>> No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>>>>
>>>> The differences are that symbols are re-ordered, which accomplishes the
>>>> objective of the patch. There are FIXME comments that require a
>>>> maintainer's scrutiny.
>>>> Could you review these differences that result from my patch?
>>>> The patch as-is now is in proof-of-concept form. I'd be happy to revise
>>>> as you suggest.
>>>> If this patch benefits all Mach-O architectures (getting closer to usable
>>>> integrated assembly), I'd like to get this into trunk.
>>>>
>>>> Fang
>>>>
>>>>
>>>> Michael, Daniel,
>>>>
>>>>> The attached patch I wrote works for me so far on my small test
>>>>> cases: llvm/clang's object file matches the indirect symbol ordering of
>>>>> the
>>>>> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
>>>>> Michael, could you try it out on your own test cases?
>>>>> Daniel, could you kindly comment and review? Just the technical merits,
>>>>> not style.
>>>>> I've also attached the same patch to bug 14636.
>>>>>
>>>>> Thanks for taking a look.
>>>>>
>>>>> Fang
>>>>>
>>>>> Daniel,
>>>>>
>>>>>> Some quick questions about your proposed recipe:
>>>>>>
>>>>>> Ok, here is what I think the right fix is. Instead of creating the
>>>>>>>> IndirectSymBase mapping we use to associate sections with their
>>>>>>>> indirect offset start, in BindIndirectSymbols() we should:
>>>>>>>>
>>>>>>>>> 1. Add a simple container struct (lets say > >
>>>>>>>>>>
>>>>>>>>> MachSectionIndirectSymbols)
>>>>>>
>>>>>>> for tracking the per-section list of indirect symbols. It will keep
>>>>>>>> the list of symbols in the section and the index of the first > >
>>>>>>>>
>>>>>>> indirect
>>>>>>
>>>>>>> symbol in the section.
>>>>>>>>
>>>>>>>
>>>>>> Like a map<section*, vector<indirect_symbol> >?
>>>>>>
>>>>>> Instead of keeping track of the first indirect symbol index, could we
>>>>>> just
>>>>>> as easily compute the index offsets later during the second pass:
>>>>>>
>>>>>> offset = 0;
>>>>>> for each section S (in order)
>>>>>> record offset for section S
>>>>>> [optional] write vector of indirect symbols for S
>>>>>> offset += S.num_indirect_symbols();
>>>>>> end for
>>>>>>
>>>>>> I was able to fix the reserve1 field by doing something like this (bug
>>>>>> 14636, comments 30-33), but I only later discovered that the indirect
>>>>>> symbols needed to be reordered.
>>>>>>
>>>>>> 2. Keep a mapping from sections to the above type.
>>>>>>>> 3. Add a SetVector to record the order of the sections that have
>>>>>>>> indirect symbols.
>>>>>>>>
>>>>>>>
>>>>>> This I understand, I found SetVector and SmallSetVector in ADT.
>>>>>>
>>>>>> 4. During BindIndirectSymbols() we maintain the above information
>>>>>>>> (populating the MachSectionIndirectSymbols per-section symbol > >
>>>>>>>>
>>>>>>> arrays).
>>>>>>
>>>>>> BindIndirectSymbols looks like it need not change, because it's
>>>>>> already
>>>>>> operating on the assumption of ordered indirect symbols, right?
>>>>>>
>>>>>> 5. Once we have scanned all the symbols we make another pass over
>>>>>>>>
>>>>>>> the
>>>>>>
>>>>>>> sections (in the order seen via indirect symbols) and assign the >
>>>>>>>>
>>>>>>> start
>>>>>>>
>>>>>>>> indices.
>>>>>>>>
>>>>>>>
>>>>>> as I outlined above (offset = start indices).
>>>>>>
>>>>>> 6. Update writing of the indirect symbol table to write in the same
>>>>>>>> order as traversed in #5.
>>>>>>>>
>>>>>>>>> Does that make sense? It's more work than your patch but it
>>>>>>>>>>
>>>>>>>>> (a) > > should
>>>>>>
>>>>>>> preserve binary compatibility with 'as' in situations where the
>>>>>>>> indirect symbols don't appear out of order w.r.t. the sections,
>>>>>>>>
>>>>>>> (b) > > it
>>>>>>
>>>>>>> makes somewhere more explicit the relationship between sections and
>>>>>>>> their list of symbols being in contiguous order.
>>>>>>>>
>>>>>>>>> - Daniel
>>>>>>>>>>
>>>>>>>>>
>>>>>> Will this re-ordering be acceptable to all Mach-O back-ends? x86,
>>>>>> ARM,
>>>>>> PPC?
>>>>>>
>>>>>> Fang
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>> David Fang
>>>> http://www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>>>> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>>
>>>>
>>>>
>>>>
>>>
>> --
>> David Fang
>> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>>
>>
>
--
David Fang
http://www.csl.cornell.edu/~fang/
More information about the llvm-commits
mailing list