[llvm] r211500 - Stop producing func.eh symbols on Darwin.

Iain Sandoe iain at codesourcery.com
Mon Jul 21 05:41:33 PDT 2014


Hi Nick,

On 20 Jul 2014, at 00:17, Nick Kledzik wrote:

> 
> On Jul 19, 2014, at 12:55 AM, Iain Sandoe <iain at codesourcery.com> wrote:
> 
>> Hi Nick, Rafael,
>> 
>> On 19 Jul 2014, at 00:29, Nick Kledzik wrote:
>> 
>>> I’ve been looking at __eh_frame processing today for lld so this is fresh in my mind.
>>> 
>>> Some background
>>> ------------------------
>>> For MacOSX 10.6, we (Apple) rolled out “compact unwind  info” as a way to reduce the size and runtime cost of C++ exceptions.  But because the transition of gcc->llvmgcc->clang was in progress, we did not get compiler support for this.  Instead the linker interpreted any dwarf unwind info it saw and if it seemed safe, converted that to compact unwind.  
>>> 
>>> Now that Apple uses just clang which can generate compact unwind, we want to get the linker out of the business of interpreting dwarf unwind info.  But if the linker just stopped doing the conversion, existing .o files (i.e. in archives) would regress.  So, instead the linker checks if the compact unwind section exists (__LD,__compact_unwind).  If not, it is an old .o file and the linker should convert any dwarf unwind info.  If the linker sees both compact unwind and dwarf unwind for a function, it picks the compact unwind and discards the dwarf. 
>>> 
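As a rough illustration of the selection rule described above, here is a minimal sketch in Python. It is not ld64 code: the function name, its arguments and the result labels are all made up for the example, and the handling of a function that has only dwarf info in a new-style object is an assumption, since the thread does not spell that case out.

```python
def choose_unwind(has_compact_unwind_section, dwarf_funcs, compact_funcs):
    """Return {function_name: unwind_source} following the rule in the thread.

    has_compact_unwind_section: True if the .o has a __LD,__compact_unwind section.
    dwarf_funcs / compact_funcs: sets of function names carrying each kind of info.
    All names and labels here are hypothetical, for illustration only.
    """
    chosen = {}
    if not has_compact_unwind_section:
        # Old .o file: the linker converts any dwarf unwind info it finds.
        for name in dwarf_funcs:
            chosen[name] = "dwarf-converted"
        return chosen
    # New .o file: compact unwind wins whenever both kinds are present.
    for name in compact_funcs:
        chosen[name] = "compact"
    # Assumption: a function with only dwarf info keeps its dwarf info.
    for name in dwarf_funcs - compact_funcs:
        chosen[name] = "dwarf"
    return chosen
```

For example, `choose_unwind(True, {"_main", "_foo"}, {"_main"})` picks compact unwind for `_main` and leaves `_foo` on dwarf.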
>>> The linker can parse the __eh_frame section because it is a well known format.  The linker does not need any labels in the section because the size of each CFI chunk is encoded in the chunk header.  The linker does not need relocations because it can interpret the bytes in each CFI that “point” to the CFI or function.  The way the linker does this: if the __eh_frame section has no relocations, it just parses the section as-is.  If there are relocations, it makes a copy of the section and has a special-case piece of code to “apply” the relocations to the copy, then parses that.
>>> 
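The point about chunk sizes being encoded in the chunk header can be sketched as follows. This is only an illustration in Python, not how ld64 is written; `walk_eh_frame` is a hypothetical helper, and it assumes little-endian section contents (x86 / x86_64) in the usual .eh_frame layout: a 4-byte length (with 0xFFFFFFFF announcing an 8-byte extended length), followed by a 4-byte ID that is zero for a CIE and non-zero for an FDE.

```python
import struct


def walk_eh_frame(data):
    """Yield (offset, kind, total_size) for each CFI record in raw __eh_frame bytes.

    No labels or symbols are needed: each record's length field says where
    the next record starts. Assumes little-endian data.
    """
    off = 0
    while off + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, off)
        if length == 0:
            break  # a zero length acts as a terminator
        header = 4
        if length == 0xFFFFFFFF:
            # Extended form: the real length is in the following 8 bytes.
            (length,) = struct.unpack_from("<Q", data, off + 4)
            header = 12
        # The ID field follows the length: 0 means CIE, anything else is an FDE.
        (cie_id,) = struct.unpack_from("<I", data, off + header)
        kind = "CIE" if cie_id == 0 else "FDE"
        total = header + length
        yield off, kind, total
        off += total
```

Calling `list(walk_eh_frame(section_bytes))` on the raw section contents gives the record boundaries without consulting the symbol table at all.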
>>> Current problem
>>> ----------------------
>>> The issue Steven is looking at is that various optimisations have been added to LLVM to remove the .eh labels and remove the relocations.  But the current state is busted for x86_64 when -mmacosx-version-min=10.5 (or earlier).  The .eh labels are gone (good), but the relocations are still used.  But because the labels are gone, the usual SUBTRACTOR/UNSIGNED external relocation pair cannot be used.  The relocations that are used are a combination that the mini dwarf interpreter in ld64 cannot handle.
>>> 
>>> I suspect that the code in LLVM which suppressed the relocations on the __eh_frame section is disabled for older OS (when there might be an older linker).  That same check should be done for the removal of the .eh labels.  In other words:
>>> 
>>> for 10.5 and earlier:  Have .eh labels and relocations on __eh_frame
>> 
>> FWIW, upstream GCC does not emit the .eh symbols when generating code for 10.5 (ld64-85.2.1**), and has not for a long time.  However it does ensure that a linker-visible symbol marks the __eh_frame start.  This saves the .eh symbols but still provides the necessary information to generate valid subtraction relocs.  Of course, on upstream GCC, the eh frames are being emitted as tables - and the equivalent scheme might not be so easy to achieve with .cfi_xxx.
>> 
>> This strategy corresponds to the ld from XCode 3.1.4 and has been well-exercised for a number of years.
>> 
>> For earlier OSX, upstream GCC still emits the .eh labels (although I'm not aware of anyone actually testing anything earlier than 10.4).
>> 
>> I guess, that if we want to be able to generate objects that can be linked by the XCode 2.5 linker, then we'd need to keep this pattern.
> Thanks for the info.
> 
>> 
>> Note1: regardless of ld64 support, other tools like nm, ranlib etc. will be broken by the macosx_version_min load command (which is now placed in MH_OBJECTS, regardless of the target OSX version).
> Ugh!  LC_VERSION_MIN_MACOSX has been around for a few years (in final linked images).  What version of those tools (nm, ranlib) have problems with LC_VERSION_MIN_MACOSX?  

cctools from XCode <= 3.2.6 (last public release on 10.6) [ar will not build archives, for example]

> Who is still using those old versions?

Anyone who wants to be able to use llvm-produced MH_OBJECTS on 10.6 or earlier - see below**

>  And if the LLVM project produced nm, ranlib, etc would that make it easier to update to newer tools? 

I have seen that Kevin is doing this, which is much appreciated (yes, in the end, it will make things easier) - see ** below.

>> Note2: It would be nice to get the target linker version (which is available in CompilerInvocation) passed to the back end so that we can do codegen based on the requirements of the linker, rather than guessing from the target OSX version.  Then if someone really wants to generate objects / convenience libs that can be consumed by XCode 3.1.4/2.5 they can put -mtarget-linker on the command line.  If they just want to generate an exe that will run on the earlier system, then they can take advantage of whatever up-to-date linker capabilities are present.
> That infrastructure would have been nice five years ago.  But at this point we want to switch over to using lld as the “system linker” for MacOSX.  In that model, the linker is built with the compiler, so they are always paired and thus there is no need for checking the linker version.

This is great and also much appreciated (perhaps the #8013 bots could have an lld entry, since I implemented the Makefiles?)
I build & test lld along with the other components now on 10.5, 10.6, (10.7 occasionally), 10.8 and 10.9.

However, we're probably still some way away from that integrated system tho, right?

-----

** The resolution of the issues about older toolchains ===

I guess it depends on objectives - some possibilities are:

(a) People want to be able to produce exes that will run on 10.5 and 10.6 - but they are OK with using a modern box and OS as the toolchain host.
  - in this case the only constraint is that the final exes/dylibs are compatible with the target dyld.
  - for the particular issue discussed, perhaps one could teach ld64 to recognise the LC_VERSION_MIN_MACOSX as an indication that it could apply the "post 10.6" rules to x86-64 eh_frame sections (i'm considering doing this for my composite toolset).

(b) People want to achieve (a) by self-hosting on the earlier systems.
  - my personal opinion is that, realistically, this requires an updated toolchain including newer cctools, ld64 and (if one wants an assembler) probably GAS, since .cfi_xxxx are mandatory now.
  - I have a mostly working toolset [back-ported cctools 855 + ld64-127.2+additions and fixes + x86 and PPC darwin GAS ports] for this, but it needs tidying before public posting.
  - if the route above is taken, then we could just stick with using modern object files (if one is building a self-hosting toolchain, then it might as well be complete).
  ^^ just my 0.02GBP - packagers might have a different perspective.

(c) People want to be able to produce MH_OBJECTS/convenience libs that can be used with the stock XCode releases (3.2.6, 3.1.4 and 2.5) on 10.6, 10.5, 10.4 ...
  - in this case there is no alternative to altering code-gen to produce MH_OBJECTS that would be understood by the target.
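
To make the idea under (a) a little more concrete, here is a minimal sketch (Python, not ld64 code) of detecting LC_VERSION_MIN_MACOSX in an object file. The function name `min_macosx_version` is hypothetical; the sketch assumes a thin, little-endian Mach-O file and relies on the load command value 0x24 and the packed xxxx.yy.zz version encoding from <mach-o/loader.h>.

```python
import struct

MH_MAGIC, MH_MAGIC_64 = 0xFEEDFACE, 0xFEEDFACF
LC_VERSION_MIN_MACOSX = 0x24


def min_macosx_version(obj):
    """Return (major, minor, patch) from LC_VERSION_MIN_MACOSX, or None if absent.

    Handles only thin little-endian Mach-O files (no fat/universal wrapper).
    """
    (magic,) = struct.unpack_from("<I", obj, 0)
    if magic == MH_MAGIC_64:
        header_size = 32  # mach_header_64: eight 32-bit fields
    elif magic == MH_MAGIC:
        header_size = 28  # mach_header: seven 32-bit fields
    else:
        raise ValueError("not a thin little-endian Mach-O file")
    # ncmds is the fifth 32-bit field of the header.
    (ncmds,) = struct.unpack_from("<I", obj, 16)
    off = header_size
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", obj, off)
        if cmd == LC_VERSION_MIN_MACOSX:
            # version_min_command: cmd, cmdsize, version, sdk
            (version,) = struct.unpack_from("<I", obj, off + 8)
            return (version >> 16, (version >> 8) & 0xFF, version & 0xFF)
        off += cmdsize
    return None
```

A linker could treat a non-None result as the hint that the object came from a recent compiler, and only then apply the newer eh_frame rules; what it should do with the version number itself is left open here.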

====

My feel is that (a) is probably the most likely endgame - but building cross toolchains is harder for folks to deal with in general, so in the current phase of llvm/clang (b) still has some firm attractions.

To achieve (b) one needs to bootstrap via a second (c++11) compiler.  This is feasible using gcc-4.8/9 right now (although that's another compiler to build).  clang/llvm 3.4 with some backported fixes is possibly also feasible, but that still requires some tweaking to external toolchain components.

===

I've been working on finishing the PPC Darwin port, and this weekend I managed to build an optimised stage#2 ppc-darwin9 clang/llvm for the first time -  using the toolchain additions above - it's undoubtedly rough around the edges, but quite promising (at least for 3.6/3.5.1).  
i686-darwin9 and *-darwin10 already self-host 3.5 without too many issues - modulo cctools/ld64 backports.

Of course, this is (mostly) hobby-stuff, and we must take steps to ensure that it does not impact on the mainstream work - but I think we're mostly achieving that.


cheers
Iain


> 
> -Nick
> 
> 
>> 
>> ** this assumes that it's a reasonable constraint to require the developer to use the latest official XCode release for a given platform.
>> 
>>> for 10.6 and later:  Have no .eh labels and no relocations on __eh_frame
>>> 
>>> -Nick
>>> 
>>> On Jul 17, 2014, at 2:23 PM, Steven Wu <stevenwu at apple.com> wrote:
>>>> Hi Rafael
>>>> 
>>>> I don’t know if you are expecting this, but this commit breaks the old x86_64 OS X systems (10.5 and below) which need __eh_frame. It will fail to compile any programs for old OS X (with the “-mmacosx-version-min” flag). I attached two simple test cases which give slightly different error messages:
>>>> 
>>>> $ clang -arch x86_64 -Os -mmacosx-version-min=10.4 test_case.c -o test.o
>>>> ld: sectionForAddress(0xFFFFFFFFFF91969F) address not in any section file '/var/folders/cg/pyj86ks15y74w3hkwhm_zltm0000gp/T/test_case-cd1eb6.o' for architecture x86_64
>>>> clang-3.5: error: linker command failed with exit code 1 (use -v to see invocation)
>>>> $ clang -arch x86_64 -Os -mmacosx-version-min=10.4 test_case2.c -o test
>>>> ld: symbol index out of range file '/var/folders/cg/pyj86ks15y74w3hkwhm_zltm0000gp/T/test_case2-9a07c6.o' for architecture x86_64
>>>> clang-3.5: error: linker command failed with exit code 1 (use -v to see invocation)
>>>> 
>>>> The reason for the failure is that the OS X linker cannot recognize the generated relocations in __eh_frame. The current clang will create a relocation section that looks like:
>>>> Relocation information (__TEXT,__eh_frame) 4 entries
>>>> address  pcrel length extern type    scattered symbolnum/value
>>>> 00000048 False quad   False  SUB     False     2 (__TEXT,__eh_frame)
>>>> 00000048 False quad   True   UNSIGND False     _main
>>>> 00000020 False quad   False  SUB     False     2 (__TEXT,__eh_frame)
>>>> 00000020 False quad   True   UNSIGND False     _foo
>>>> But the linker is expecting either a pair of SUB/UNSIGNED relocations in eh_frame (for old systems) or no relocations (for new systems). A fix might just be to remove all the relocations in eh_frame. I CCed Nick in this thread, so Nick, correct me if I am wrong.
>>>> 
>>>> Thanks
>>>> 
>>>> Steven
>>>> 
>>>> <test_case.c>
>>>> <test_case2.c>
>>>> 
>>>>> Author: rafael
>>>>> Date: Mon Jun 23 10:13:23 2014
>>>>> New Revision: 211500
>>>>> 
>>>>> URL: 
>>>>> http://llvm.org/viewvc/llvm-project?rev=211500&view=rev
>>>>> 
>>>>> Log:
>>>>> Stop producing func.eh symbols on Darwin.
>>>>> 
>>>>> According Nick Kledzik (
>>>>> http://llvm.org/bugs/show_bug.cgi?id=19430#c2
>>>>> ):
>>>>> "... mach-o no longer needs names in the __eh_frame section (and has not for
>>>>> years)."
>>>>> 
>>>>> Iain Sandoe confirms it is also unnecessary for their old darwin support.
>>>>> 
>>>>> Removed:
>>>>>   llvm/trunk/test/MC/MachO/eh-symbols.s
>>>>> Modified:
>>>>>   llvm/trunk/include/llvm/MC/MCObjectFileInfo.h
>>>>>   llvm/trunk/lib/MC/MCDwarf.cpp
>>>>>   llvm/trunk/lib/MC/MCObjectFileInfo.cpp
>>>>>   llvm/trunk/test/MC/MachO/eh-frame-reloc.s
>>>>> 
>>>>> Modified: llvm/trunk/include/llvm/MC/MCObjectFileInfo.h
>>>>> 
>>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
> 




