[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 7 11:29:40 PST 2017


I mean, yes, it would be great if MachO were a better object file format.
But it has a limited number of sections, so .subsections_via_symbols and
atoms work around that problem. The only way to do better in the long term
is to use a better object file format, either new or existing. Every so
often somebody takes a look at that, but so far Apple has continued to
extend MachO as necessary to support their use cases.

On Tue, Mar 7, 2017 at 11:23 AM, James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> My point was more that it'd be great if symbols and sections and dead-code
> stripping, in general, would work more like they do elsewhere. ISTM that'd
> make things cleaner generally. No special case for offset aliases would be
> needed, nor for features like prefix data. They'd just work by virtue of
> generic handling for llvm sections turning into the appropriate symbol
> flags. (It's a shame that defining the start of a subsection wasn't made
> opt-IN for a symbol in macho, but oh well.)
>
> Now, one of the *consequences* of hooking up the infrastructure that way
> is that -fno-{function,data}-sections would automatically work. But I was
> thinking of that independently of whether a user might want to actually
> specify that.
>
> However -- there is actually another possible reason to disable it: when
> it's disabled, you avoid inserting relocations for references to local data
> and functions. That can sometimes allow the use of shorter instructions to
> reference nearby data -- so, in fact, it's possible for
> -f{function,data}-sections to make your FINAL binary larger than it would
> be without.
>
>
> On Tue, Mar 7, 2017 at 1:12 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>>
>> On Mar 6, 2017, at 7:56 PM, James Y Knight via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Oh, that's great that it's possible to implement properly, now. Does it
>> actually work for
>>
>> It'd be cool if LLVM hooked up its generic section handling support to
>> this feature now, so that the only global symbols that *didn't* get marked
>> as .alt_entry were those at the beginning of what llvm would consider
>> sections.
>>
>> Then apple platforms could behave sanely, like all other platforms do,
>> only with -f{function,data}-sections defaulted to on instead of off.
>>
>>
>> What is the advantage of not using -f{function,data}-sections? (i.e. what
>> isn’t sane about it?)
>>
>> I’m asking because I was told the only reason not to use it all the time
>> on ELF is that it makes intermediate object files larger.
>>
>> Thanks,
>>
>> Mehdi
>>
>> (That is, if you specify -fno-function-sections -fno-data-sections, it
>> could mark nearly *everything* as .alt_entry -- except the first symbol in
>> the object file)
>>
>>
>> On Mon, Mar 6, 2017 at 2:35 PM, Peter Collingbourne <peter at pcc.me.uk>
>> wrote:
>>
>>> That is in theory what omitting the .subsections_via_symbols directive
>>> is supposed to do, but in an experiment I ran a year or two ago I found
>>> that the Mach-O linker was still dead stripping on symbol boundaries with
>>> this directive omitted.
>>>
>>> In any case, a more precise approach has more recently (~a few months
>>> ago) become possible. There is a relatively new asm directive called
>>> .alt_entry that, as I understand it, tells the linker to disregard a given
>>> symbol as a section boundary (LLVM already uses this for aliases pointing
>>> into the middle of a global). So what you would do is to use .alt_entry on
>>> the function symbol, with an internal symbol appearing before the prefix
>>> data to ensure that it is not considered part of the body of the previous
>>> function.
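
A toy Python model of the scheme above may make it easier to follow. It is a sketch under loud assumptions, not the linker's real algorithm: the Symbol class, live_ranges, prefix_survives, the l_main_prefix name and the 64-byte section are all invented for illustration, and "dead stripping" here only keeps atoms that directly contain a root symbol (no reference graph).

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    offset: int              # byte offset within the section
    alt_entry: bool = False  # models marking the symbol with .alt_entry


def live_ranges(section_size, symbols, roots):
    """Toy dead-strip: keep only atoms that contain a root symbol."""
    ordered = sorted(symbols, key=lambda s: s.offset)
    # Only symbols NOT marked alt_entry start a new atom in this model.
    starts = [s.offset for s in ordered if not s.alt_entry]
    kept = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else section_size
        members = {s.name for s in ordered if start <= s.offset < end}
        if members & set(roots):
            kept.append((start, end))
    return kept


def prefix_survives(kept, prefix_offset=0):
    """True if the byte at prefix_offset lies inside a kept atom."""
    return any(start <= prefix_offset < end for start, end in kept)


# Today: only _main exists; the 4 bytes of prefix data precede every symbol.
before = live_ranges(64, [Symbol("_main", 4)], roots=["_main"])

# Proposed: an internal symbol (hypothetical name) sits at the prefix data
# and _main is marked alt_entry, so both share one atom.
after = live_ranges(
    64,
    [Symbol("l_main_prefix", 0), Symbol("_main", 4, alt_entry=True)],
    roots=["_main"],
)

print(prefix_survives(before), prefix_survives(after))  # False True
```

In the first layout the prefix bytes belong to no kept atom; in the second a reference to _main keeps the whole range starting at the internal symbol.
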
>>>
>>> Peter
>>>
>>> On Mon, Mar 6, 2017 at 11:19 AM, James Y Knight via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> AFAIK, this cannot actually work on Apple platforms, because its object
>>>> file format (Mach-O) doesn't use sections to determine the ranges of
>>>> code/data to keep together, but instead _infers_ boundaries based on the
>>>> range between global symbols in the symbol table.
>>>>
>>>> So, the symbol pointing to the beginning of @main *necessarily* makes
>>>> that be a section boundary.
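
For anyone who wants to see why the symbol for @main splits things this way, here is a toy model in Python. It only sketches the idea described above and is not how the Mach-O linker is implemented: the Symbol class, split_into_atoms, the "<anonymous>" atom and the 64-byte section size are made up for illustration, and the layout assumes 4 bytes of prefix data directly before _main.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    offset: int  # byte offset of the symbol within the section


def split_into_atoms(section_size, symbols):
    """Toy model: every symbol starts an atom that runs to the next symbol."""
    ordered = sorted(symbols, key=lambda s: s.offset)
    atoms = []
    # Assumption of this model: bytes before the first symbol form an
    # unnamed atom that no symbol refers to.
    if ordered and ordered[0].offset > 0:
        atoms.append(("<anonymous>", 0, ordered[0].offset))
    for i, sym in enumerate(ordered):
        end = ordered[i + 1].offset if i + 1 < len(ordered) else section_size
        atoms.append((sym.name, sym.offset, end))
    return atoms


# Hypothetical layout: prefix data at offset 0, _main at offset 4.
atoms = split_into_atoms(section_size=64, symbols=[Symbol("_main", 4)])
main_atom = next(a for a in atoms if a[0] == "_main")
prefix_in_main_atom = main_atom[1] <= 0 < main_atom[2]

print(atoms)                # [('<anonymous>', 0, 4), ('_main', 4, 64)]
print(prefix_in_main_atom)  # False
```

The prefix bytes at offset 0 fall outside the _main atom, so in this model nothing that references _main keeps them alive.
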
>>>>
>>>> I think the best that could be done in LLVM is to not emit the
>>>> ".subsections_via_symbols" asm directive (effectively disabling dead
>>>> stripping on that object) if any prefix data exists. Currently it emits
>>>> that flag unconditionally for MachO.
>>>>
>>>> On Mon, Mar 6, 2017 at 4:40 AM, Moritz Angermann via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just came across a rather annoying behavior with llvm 3.9. Assuming
>>>>> the following
>>>>> sample code in test.ll:
>>>>>
>>>>> ; Let's have some global int x = 10
>>>>> @x = global i32 10, align 4
>>>>> ; and two strings "p = %d\n" for the prefix data,
>>>>> ; as well as "x = %d\n" to print the (global) x value.
>>>>> @.str = private unnamed_addr constant [8 x i8] c"x = %d\0A\00", align 1
>>>>> @.str2 = private unnamed_addr constant [8 x i8] c"p = %d\0A\00", align 1
>>>>>
>>>>> ; declare printf, we'll use this later for printf style debugging.
>>>>> declare i32 @printf(i8*, ...)
>>>>>
>>>>> ; define a main function.
>>>>> define i32 @main() prefix i32 123 {
>>>>>   ; obtain a i32 pointer to the main function.
>>>>>   ; the prefix data is right before that pointer.
>>>>>   %main = bitcast i32 ()* @main to i32*
>>>>>
>>>>>   ; use the gep to compute the start of the prefix data.
>>>>>   %prefix_ptr = getelementptr inbounds i32, i32* %main, i32 -1
>>>>>   ; and load it.
>>>>>   %prefix_val = load i32, i32* %prefix_ptr
>>>>>
>>>>>   ; print that value.
>>>>>   %ret = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str2, i32 0, i32 0), i32 %prefix_val)
>>>>>
>>>>>   ; similarly let's do the same with the global x.
>>>>>   %1 = alloca i32, align 4
>>>>>   store i32 0, i32* %1, align 4
>>>>>   %2 = load i32, i32* @x, align 4
>>>>>   %3 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i32 0, i32 0), i32 %2)
>>>>>   ret i32 0
>>>>> }
>>>>>
>>>>> gives the following result (expected)
>>>>>
>>>>>    $ clang test.ll
>>>>>    $ ./a.out
>>>>>    p = 123
>>>>>    x = 10
>>>>>
>>>>> however, with -dead_strip on macOS, we see the following:
>>>>>
>>>>>    $ clang test.ll -dead_strip
>>>>>    $ ./a.out
>>>>>    p = 0
>>>>>    x = 10
>>>>>
>>>>> Thus I believe we are incorrectly stripping prefix data when linking
>>>>> with -dead_strip on macOS.
>>>>>
>>>>> I do not have a bugzilla account, and hence cannot post this as a
>>>>> proper bug report.
>>>>>
>>>>> Cheers,
>>>>>  Moritz
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>

