[PATCH][lld][PECOFF] Add Support for entry point symbol name

Rui Ueyama ruiu at google.com
Mon Aug 26 11:09:08 PDT 2013


On Fri, Aug 23, 2013 at 2:06 PM, Jesús Serrano <dagonoweda at gmail.com> wrote:

>  Thanks for the clarification. I've seen that this is the correct way to
> do this. I have only one doubt about it. If none of the entry point symbols
> (_main, _wmain, etc.) are defined in the input files, the resolve phase will
> not fail, and the writer would be responsible for reporting the
> corresponding link error. Would this be acceptable?
>

I think that is acceptable, because it's the earliest reasonable point to
detect the error. In general, there are cases where only the writer can
report warnings or errors. For example, if the size parameter given by /base
is too small, we need to warn about that, which can only be done in the writer.
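
As a rough illustration of the writer-side check, here is a minimal sketch
of the selection rule described in the quoted messages below. It is not lld
code: Subsystem, EntryCandidate, EntrySelection and selectEntryPoint are
hypothetical names, and the `resolved` set simply stands in for whichever
optional entry point symbols the resolver managed to find. The CRT entry
names other than "_mainCRTStartup" are assumed to follow the same pattern.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not lld classes.
enum class Subsystem { Unknown, Console, Windows };

struct EntryCandidate {
  const char *userSymbol; // symbol looked for in the input files
  const char *crtEntry;   // CRT startup routine used if it is found
  Subsystem subsystem;    // subsystem implied by the symbol
};

// Highest priority first, in the order given in the thread.
// Only the "_main" row is spelled out there; the rest are assumed.
static const EntryCandidate kCandidates[] = {
    {"_main", "_mainCRTStartup", Subsystem::Console},
    {"_wmain", "_wmainCRTStartup", Subsystem::Console},
    {"_WinMain", "_WinMainCRTStartup", Subsystem::Windows},
    {"_wWinMain", "_wWinMainCRTStartup", Subsystem::Windows},
};

struct EntrySelection {
  bool found = false; // false means the writer should report a link error
  std::string crtEntry;
  Subsystem subsystem = Subsystem::Unknown;
  std::vector<std::string> warnings;
};

// `resolved` holds the optional entry point symbols that were defined.
// `requested` is the subsystem from the command line, if any.
EntrySelection selectEntryPoint(const std::set<std::string> &resolved,
                                Subsystem requested = Subsystem::Unknown) {
  EntrySelection result;
  for (const EntryCandidate &c : kCandidates) {
    // With an explicit subsystem, only its own entry points are considered.
    if (requested != Subsystem::Unknown && c.subsystem != requested)
      continue;
    if (resolved.count(c.userSymbol) == 0)
      continue;
    if (!result.found) {
      // The first hit is the highest-priority symbol that was defined.
      result.found = true;
      result.crtEntry = c.crtEntry;
      result.subsystem = c.subsystem;
    } else if (c.subsystem == result.subsystem) {
      // A second entry point for the same subsystem only earns a warning.
      result.warnings.push_back(
          std::string("ignoring lower-priority entry point ") + c.userSymbol);
    }
  }
  return result;
}
```

If `found` is still false after the loop, none of the candidates were
defined, and that is the case where the writer would have to emit the error.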


> On 23/08/2013 21:46, Nick Kledzik wrote:
>
> On Aug 23, 2013, at 10:42 AM, Jesús Serrano <dagonoweda at gmail.com> wrote:
>
>  It's perfectly OK with me that you submit the patch just as it is; as you
> said, it is useful.
>
> On the other hand, I've been thinking about how to implement the final link
> behavior. The search for the entry point symbol must obviously be done after
> the input files are read. I think it must also come after the resolve phase,
> since the "_main" symbol could be defined in a shared library. One possible
> solution would be to search inside a new Pass before the writing. What do
> you think?
>
> A pass is too heavyweight, and passes cannot look up atoms by name…
>
>  The way to do this is like how the mach-o and ELF writers find _main.
>  The PECOFF Writer should implement addFiles() to add a file to the link
> that has a zero-size atom with references to UndefinedAtoms for each
> of those potential entry points (_main, _wWinMain, etc).  These
> UndefinedAtoms have a canBeNull() of canBeNullAtBuildTime, so that the
> resolver will not error if they are not found.  Then when the PECOFF Writer
> is told to write the file, it looks back at the magic atom it added,
> checks which of the possible entry points were found, and uses the
> highest priority one.
>
>  -Nick
>
>
>
>
> On 23/08/2013 1:19, Rui Ueyama wrote:
>
> Thank you for the investigation. This is more complex than I thought and
> very interesting. This behavior needs to be implemented for link.exe
> compatibility. As you wrote, it doesn't make sense to implement this in the
> driver; it should be in the linking context.
>
>  I want to submit your patch for now, as it's useful. You'll be able to
> remove the code from the driver and add new code to the linking context.
> Does this sound good?
>
> On Tue, Aug 20, 2013 at 1:07 PM, Jesús Serrano <dagonoweda at gmail.com>
> wrote:
>
>>
>> I've attached a new patch with these changes. Now the windows driver sets
>>> the entry symbol name according to the corresponding options. However, this
>>> won't work in all cases. Depending on whether the CRT is compiled as
>>> multibyte or unicode, the entry point symbol name changes from
>>> "mainCRTStartup" to "wmainCRTStartup", and the same for the windows
>>> subsystem. The problem with this is that the use of unicode is unknown at
>>> link time (or at least I don't know how to retrieve this option from object
>>> files). One possible solution would be to have two entry point names in the
>>> linking context, and find a symbol matching any of them, but I don't like it
>>> so much. Any ideas on this?
>>>
>>
>> I've been looking into this, and it seems that the LINK.EXE behavior for
>> executable images is to "select" the CRT entry point depending on the
>> symbols defined in the object files.
>>  If neither the subsystem nor the entry point is defined in the command
>> line arguments, the linker searches for the symbols "_main", "_wmain",
>> "_WinMain" and "_wWinMain". Depending on the symbols found, the linker
>> chooses the subsystem and the CRT entry point, with "_main" having the
>> highest priority and "_wWinMain" the lowest. That is, if "_main" is defined,
>> the subsystem will be CONSOLE and the CRT entry point "_mainCRTStartup", and
>> so on. If several of these entry point symbols for the same subsystem are
>> defined, the linker chooses the symbol of higher priority and prints a
>> warning.
>> If the subsystem is defined in the command line, then the linker searches
>> only for the corresponding entry point symbols to choose between the
>> unicode and the multibyte CRT entry point.
>> If the entry point symbol is defined in the command line, but not the
>> subsystem, the linker forces the user to choose one and throws an error.
>>
>> If we want to follow this behavior, it makes no sense to let the driver
>> choose the entry point symbol name. This could be handled by the linking
>> context after the file parsing or by the writer. What do you think about
>> that?
>>
>
>