[PATCH][lld][PECOFF] Add Support for entry point symbol name

Rui Ueyama ruiu at google.com
Mon Aug 26 12:55:40 PDT 2013


I'll submit this patch, on the assumption that a follow-up patch will
implement more accurate entry point finding logic.


On Mon, Aug 26, 2013 at 11:09 AM, Rui Ueyama <ruiu at google.com> wrote:

> On Fri, Aug 23, 2013 at 2:06 PM, Jesús Serrano <dagonoweda at gmail.com> wrote:
>
>>  Thanks for the clarification; I see that this is the correct way to
>> do it. I have only one doubt. If none of the entry point symbols
>> (_main, _wmain, etc.) are defined in the input files, the resolve phase will
>> not fail, and the writer would be responsible for reporting the
>> corresponding link error. Would this be acceptable?
>>
>
> I think that is acceptable, because it's the earliest reasonable point to
> detect the error. In general, there are cases where only the writer can
> print warnings or errors. For example, if the size parameter given by /base
> is too small, we need to warn about that, which can only be done in the
> writer.
>
>
>> On 23/08/2013 21:46, Nick Kledzik wrote:
>>
>> On Aug 23, 2013, at 10:42 AM, Jesús Serrano <dagonoweda at gmail.com> wrote:
>>
>>  It's perfectly fine with me if you submit the patch just as it is; as
>> you said, it is useful.
>>
>> On the other hand, I've been thinking about how to implement the final
>> link behavior. The search for the entry point symbol must obviously be done
>> after the input files are read. I think it must also come after the resolve
>> phase, since the "_main" symbol could be defined in a shared library. One
>> possible solution would be to do the search inside a new Pass before
>> writing. What do you think?
>>
>> A pass is too heavyweight, and passes cannot look up atoms by name…
>>
>>  The way to do this is like how the Mach-O and ELF writers find _main.
>>  The PECOFF Writer should implement addFiles() to add a file to the link
>> that contains a zero-size atom with references to UndefinedAtoms for each
>> of those potential entry points (_main, _wWinMain, etc.).  These
>> UndefinedAtoms have a canBeNull() of canBeNullAtBuildTime, so that the
>> resolver will not error if they are not found.  Then, when the PECOFF Writer
>> is told to write the file, it looks back at the magic atom it added,
>> checks which of the possible entry points were found, and uses the
>> highest-priority one.
>>
>>  -Nick
>>
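
The approach described above can be modeled in a few lines of standalone C++. This is only a rough sketch, not lld code: `EntryProbeAtom`, `UndefinedRef`, `resolve` and `findEntry` are hypothetical names invented for illustration, and only the idea of nullable undefined references (canBeNullAtBuildTime) and the priority-ordered lookup in the writer come from the thread. The candidate list and its order are taken from the symbols named elsewhere in this thread.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the "can be null" setting on an undefined atom.
enum class CanBeNull { Never, AtBuildTime };

// Hypothetical model of a reference to an undefined symbol.
struct UndefinedRef {
  std::string name;
  CanBeNull canBeNull;
  bool resolved;
  UndefinedRef(std::string n, CanBeNull c)
      : name(std::move(n)), canBeNull(c), resolved(false) {}
};

// Hypothetical zero-size "magic" atom the writer adds to the link.
// Its references are stored highest priority first.
struct EntryProbeAtom {
  std::vector<UndefinedRef> refs;
};

EntryProbeAtom makeEntryProbe() {
  static const char *const names[] = {"_main", "_wmain", "_WinMain",
                                      "_wWinMain"};
  EntryProbeAtom probe;
  for (const char *name : names)
    probe.refs.push_back(UndefinedRef(name, CanBeNull::AtBuildTime));
  return probe;
}

// Toy resolver: an unresolved reference is an error only if it may not be
// null, so the probe's references never make this phase fail.
void resolve(EntryProbeAtom &probe, const std::set<std::string> &defined) {
  for (UndefinedRef &ref : probe.refs) {
    ref.resolved = defined.count(ref.name) != 0;
    if (!ref.resolved && ref.canBeNull == CanBeNull::Never)
      throw std::runtime_error("undefined symbol: " + ref.name);
  }
}

// Writer side: look back at the probe and take the first resolved candidate.
// This is where a missing entry point finally becomes a link error.
std::string findEntry(const EntryProbeAtom &probe) {
  for (const UndefinedRef &ref : probe.refs)
    if (ref.resolved)
      return ref.name;
  throw std::runtime_error("no entry point symbol found");
}
```

In this model the resolve step succeeds even when no candidate is defined, and the error surfaces only in `findEntry`, which matches the division of responsibility discussed in the replies.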
>>
>>
>>
>> On 23/08/2013 1:19, Rui Ueyama wrote:
>>
>> Thank you for the investigation. This is more complex than I thought and
>> very interesting. This behavior needs to be implemented for link.exe
>> compatibility. As you wrote, it doesn't make sense to implement this in
>> the driver; it should be in the linking context.
>>
>>  I want to submit your patch for now, as it's useful. Later you'll be
>> able to remove the code from the driver and add new code to the linking
>> context. Does this sound good?
>>
>> On Tue, Aug 20, 2013 at 1:07 PM, Jesús Serrano <dagonoweda at gmail.com>
>> wrote:
>>
>>>
>>>> I've attached a new patch with these changes. Now the Windows driver sets
>>>> the entry symbol name according to the corresponding options. However, this
>>>> won't work in all cases. Depending on whether the CRT is compiled as
>>>> multibyte or Unicode, the entry point symbol name changes from
>>>> "mainCRTStartup" to "wmainCRTStartup", and the same for the windows
>>>> subsystem. The problem with this is that the use of Unicode is unknown at
>>>> link time (or at least I don't know how to retrieve this option from object
>>>> files). One possible solution would be to have two entry point names in the
>>>> linking context, and find a symbol matching any of them, but I don't like it
>>>> so much. Any ideas on this?
>>>>
>>>
>>> I've been looking into this, and it seems that LINK.EXE's behavior for
>>> executable images is to "select" the CRT entry point depending on the
>>> symbols defined in the object files.
>>>  If neither the subsystem nor the entry point is defined in the command
>>> line arguments, the linker searches for the symbols "_main", "_wmain",
>>> "_WinMain" and "_wWinMain". Depending on the symbols found, the linker
>>> chooses the subsystem and the CRT entry point, with "_main" having the
>>> highest priority and "_wWinMain" the lowest. That is, if "_main" is defined,
>>> the subsystem will be CONSOLE and the CRT entry point "_mainCRTStartup", and
>>> so on. If several of these entry point symbols for the same subsystem are
>>> defined, the linker chooses the symbol of higher priority and prints a
>>> warning.
>>> If the subsystem is defined in the command line, then the linker
>>> searches only for the corresponding entry point symbols to choose between
>>> the Unicode and the multibyte CRT entry point.
>>> If the entry point symbol is defined in the command line, but not the
>>> subsystem, the linker forces you to choose one and reports an error.
>>>
>>> If we want to follow this behavior, it makes no sense to let the driver
>>> choose the entry point symbol name. This could be handled by the linking
>>> context after file parsing, or by the writer. What do you think about
>>> that?
>>>
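
The selection rules described above could be sketched roughly as follows. This is a standalone C++ illustration, not lld code: `chooseEntry`, `EntryChoice` and `Candidate` are hypothetical names. Only the "_main" to CONSOLE / "_mainCRTStartup" mapping is stated explicitly in the thread; the other three rows are an assumption that fills in "and so on" with the usual CRT entry point names. The case where an entry point is given on the command line without a subsystem is not modeled.

```cpp
#include <set>
#include <string>

enum class Subsystem { Console, Windows };

// Result of the search; `found` is false when no candidate symbol is defined.
struct EntryChoice {
  bool found = false;
  Subsystem subsystem = Subsystem::Console;
  std::string crtEntry;
  bool ambiguous = false; // several candidates for the chosen subsystem
};

struct Candidate {
  const char *userSymbol;
  Subsystem subsystem;
  const char *crtEntry;
};

// Highest priority first. Rows after the first are assumed, see lead-in.
static const Candidate kCandidates[] = {
    {"_main", Subsystem::Console, "_mainCRTStartup"},
    {"_wmain", Subsystem::Console, "_wmainCRTStartup"},
    {"_WinMain", Subsystem::Windows, "_WinMainCRTStartup"},
    {"_wWinMain", Subsystem::Windows, "_wWinMainCRTStartup"},
};

// `defined` holds the symbols defined by the inputs. `forced` is the
// subsystem from the command line, or nullptr if none was given.
EntryChoice chooseEntry(const std::set<std::string> &defined,
                        const Subsystem *forced = nullptr) {
  EntryChoice choice;
  for (const Candidate &c : kCandidates) {
    if (forced && c.subsystem != *forced)
      continue; // only search candidates of the requested subsystem
    if (!defined.count(c.userSymbol))
      continue;
    if (!choice.found) {
      choice.found = true;
      choice.subsystem = c.subsystem;
      choice.crtEntry = c.crtEntry;
    } else if (c.subsystem == choice.subsystem) {
      choice.ambiguous = true; // a caller would print a warning here
    }
  }
  return choice;
}
```

A caller would treat `found == false` as a link error and `ambiguous == true` as a warning, in line with the behavior described for LINK.EXE above.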
>>
>>
>