<br /><br /><span>On 05/30/16 01:34 PM, <b class="name">Rafael Espíndola </b> <rafael.espindola@gmail.com> wrote:</span><blockquote cite="mid:CAG3jReLwT1R24JNhPBmsN_Rk4Mvoh056sDT9uJ0cNx6C6zWbvw@mail.gmail.com" class="iwcQuote" style="border-left: 1px solid #00F; padding-left: 13px; margin-left: 0;" type="cite"><div class="mimetype-text-plain">We don't use cl::opt in gold; instead we parse the -plugin-opts that<br />gold passes to the plugin (see process_plugin_option).<br /></div></blockquote><div>What about this:</div><div><br /></div><div><p style="margin-top: 0px; margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Menlo;"><span style="font-variant-ligatures: no-common-ligatures">$ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp</span></p>
<p style="margin-top: 0px; margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Menlo;"><span style="font-variant-ligatures: no-common-ligatures"> // </span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720"><b>ParseCommandLineOptions</b></span><span style="font-variant-ligatures: no-common-ligatures">() expects argv[0] to be program name. Lazily</span></p>
<p style="margin-top: 0px; margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Menlo;"><span style="font-variant-ligatures: no-common-ligatures"> cl::</span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720"><b>ParseCommandLineOptions</b></span><span style="font-variant-ligatures: no-common-ligatures">(NumOpts, &options::extra[0]);</span></p></div><div><span style="font-variant-ligatures: no-common-ligatures"><br /></span></div><div><br /></div><div>-- </div><div>Mehdi</div><blockquote cite="mid:CAG3jReLwT1R24JNhPBmsN_Rk4Mvoh056sDT9uJ0cNx6C6zWbvw@mail.gmail.com" class="iwcQuote" style="border-left: 1px solid #00F; padding-left: 13px; margin-left: 0;" type="cite"><div class="mimetype-text-plain"><br />On 30 May 2016 at 02:13, Mehdi Amini <mehdi.amini@apple.com> wrote:<br />><br />> On May 29, 2016, at 5:44 PM, Shi, Steven <steven.shi@intel.com> wrote:<br />><br />> (And I doubt the GNU linker supports LTO with LLVM).<br />> [Steven]: I’ve pushed GNU Binutils ld to support the LLVM gold plugin; see<br />> the details in this bug: <a href="https://sourceware.org/bugzilla/show_bug.cgi?id=20070" target="l">https://sourceware.org/bugzilla/show_bug.cgi?id=20070</a>.<br />> The new GNU ld linker works well with LLVM/Clang LTO when building IA32 code on<br />> my side. According to the ld owner's comments on the bug, the current X64<br />> LLVM LTO issue is in the LLVM LTO plugin.<br />><br />><br />> The fact that we don't support it for now seems to indicate that it is not a<br />> widely requested feature, especially considering that it is really a trivial<br />> option to add.<br />> What is the linker you're using? Are you building your own clang?<br />> [Steven]: I’m using the standard LLVM 3.8 with the new GNU ld linker above.<br />> I can build my own clang on my side if needed. I’m happy to know it is not<br />> difficult to enable the large code model in LLVM LTO and “it is really a<br />> trivial option to add”. 
Could you let me know how to enable it? Much of my<br />> work has been blocked by the large code model issue. Thank you!<br />><br />><br />><br />> I can't test it locally, but here is a starting point in the gold plugin,<br />> inspired by the code present in clang:<br />><br />><br />><br />> You need to use your linker-specific way of passing the option<br />> "-lto-use-large-codemodel=..." to the plugin.<br />><br />> Let me know if it works for you!<br />><br />> --<br />> Mehdi<br />><br />><br />><br />><br />> Steven Shi<br />> Intel\SSG\STO\UEFI Firmware<br />><br />> Tel: +86 021-61166522<br />> iNet: 821-6522<br />><br />> From: mehdi.amini@apple.com [<a href="mailto:mehdi.amini@apple.com">mailto:mehdi.amini@apple.com</a>]<br />> Sent: Monday, May 30, 2016 8:17 AM<br />> To: Shi, Steven <steven.shi@intel.com><br />> Cc: Umesh Kalappa <umesh.kalappa0@gmail.com>; eliben@gmail.com; llvm-dev<br />> <llvm-dev@lists.llvm.org>; cfe-dev@lists.llvm.org; Rafael Espíndola<br />> <rafael.espindola@gmail.com><br />> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?<br />><br />><br />><br />> On May 29, 2016, at 5:10 PM, Shi, Steven <steven.shi@intel.com> wrote:<br />><br />> Hi Mehdi,<br />> GCC LTO seems to support the large code model on my side, as shown below. If the code<br />> model is linker specific, does GCC LTO use a special linker that is<br />> different from the one in GNU Binutils?<br />><br />><br />> I don't know anything about GCC.<br />> (And I doubt the GNU linker supports LTO with LLVM).<br />><br />><br />> I’m a bit surprised that neither OS X ld64 nor the gold plugin supports the large<br />> code model in LTO. 
Since modern systems widely use 64-bit, needing code<br />> to run at high addresses (above 2 GB) is a reasonable requirement.<br />><br />><br />> The fact that we don't support it for now seems to indicate that it is not a<br />> widely requested feature, especially considering that it is really a trivial<br />> option to add.<br />> What is the linker you're using? Are you building your own clang?<br />><br />> --<br />> Mehdi<br />><br />><br />><br />><br />><br />> $ gcc -g -O0 -flto codemodel1.c -mcmodel=large -o<br />> codemodel1_large_lto_gcc.bin<br />> $ objdump -dS codemodel1_large_lto_gcc.bin<br />><br />> int main(int argc, const char* argv[])<br />> {<br />> 40048b: 55 push %rbp<br />> 40048c: 48 89 e5 mov %rsp,%rbp<br />> 40048f: 48 83 ec 20 sub $0x20,%rsp<br />> 400493: 89 7d ec mov %edi,-0x14(%rbp)<br />> 400496: 48 89 75 e0 mov %rsi,-0x20(%rbp)<br />> int t = global_func(argc);<br />> 40049a: 8b 45 ec mov -0x14(%rbp),%eax<br />> 40049d: 89 c7 mov %eax,%edi<br />> 40049f: 48 b8 76 04 40 00 00 movabs $0x400476,%rax<br />> 4004a6: 00 00 00<br />> 4004a9: ff d0 callq *%rax<br />> 4004ab: 89 45 fc mov %eax,-0x4(%rbp)<br />> t += global_arr[7];<br />> 4004ae: 48 b8 20 09 60 00 00 movabs $0x600920,%rax<br />> 4004b5: 00 00 00<br />> 4004b8: 8b 40 1c mov 0x1c(%rax),%eax<br />> 4004bb: 01 45 fc add %eax,-0x4(%rbp)<br />> t += static_arr[7];<br />> 4004be: 48 b8 c0 0a 60 00 00 movabs $0x600ac0,%rax<br />> 4004c5: 00 00 00<br />> 4004c8: 8b 40 1c mov 0x1c(%rax),%eax<br />> 4004cb: 01 45 fc add %eax,-0x4(%rbp)<br />> t += global_arr_big[7];<br />> 4004ce: 48 b8 60 0c 60 00 00 movabs $0x600c60,%rax<br />> 4004d5: 00 00 00<br />> 4004d8: 8b 40 1c mov 0x1c(%rax),%eax<br />> 4004db: 01 45 fc add %eax,-0x4(%rbp)<br />> t += static_arr_big[7];<br />> 4004de: 48 b8 a0 19 63 00 00 movabs $0x6319a0,%rax<br />> 4004e5: 00 00 00<br />> 4004e8: 8b 40 1c mov 0x1c(%rax),%eax<br />> 4004eb: 01 45 fc add %eax,-0x4(%rbp)<br />> return t;<br />> 4004ee: 8b 45 fc mov 
-0x4(%rbp),%eax<br />> }<br />><br />> Steven Shi<br />> Intel\SSG\STO\UEFI Firmware<br />><br />> Tel: +86 021-61166522<br />> iNet: 821-6522<br />><br />> From: mehdi.amini@apple.com [<a href="mailto:mehdi.amini@apple.com">mailto:mehdi.amini@apple.com</a>]<br />> Sent: Monday, May 30, 2016 4:28 AM<br />> To: Shi, Steven <steven.shi@intel.com><br />> Cc: Umesh Kalappa <umesh.kalappa0@gmail.com>; eliben@gmail.com; llvm-dev<br />> <llvm-dev@lists.llvm.org>; cfe-dev@lists.llvm.org; Rafael Espíndola<br />> <rafael.espindola@gmail.com><br />> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?<br />><br />> Hi,<br />><br />><br />><br />> On May 29, 2016, at 7:36 AM, Shi, Steven <steven.shi@intel.com> wrote:<br />><br />> Hi Mehdi,<br />> After deeper debugging, I found that my firmware LTO wrong-code issue occurs because the<br />> X64 code model (-mcmodel=large) is always overridden to small<br />> (-mcmodel=small) in an LTO build. I don't know how to correctly specify<br />> the large code model for my X64 firmware LTO build. I would appreciate it if you could<br />> let me know.<br />><br />> Parts of my UEFI firmware (BIOS) have to be loaded and run at<br />> high addresses (above 2 GB) from the very beginning, and I need code that<br />> makes absolutely no assumptions about the addresses and data sections. 
But<br />> current LLVM LTO seems to stick to the small code model and generates a lot of<br />> code with 32-bit RIP-relative addressing, which causes CPU exceptions when<br />> run at addresses above 2 GB.<br />><br />> Below, I simply reuse Eli's codemodel1.c example (link:<br />> <a href="http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models" target="l">http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models</a>)<br />> to show the LLVM LTO code model issue.<br />> $ clang -g -O0 codemodel1.c -mcmodel=large -o codemodel1_large.bin<br />> $ clang -g -O0 codemodel1.c -mcmodel=small -o codemodel1_small.bin<br />> $ clang -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto.bin<br />> $ clang -g -O0 -flto codemodel1.c -mcmodel=small -o codemodel1_small_lto.bin<br />><br />> You will see that codemodel1_large_lto.bin and codemodel1_small_lto.bin are<br />> exactly the same!<br />> And if you disassemble codemodel1_large_lto.bin, you will see it uses<br />> the small code model (32-bit RIP-relative), not large, to do addressing, as shown<br />> below.<br />><br />> $ objdump -dS codemodel1_large_lto.bin<br />><br />> int main(int argc, const char* argv[])<br />> {<br />> 4004f0: 55 push %rbp<br />> 4004f1: 48 89 e5 mov %rsp,%rbp<br />> 4004f4: 48 83 ec 20 sub $0x20,%rsp<br />> 4004f8: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)<br />> 4004ff: 89 7d f8 mov %edi,-0x8(%rbp)<br />> 400502: 48 89 75 f0 mov %rsi,-0x10(%rbp)<br />> int t = global_func(argc);<br />> 400506: 8b 7d f8 mov -0x8(%rbp),%edi<br />> 400509: e8 d2 ff ff ff callq 4004e0 <global_func><br />> 40050e: 89 45 ec mov %eax,-0x14(%rbp)<br />> t += global_arr[7];<br />> 400511: 8b 04 25 4c 10 60 00 mov 0x60104c,%eax<br />> 400518: 03 45 ec add -0x14(%rbp),%eax<br />> 40051b: 89 45 ec mov %eax,-0x14(%rbp)<br />> t += static_arr[7];<br />> 40051e: 8b 04 25 dc 11 60 00 mov 0x6011dc,%eax<br />> 400525: 03 45 ec add -0x14(%rbp),%eax<br />> 400528: 89 45 ec mov 
%eax,-0x14(%rbp)<br />> t += global_arr_big[7];<br />> 40052b: 8b 04 25 6c 13 60 00 mov 0x60136c,%eax<br />> 400532: 03 45 ec add -0x14(%rbp),%eax<br />> 400535: 89 45 ec mov %eax,-0x14(%rbp)<br />> t += static_arr_big[7];<br />> 400538: 8b 04 25 ac 20 63 00 mov 0x6320ac,%eax<br />> 40053f: 03 45 ec add -0x14(%rbp),%eax<br />> 400542: 89 45 ec mov %eax,-0x14(%rbp)<br />> return t;<br />> 400545: 8b 45 ec mov -0x14(%rbp),%eax<br />> 400548: 48 83 c4 20 add $0x20,%rsp<br />> 40054c: 5d pop %rbp<br />> 40054d: c3 retq<br />> 40054e: 66 90 xchg %ax,%ax<br />><br />><br />> So, does LTO support the large code model? How do I correctly specify the LTO code<br />> model option?<br />><br />><br />> Same answer as before: LTO is set up by the linker, so the option for that,<br />> if it exists, will be linker specific.<br />><br />> As far as I can tell, neither libLTO-based linkers (ld64 on OS X, for<br />> example) nor the gold plugin support such an option, and the code model<br />> is always "default".<br />><br />> I don't know about lld, CC Rafael about that.<br />><br />> --<br />> Mehdi<br />><br />><br />><br />><br />><br />><br />><br />><br />> Steven Shi<br />> Intel\SSG\STO\UEFI Firmware<br />><br />> Tel: +86 021-61166522<br />> iNet: 821-6522<br />><br />>> -----Original Message-----<br />>> From: mehdi.amini@apple.com [<a href="mailto:mehdi.amini@apple.com">mailto:mehdi.amini@apple.com</a>]<br />>> Sent: Wednesday, May 18, 2016 4:02 AM<br />>> To: Umesh Kalappa <umesh.kalappa0@gmail.com><br />>> Cc: Shi, Steven <steven.shi@intel.com>; llvm-dev<br />>> <llvm-dev@lists.llvm.org>;<br />>> cfe-dev@lists.llvm.org<br />>> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?<br />>><br />>><br />>> > On May 17, 2016, at 11:21 AM, Umesh Kalappa<br />>> <umesh.kalappa0@gmail.com> wrote:<br />>> ><br />>> > Steven,<br />>> ><br />>> > As Mehdi stated, the optimisation level is specific to the linker and it<br />>> > enables Inter-Pro opts passes 
,please refer function<br />>><br />>> To be very clear: the -O option may trigger *linker* optimizations as<br />>> well,<br />>> independently of LTO.<br />>><br />>> --<br />>> Mehdi<br />>><br />>><br />>><br />><br />><br />><br /></div></blockquote>
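<div>For readers comparing the two objdump listings quoted in this thread, here is a minimal sketch (in Python, not part of the original messages) that decodes the address fields from three of the quoted instructions. The byte strings are copied from the listings; the helper name le_int is purely illustrative. It shows that the GCC LTO build carries a full 64-bit immediate in its movabs, while the clang LTO build encodes only 32-bit fields in its call and load, which is why that code cannot reach targets placed far above 2 GB.</div>

```python
def le_int(raw, signed):
    """Decode a little-endian integer field from raw instruction bytes.

    Hypothetical helper for this illustration only.
    """
    return int.from_bytes(raw, "little", signed=signed)


# Instruction bytes copied from the objdump output quoted in the thread.
MOVABS = bytes.fromhex("48b87604400000000000")  # gcc LTO:   movabs $0x400476,%rax
CALL = bytes.fromhex("e8d2ffffff")              # clang LTO: callq 4004e0 (at 0x400509)
LOAD = bytes.fromhex("8b04254c106000")          # clang LTO: mov 0x60104c,%eax

# movabs: two prefix/opcode bytes, then a full 64-bit immediate address.
movabs_target = le_int(MOVABS[2:], signed=False)

# call: one opcode byte, then a signed 32-bit offset that is added to the
# address of the following instruction.
call_addr = 0x400509
call_offset = le_int(CALL[1:], signed=True)
call_target = call_addr + len(CALL) + call_offset

# mov: opcode, ModRM and SIB bytes, then a signed 32-bit displacement.
load_addr = le_int(LOAD[3:], signed=True)

print(hex(movabs_target), call_offset, hex(call_target), hex(load_addr))
# prints: 0x400476 -46 0x4004e0 0x60104c
```

<div>The decoded values match the operands objdump printed (0x400476, 4004e0 and 0x60104c), and the address field is 8 bytes wide in the movabs but only 4 bytes wide in the call and the load.</div>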