[llvm-dev] [cfe-dev] How to debug if LTO generate wrong code?

Mon May 30 13:56:13 PDT 2016

On 05/30/16 01:34 PM, Rafael Espíndola  <rafael.espindola at gmail.com> wrote: 
> 
> We don't use cl::opt in gold, instead we parse the -plugin-opts that
> gold passes the plugin (see process_plugin_option).
> 

What about this:

$ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp
  // ParseCommandLineOptions() expects argv[0] to be program name. Lazily
  cl::ParseCommandLineOptions(NumOpts, &options::extra[0]);

-- 
Mehdi

> 
> 
> On 30 May 2016 at 02:13, Mehdi Amini <mehdi.amini at apple.com> wrote:
> >
> > On May 29, 2016, at 5:44 PM, Shi, Steven <steven.shi at intel.com> wrote:
> >
> > (And I doubt the GNU linker supports LTO with LLVM).
> > [Steven]: I’ve pushed for GNU Binutils ld to support the LLVM gold plugin;
> > see the details in this bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20070.
> > The new GNU ld linker works well with LLVM/Clang LTO when building IA32
> > code on my side. And according to the ld owner's comments on the bug, the
> > current X64 LLVM LTO issue is in the LLVM LTO plugin.
> >
> >
> > The fact that we don't support it for now seems to indicate that it is not a
> > widely requested feature, especially considering that it is really a trivial
> > option to add.
> > What is the linker you're using? Are you building your own clang?
> > [Steven]: I’m using the standard LLVM 3.8 with the new GNU ld linker
> > mentioned above. I can build my own clang on my side if needed. I’m happy
> > to know that it is not difficult to enable the large code model in LLVM LTO
> > and that “it is really a trivial option to add”. Could you let me know how
> > to enable it? A lot of my work has been blocked by the large code model
> > issue. Thank you!
> >
> >
> >
> > I can't test it locally, but here is a starting point in the gold plugin,
> > inspired by the code present in clang:
> >
> >
> >
> > You need to use your linker-specific way of passing the option
> > "-lto-use-large-codemodel=..." to the plugin.
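Whatever the plugin-side option ends up looking like, its value has to be translated into a code model before code generation. A minimal C sketch of that translation, assuming the value is one of the names that -mcmodel accepts on x86-64 (small, kernel, medium, large); the enum and function are hypothetical and are not taken from gold-plugin.cpp or from the patch mentioned above:

```c
#include <string.h>

/* Hypothetical enum; the names mirror the x86-64 -mcmodel values. */
enum code_model { CM_SMALL, CM_KERNEL, CM_MEDIUM, CM_LARGE };

/* Maps a code model name to the enum. Returns 1 on success, or 0 if the
   name is not recognized (in which case *out is left untouched). */
int parse_code_model(const char *name, enum code_model *out)
{
    static const struct {
        const char *name;
        enum code_model cm;
    } table[] = {
        { "small",  CM_SMALL  },
        { "kernel", CM_KERNEL },
        { "medium", CM_MEDIUM },
        { "large",  CM_LARGE  },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(name, table[i].name) == 0) {
            *out = table[i].cm;
            return 1;
        }
    }
    return 0;
}
```

Rejecting unknown names explicitly, rather than silently falling back to a default, would make a mistyped option visible at link time.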
> >
> > Let me know if it works for you!
> >
> > --
> > Mehdi
> >
> >
> >
> >
> > Steven Shi
> > Intel\SSG\STO\UEFI Firmware
> >
> > Tel: +86 021-61166522
> > iNet: 821-6522
> >
> > From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> > Sent: Monday, May 30, 2016 8:17 AM
> > To: Shi, Steven <steven.shi at intel.com>
> > Cc: Umesh Kalappa <umesh.kalappa0 at gmail.com>; eliben at gmail.com; llvm-dev
> > <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org; Rafael Espíndola
> > <rafael.espindola at gmail.com>
> > Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
> >
> >
> >
> > On May 29, 2016, at 5:10 PM, Shi, Steven <steven.shi at intel.com> wrote:
> >
> > Hi Mehdi,
> > GCC LTO seems to support the large code model on my side, as shown below.
> > If the code model is linker-specific, does GCC LTO use a special linker
> > that is different from the one in GNU Binutils?
> >
> >
> > I don't know anything about GCC.
> > (And I doubt the GNU linker supports LTO with LLVM).
> >
> >
> > I’m a bit surprised that neither OS X ld64 nor the gold plugin supports the
> > large code model in LTO. Since modern systems widely use 64-bit, code that
> > needs to run at high addresses (above 2 GB) is a reasonable requirement.
> >
> >
> > The fact that we don't support it for now seems to indicate that it is not a
> > widely requested feature, especially considering that it is really a trivial
> > option to add.
> > What is the linker you're using? Are you building your own clang?
> >
> > --
> > Mehdi
> >
> >
> >
> >
> >
> > $ gcc -g -O0 -flto codemodel1.c -mcmodel=large -o
> > codemodel1_large_lto_gcc.bin
> > $ objdump -dS codemodel1_large_lto_gcc.bin
> >
> > int main(int argc, const char* argv[])
> > {
> > 40048b: 55 push %rbp
> > 40048c: 48 89 e5 mov %rsp,%rbp
> > 40048f: 48 83 ec 20 sub $0x20,%rsp
> > 400493: 89 7d ec mov %edi,-0x14(%rbp)
> > 400496: 48 89 75 e0 mov %rsi,-0x20(%rbp)
> > int t = global_func(argc);
> > 40049a: 8b 45 ec mov -0x14(%rbp),%eax
> > 40049d: 89 c7 mov %eax,%edi
> > 40049f: 48 b8 76 04 40 00 00 movabs $0x400476,%rax
> > 4004a6: 00 00 00
> > 4004a9: ff d0 callq *%rax
> > 4004ab: 89 45 fc mov %eax,-0x4(%rbp)
> > t += global_arr[7];
> > 4004ae: 48 b8 20 09 60 00 00 movabs $0x600920,%rax
> > 4004b5: 00 00 00
> > 4004b8: 8b 40 1c mov 0x1c(%rax),%eax
> > 4004bb: 01 45 fc add %eax,-0x4(%rbp)
> > t += static_arr[7];
> > 4004be: 48 b8 c0 0a 60 00 00 movabs $0x600ac0,%rax
> > 4004c5: 00 00 00
> > 4004c8: 8b 40 1c mov 0x1c(%rax),%eax
> > 4004cb: 01 45 fc add %eax,-0x4(%rbp)
> > t += global_arr_big[7];
> > 4004ce: 48 b8 60 0c 60 00 00 movabs $0x600c60,%rax
> > 4004d5: 00 00 00
> > 4004d8: 8b 40 1c mov 0x1c(%rax),%eax
> > 4004db: 01 45 fc add %eax,-0x4(%rbp)
> > t += static_arr_big[7];
> > 4004de: 48 b8 a0 19 63 00 00 movabs $0x6319a0,%rax
> > 4004e5: 00 00 00
> > 4004e8: 8b 40 1c mov 0x1c(%rax),%eax
> > 4004eb: 01 45 fc add %eax,-0x4(%rbp)
> > return t;
> > 4004ee: 8b 45 fc mov -0x4(%rbp),%eax
> > }
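In the GCC listing above, each address is materialized with movabs, whose 48 b8 encoding carries a full 64-bit little-endian immediate, and the call or load then goes through %rax. A short C sketch (function names are hypothetical) decodes that immediate from the bytes that objdump prints split over two lines:

```c
#include <stdint.h>

/* Read a little-endian unsigned 64-bit value from eight bytes. */
uint64_t read_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Immediate of a 10-byte `movabs $imm64, %rax` (bytes: 48 b8 imm64). */
uint64_t movabs_rax_imm64(const unsigned char *insn)
{
    return read_le64(insn + 2);
}
```

Applied to the bytes 48 b8 76 04 40 00 00 00 00 00 at 40049f, this yields 0x400476, the value shown in the movabs operand. Because the immediate is 64 bits wide, the same instruction form can hold an address anywhere in the address space.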
> >
> > Steven Shi
> > Intel\SSG\STO\UEFI Firmware
> >
> > Tel: +86 021-61166522
> > iNet: 821-6522
> >
> > From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> > Sent: Monday, May 30, 2016 4:28 AM
> > To: Shi, Steven <steven.shi at intel.com>
> > Cc: Umesh Kalappa <umesh.kalappa0 at gmail.com>; eliben at gmail.com; llvm-dev
> > <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org; Rafael Espíndola
> > <rafael.espindola at gmail.com>
> > Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
> >
> > Hi,
> >
> >
> >
> > On May 29, 2016, at 7:36 AM, Shi, Steven <steven.shi at intel.com> wrote:
> >
> > Hi Mehdi,
> > After deeper debugging, I found that my firmware LTO wrong-code issue is
> > caused by the X64 code model (-mcmodel=large) always being overridden to
> > small (-mcmodel=small) in an LTO build. I don't know how to correctly
> > specify the large code model for my X64 firmware LTO build. I would
> > appreciate it if you could let me know how.
> >
> > Parts of my UEFI firmware (BIOS) have to be loaded and run at high
> > addresses (above 2 GB) from the very beginning, and I need code that makes
> > absolutely no assumptions about the addresses of the code and data
> > sections. But the current LLVM LTO seems to stick to the small code model
> > and generates a lot of code with 32-bit RIP-relative addressing, which
> > causes CPU exceptions when running at addresses above 2 GB.
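To make the 2 GB limit concrete: a signed 32-bit field can only hold values in the range [-2^31, 2^31 - 1], so an address or displacement outside that range cannot be encoded in an instruction that uses such a field. A minimal C sketch of that check follows; the function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if v can be stored in a sign-extended 32-bit field. */
int fits_signed_32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Returns 1 if `target` is reachable from `next_insn` (the address of the
   instruction following the branch) with a signed 32-bit displacement. */
int rel32_reachable(uint64_t next_insn, uint64_t target)
{
    int64_t delta = (int64_t)(target - next_insn);
    return fits_signed_32(delta);
}
```

For example, fits_signed_32(0x60104c) holds, while any address at or above 0x80000000 (2 GB) does not fit.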
> >
> > Below, I simply reuse Eli's codemodel1.c example (link:
> > http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models)
> > to show the LLVM LTO code model issue.
> > $ clang -g -O0 codemodel1.c -mcmodel=large -o codemodel1_large.bin
> > $ clang -g -O0 codemodel1.c -mcmodel=small -o codemodel1_small.bin
> > $ clang -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto.bin
> > $ clang -g -O0 -flto codemodel1.c -mcmodel=small -o codemodel1_small_lto.bin
> >
> > You will see that codemodel1_large_lto.bin and codemodel1_small_lto.bin are
> > exactly the same!
> > And if you disassemble codemodel1_large_lto.bin, you will see that it uses
> > the small code model (32-bit RIP-relative), not the large one, for
> > addressing, as shown below.
> >
> > $ objdump -dS codemodel1_large_lto.bin
> >
> > int main(int argc, const char* argv[])
> > {
> > 4004f0: 55 push %rbp
> > 4004f1: 48 89 e5 mov %rsp,%rbp
> > 4004f4: 48 83 ec 20 sub $0x20,%rsp
> > 4004f8: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
> > 4004ff: 89 7d f8 mov %edi,-0x8(%rbp)
> > 400502: 48 89 75 f0 mov %rsi,-0x10(%rbp)
> > int t = global_func(argc);
> > 400506: 8b 7d f8 mov -0x8(%rbp),%edi
> > 400509: e8 d2 ff ff ff callq 4004e0 <global_func>
> > 40050e: 89 45 ec mov %eax,-0x14(%rbp)
> > t += global_arr[7];
> > 400511: 8b 04 25 4c 10 60 00 mov 0x60104c,%eax
> > 400518: 03 45 ec add -0x14(%rbp),%eax
> > 40051b: 89 45 ec mov %eax,-0x14(%rbp)
> > t += static_arr[7];
> > 40051e: 8b 04 25 dc 11 60 00 mov 0x6011dc,%eax
> > 400525: 03 45 ec add -0x14(%rbp),%eax
> > 400528: 89 45 ec mov %eax,-0x14(%rbp)
> > t += global_arr_big[7];
> > 40052b: 8b 04 25 6c 13 60 00 mov 0x60136c,%eax
> > 400532: 03 45 ec add -0x14(%rbp),%eax
> > 400535: 89 45 ec mov %eax,-0x14(%rbp)
> > t += static_arr_big[7];
> > 400538: 8b 04 25 ac 20 63 00 mov 0x6320ac,%eax
> > 40053f: 03 45 ec add -0x14(%rbp),%eax
> > 400542: 89 45 ec mov %eax,-0x14(%rbp)
> > return t;
> > 400545: 8b 45 ec mov -0x14(%rbp),%eax
> > 400548: 48 83 c4 20 add $0x20,%rsp
> > 40054c: 5d pop %rbp
> > 40054d: c3 retq
> > 40054e: 66 90 xchg %ax,%ax
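The callq encoding in the listing above shows the 32-bit limit directly: the e8 opcode is followed by a signed 32-bit little-endian displacement, which is added to the address of the next instruction. A small C sketch of that arithmetic (function names are hypothetical) reproduces the target that objdump prints:

```c
#include <stdint.h>

/* Read a little-endian signed 32-bit value from four bytes. */
int32_t read_le32(const unsigned char *p)
{
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)u;
}

/* Target of a 5-byte `call rel32` (opcode 0xE8) located at insn_addr:
   address of the next instruction plus the sign-extended displacement. */
uint64_t call_rel32_target(uint64_t insn_addr, const unsigned char *insn)
{
    return insn_addr + 5 + (uint64_t)(int64_t)read_le32(insn + 1);
}
```

For the bytes e8 d2 ff ff ff at 400509, the displacement is -0x2e and the next instruction is at 40050e, giving 4004e0, which matches the callq operand in the listing.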
> >
> >
> > So, does LTO support the large code model? How do I correctly specify the
> > LTO code model option?
> >
> >
> > Same answer as before: LTO is set up by the linker, so the option for that,
> > if it exists, will be linker-specific.
> >
> > As far as I can tell, neither libLTO-based linkers (ld64 on OS X, for
> > example) nor the gold plugin support such an option, and the code model is
> > always "default".
> >
> > I don't know about lld, CC Rafael about that.
> >
> > --
> > Mehdi
> >
> >
> >
> >
> >
> >
> >
> >
> > Steven Shi
> > Intel\SSG\STO\UEFI Firmware
> >
> > Tel: +86 021-61166522
> > iNet: 821-6522
> >
> >> -----Original Message-----
> >> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> >> Sent: Wednesday, May 18, 2016 4:02 AM
> >> To: Umesh Kalappa <umesh.kalappa0 at gmail.com>
> >> Cc: Shi, Steven <steven.shi at intel.com>; llvm-dev
> >> <llvm-dev at lists.llvm.org>;
> >> cfe-dev at lists.llvm.org
> >> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
> >>
> >>
> >> > On May 17, 2016, at 11:21 AM, Umesh Kalappa
> >> <umesh.kalappa0 at gmail.com> wrote:
> >> >
> >> > Steven,
> >> >
> >> > As Mehdi stated, the optimisation level is specific to the linker and
> >> > it enables the inter-procedural optimisation passes; please refer to function
> >>
> >> To be very clear: the -O option may trigger *linker* optimizations as
> >> well,
> >> independently of LTO.
> >>
> >> --
> >> Mehdi
> >>
> >>
> >>
> >
> >
> >
> 