[llvm] r229841 - Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.

Kuperstein, Michael M michael.m.kuperstein at intel.com
Mon May 18 22:48:14 PDT 2015


Thanks a lot, Daniel!

It looks like this is also the problem for self-hosting, at least in some configurations.
I got this to reproduce when self-hosting, with a debug clang building a release clang.
Compiling X86MCTargetDesc.cpp (well, really, running opt -O2) took over 15 minutes, on a fairly high-end x86 machine. If I use a release clang, it takes about 30 seconds overall.

X86 has more than 10K instruction descriptions, all (?) of them with a FeatureBitset initialized to an empty set. So, 10K constructor calls, each of them eventually equivalent to zeroing a single long (or, presumably, a few more once the set actually grows past 64 bits). Each constructor call gets inlined, and the whole thing is replaced with a memset, but, at least for the debug compiler, doing that more than ~10K times takes a while. According to -time-passes, about 5 minutes were spent inlining, and 10 more were spent on dead-store elimination.

I guess some versions of gcc don't handle this gracefully in release builds either...
One potential solution is to mark the FeatureBitset constructor noinline, but I'm afraid that may not be so nice w.r.t. clang load time, since every table entry would then need a real call at start-up. Any other ideas?
(I'll also try to figure out *why* this is so slow with clang, but that won't help for gcc.) 
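
For concreteness, here is a minimal sketch of the noinline idea. It assumes the GCC/Clang attribute spelling, and FeatureSet is a hypothetical, simplified stand-in, not the actual FeatureBitset:

```cpp
#include <bitset>
#include <initializer_list>

// Hypothetical stand-in for FeatureBitset (assumption: 64 feature bits).
class FeatureSet {
  std::bitset<64> Bits;

public:
  // GCC/Clang attribute syntax; keeps the optimizer from inlining the
  // constructor body into every table element's initialization.
  __attribute__((noinline)) FeatureSet(std::initializer_list<unsigned> Init);

  bool test(unsigned I) const { return Bits.test(I); }
  bool none() const { return Bits.none(); }
};

// Defined out of line; sets one bit per listed feature index.
FeatureSet::FeatureSet(std::initializer_list<unsigned> Init) {
  for (unsigned I : Init)
    Bits.set(I);
}

// Small illustrative table; the real one has more than 10K entries.
const FeatureSet Table[] = {
    {},        // empty set, still goes through the constructor
    {1u, 5u},  // features 1 and 5
    {},
};
```

With the constructor kept out of line, the optimizer has far less to inline and clean up, but each element is then initialized by a real call when the program starts, which is the load-time cost mentioned above.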

Michael

-----Original Message-----
From: Daniel Sanders [mailto:Daniel.Sanders at imgtec.com] 
Sent: Monday, May 18, 2015 19:32
To: Kuperstein, Michael M; Renato Golin
Cc: Richard Barton; llvm-commits at cs.uiuc.edu
Subject: RE: [llvm] r229841 - Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.

I've trimmed the failing case for Mips and it seems that std::bitset with initializer_lists is dramatically slower and more memory hungry to compile than the integers were, even if the initializer_list constructor does nothing.

The attached testcases are compiled with:
	/usr/bin/time -v g++ --std=c++11 -c -O0 $test_file

For Mips (g++ Debian 4.9.2-10), I get:
	test-stdbitset.c: 12.19s user time, 122MB Max RSS size
	test-ulonglong.c: 0.19s user time, 11MB Max RSS size
Switching -O0 to -O2 gives:
	test-stdbitset.c: 376.56s user time, 175MB Max RSS size
	test-ulonglong.c: 0.23s user time, 11MB Max RSS size

I've also tried this on x86_64 (g++ Debian 4.9.2-10). The results have similar proportions:
	test-stdbitset.c: 1.78s user time, 160MB Max RSS size
	test-ulonglong.c: 0.04s user time, 22MB Max RSS size
Switching -O0 to -O2 gives:
	test-stdbitset.c: 25.67s user time, 145MB Max RSS size
	test-ulonglong.c: 0.04s user time, 22MB Max RSS size

I'd expect initializer_lists to be a bit bigger/slower but these differences seem very large. Looking at the assembly produced for MIPS, the initialization seems to be very simple. It just calls the std::bitset constructor with 'this' set to the address of each element and relies on the optimizer to inline the constructors, eliminate common sub-expressions, etc. For very large arrays such as X86Insts, this optimization task is unreasonably large.
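
The attachments are not reproduced here, so the following is only a rough sketch of the shape of the two variants, under stated assumptions: all type and table names are hypothetical, and the real tables have thousands of entries rather than four.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical feature-set type with an initializer_list constructor.
struct FeatureSet {
  std::bitset<64> Bits;
  FeatureSet(std::initializer_list<unsigned> Init) {
    for (unsigned I : Init)
      Bits.set(I);
  }
};

// Hypothetical descriptors; the real MCInstrDesc has many more fields.
struct DescWithBitset {
  unsigned Opcode;
  FeatureSet Features;
};

struct DescWithInteger {
  unsigned Opcode;
  unsigned long long Features;
};

// Every element requires a constructor call that the optimizer must
// inline and simplify, one element at a time.
const DescWithBitset BitsetTable[] = {
    {0, {}}, {1, {}}, {2, {3u}}, {3, {}},
};

// Plain constants: the compiler can emit this table directly as data.
const DescWithInteger IntegerTable[] = {
    {0, 0}, {1, 0}, {2, 1ULL << 3}, {3, 0},
};

// Checks that both tables describe the same feature bits.
bool tablesAgree() {
  const std::size_t N = sizeof(BitsetTable) / sizeof(BitsetTable[0]);
  for (std::size_t I = 0; I != N; ++I)
    if (BitsetTable[I].Features.Bits.to_ullong() != IntegerTable[I].Features)
      return false;
  return true;
}
```

Both tables carry the same information; the difference is that the integer table needs no per-element code, while the bitset table leaves the compiler with one constructor call per entry to optimize away.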

> -----Original Message-----
> From: Daniel Sanders
> Sent: 18 May 2015 15:00
> To: 'Kuperstein, Michael M'; Renato Golin
> Cc: Richard Barton; llvm-commits at cs.uiuc.edu
> Subject: RE: [llvm] r229841 - Reverting r229831 due to multiple 
> ARM/PPC/MIPS build-bot failures.
> 
> It defaults to empty for me.
> 
> For the Mips failure, it seems to be spending a very long time 
> processing the 'extern const MCInstrDesc X86Insts[] = ...' 
> declaration. If I delete it or reduce it to a single entry, the file compiles quickly.
> 
> > -----Original Message-----
> > From: Kuperstein, Michael M [mailto:michael.m.kuperstein at intel.com]
> > Sent: 18 May 2015 14:48
> > To: Renato Golin; Daniel Sanders
> > Cc: Richard Barton; llvm-commits at cs.uiuc.edu
> > Subject: RE: [llvm] r229841 - Reverting r229831 due to multiple 
> > ARM/PPC/MIPS build-bot failures.
> >
> > I think the default CMAKE_BUILD_TYPE is Debug.
> >
> > -----Original Message-----
> > From: Renato Golin [mailto:renato.golin at linaro.org]
> > Sent: Monday, May 18, 2015 16:27
> > To: Daniel Sanders
> > Cc: Kuperstein, Michael M; Richard Barton; llvm-commits at cs.uiuc.edu
> > Subject: Re: [llvm] r229841 - Reverting r229831 due to multiple 
> > ARM/PPC/MIPS build-bot failures.
> >
> > On 18 May 2015 at 14:17, Daniel Sanders <Daniel.Sanders at imgtec.com>
> > wrote:
> > > I can't reproduce it from a cmake+ninja build with 'CMAKE_BUILD_TYPE='
> > > build but I can from a 'CMAKE_BUILD_TYPE=Release' build.
> >
> > I never used 'CMAKE_BUILD_TYPE=', only 'CMAKE_BUILD_TYPE=Release' or
> > 'CMAKE_BUILD_TYPE=Debug'.
> >
> >
> > > When 'CMAKE_BUILD_TYPE=', the command to build
> > > lib/Target/X86/MCTargetDesc/CMakeFiles/LLVMX86Desc.dir/X86MCTargetDesc.cpp.o
> > > takes around 50s and ~500MB RAM.
> > > When 'CMAKE_BUILD_TYPE=Release', it takes ~30mins and the memory
> > > usage very slowly creeps upwards until the process is killed at 1GB.
> >
> > Is this x86_64?
> >
> > cheers,
> > --renato
> > ---------------------------------------------------------------------
> > Intel Israel (74) Limited
> >
> > This e-mail and any attachments may contain confidential material 
> > for the sole use of the intended recipient(s). Any review or 
> > distribution by others is strictly prohibited. If you are not the 
> > intended recipient, please contact the sender and delete all copies.
