[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Tue Sep 9 13:59:54 PDT 2014

> On Sep 9, 2014, at 1:47 PM, Sean Silva <chisophugis at gmail.com> wrote:
> 
> 
> 
> On Tue, Sep 9, 2014 at 12:53 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Chandler,
> 
> I have observed both improvements and regressions with the new lowering.
> 
> Here are the numbers for an Ivy Bridge machine fixed at 2900MHz.
> 
> I’ll look into the regressions to provide test cases.
> 
> ** Numbers **
> 
> Smaller is better. Only tests that run for at least one second are reported.
> Reference is the default lowering, Test is the new lowering.
> The Os numbers are overall neutral, but the O3 numbers mainly expose regressions.
> 
> Note: I can attach the raw numbers if you want.
> 
> That would be great. Please do.

Alright, here they are :).

base-perf-Ox.txt: runtime for the default lowering.
new-perf-Ox.txt: runtime for the new lowering.

Each line in those files has the following format:
<unit> <benchmark> <perf number>

The units are:
- min: Minimum of the 7 runs.
- max: Maximum of the 7 runs.
- avg: Average of the 7 runs.
- total: Total of the 7 runs.
- med: Median of the 7 runs.
- SD: Standard deviation of the 7 runs.
- SD%: Standard deviation of the 7 runs, as a percentage.
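
For anyone who wants to recompute these units from raw timings, here is a rough Python sketch. It assumes the three fields are whitespace-separated with the number last, and it uses the sample (n-1) standard deviation; which flavor of SD the attached files use is not stated. The helper names `parse_perf_line` and `summarize_runs` are made up for illustration.

```python
import statistics


def parse_perf_line(line):
    """Split one '<unit> <benchmark> <perf number>' line into its three fields.

    Assumes whitespace-separated fields, with the number last.
    """
    unit, rest = line.split(None, 1)
    benchmark, value = rest.rsplit(None, 1)
    return unit, benchmark, float(value)


def summarize_runs(runs):
    """Compute the units listed above from the timings of one benchmark."""
    avg = statistics.mean(runs)
    sd = statistics.stdev(runs)  # assumption: sample standard deviation
    return {
        "min": min(runs),
        "max": max(runs),
        "avg": avg,
        "total": sum(runs),
        "med": statistics.median(runs),
        "SD": sd,
        "SD%": 100.0 * sd / avg,  # SD relative to the average, in percent
    }
```

Calling `summarize_runs` on the 7 timings of one benchmark gives one value per unit, which can then be compared between the base and new files.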

-Quentin

> 
> -- Sean Silva
>  
> 
> * Os *
> Benchmark_ID    	Reference	Test    	Expansion 	Percent
> -------------------------------------------------------------------------------
> External/Nurbs/nurbs                   	       2.3302	       2.3122	    0.99	    -1%
> External/SPEC/CFP2000/183.equake/183.eq	       3.2606	       3.2419	    0.99	    -1%
> External/SPEC/CFP2006/447.dealII/447.de	      16.4638	      16.1313	    0.98	    -2%
> External/SPEC/CFP2006/470.lbm/470.lbm  	       2.0159	       1.9931	    0.99	    -1%
> External/SPEC/CINT2000/164.gzip/164.gzi	       8.7611	       8.6981	    0.99	    -1%
> External/SPEC/CINT2006/456.hmmer/456.hm	       2.5674	       2.5819	    1.01	    +1%
> External/SPEC/CINT2006/462.libquantum/4	       1.2924	        1.347	    1.04	    +4%
> MultiSource/Benchmarks/TSVC/CrossingThr	       2.4703	       2.4852	    1.01	    +1%
> MultiSource/Benchmarks/TSVC/LoopRerolli	       2.6611	       2.5668	    0.96	    -4%
> MultiSource/Benchmarks/mafft/pairlocala	       24.676	      24.5372	    0.99	    -1%
> SingleSource/Benchmarks/Adobe-C++/simpl	       1.0579	       1.1048	    1.04	    +4%
> SingleSource/Benchmarks/Linpack/linpack	       4.2817	       4.3298	    1.01	    +1%
> SingleSource/Benchmarks/Misc-C++/stepan	       4.1821	        4.226	    1.01	    +1%
> SingleSource/Benchmarks/Misc/oourafft  	       3.0305	       3.1777	    1.05	    +5%
> -------------------------------------------------------------------------------
> Min (14)                               	            -	            -	    0.96	      -
> -------------------------------------------------------------------------------
> Max (14)                               	            -	            -	    1.05	      -
> -------------------------------------------------------------------------------
> Sum (14)                               	           79	           79	       1	    +0%
> -------------------------------------------------------------------------------
> A.Mean (14)                            	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
> G.Mean 2 (14)                          	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
> 
> * O3 *
> Benchmark_ID    	Reference	Test    	Expansion 	Percent
> -------------------------------------------------------------------------------
> External/Nurbs/nurbs                   	       2.2322	       2.2131	    0.99	    -1%
> External/Povray/povray                 	       2.2638	       2.2762	    1.01	    +1%
> External/SPEC/CFP2000/177.mesa/177.mesa	       1.6675	       1.6828	    1.01	    +1%
> External/SPEC/CFP2000/188.ammp/188.ammp	      10.9309	      11.1191	    1.02	    +2%
> External/SPEC/CFP2006/433.milc/433.milc	       6.9214	       7.1696	    1.04	    +4%
> External/SPEC/CINT2000/164.gzip/164.gzi	       8.5327	       8.8114	    1.03	    +3%
> External/SPEC/CINT2000/186.crafty/186.c	       4.1266	         4.16	    1.01	    +1%
> External/SPEC/CINT2000/253.perlbmk/253.	       5.6991	       5.7309	    1.01	    +1%
> External/SPEC/CINT2000/256.bzip2/256.bz	       6.7917	       6.8763	    1.01	    +1%
> External/SPEC/CINT2006/400.perlbench/40	        6.243	       6.1464	    0.98	    -2%
> External/SPEC/CINT2006/401.bzip2/401.bz	        2.095	       2.0588	    0.98	    -2%
> External/SPEC/CINT2006/462.libquantum/4	          1.2	       1.2108	    1.01	    +1%
> MultiSource/Applications/SIBsim4/SIBsim	       2.4547	       2.5129	    1.02	    +2%
> MultiSource/Benchmarks/Bullet/bullet   	       4.1687	       4.0882	    0.98	    -2%
> MultiSource/Benchmarks/TSVC/LinearDepen	       3.0389	       3.0566	    1.01	    +1%
> MultiSource/Benchmarks/TSVC/LinearDepen	       2.1298	       2.1997	    1.03	    +3%
> MultiSource/Benchmarks/TSVC/LoopRerolli	       2.6458	       2.5552	    0.97	    -3%
> MultiSource/Benchmarks/TSVC/Symbolics-f	       1.6243	       1.6612	    1.02	    +2%
> MultiSource/Benchmarks/mafft/pairlocala	      23.8979	      24.0547	    1.01	    +1%
> SingleSource/Benchmarks/Misc/oourafft  	       3.0374	       3.1846	    1.05	    +5%
> SingleSource/Benchmarks/SmallPT/smallpt	       6.5533	       6.6683	    1.02	    +2%
> -------------------------------------------------------------------------------
> Min (21)                               	            -	            -	    0.97	      -
> -------------------------------------------------------------------------------
> Max (21)                               	            -	            -	    1.05	      -
> -------------------------------------------------------------------------------
> Sum (21)                               	          108	          109	    1.01	    +1%
> -------------------------------------------------------------------------------
> A.Mean (21)                            	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
> G.Mean 2 (21)                          	            -	            -	    1.01	    +1%
> -------------------------------------------------------------------------------
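
The Expansion and Percent columns in both tables look like Test / Reference and the rounded percentage change. That reading is inferred from the rows themselves, not from the reporting script, so treat the sketch below as an assumption. `expansion` and `format_row` are hypothetical helper names.

```python
def expansion(reference, test):
    """Ratio of the new-lowering time (Test) to the default-lowering time (Reference)."""
    return test / reference


def format_row(reference, test):
    """Return the (Expansion, Percent) cells in the style used by the tables."""
    ratio = expansion(reference, test)
    percent = round((ratio - 1.0) * 100.0)  # signed change, whole percent
    return f"{ratio:.2f}", f"{percent:+d}%"
```

A ratio above 1 means the new lowering is slower, since smaller run times are better.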
> 
> Thanks,
> -Quentin
> 
>> On Sep 9, 2014, at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>> 
>> Hi Chandler,
>> 
>> Thanks for fixing the problem with the insertps mask.
>> 
>> Generally the new shuffle lowering looks promising, however there are
>> some cases where the codegen is now worse causing runtime performance
>> regressions in some of our internal codebase.
>> 
>> You have already mentioned how the new shuffle lowering is missing
>> some features; for example, you explicitly said that we currently lack
>> SSE4.1 blend support. Unfortunately, this seems to be one of the
>> main reasons for the slowdown we are seeing.
>> 
>> Here is a list of what we found so far that we think is causing most
>> of the slowdown:
>> 1) shufps is always emitted in cases where we could emit a single
>> blendps; in these cases, blendps is preferable because it has better
>> reciprocal throughput (this is true on all modern Intel and AMD cpus).
>> 
>> Things get worse when it comes to lowering shuffles where the shuffle
>> mask indices refer to elements from both input vectors in each lane.
>> For example, a shuffle mask of <0,5,2,7> could be easily lowered into
>> a single blendps; instead it gets lowered into two shufps
>> instructions.
>> 
>> Example:
>> ;;;
>> define <4 x float> @foo(<4 x float> %A, <4 x float> %B) {
>>  %1 = shufflevector <4 x float> %A, <4 x float> %B, <4 x i32> <i32 0,
>> i32 5, i32 2, i32 7>
>>  ret <4 x float> %1
>> }
>> ;;;
>> 
>> llc (-mcpu=corei7-avx):
>>  vblendps  $10, %xmm1, %xmm0, %xmm0   # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7]
>> 
>> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
>>  vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3]
>>  vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3]
>> 
>> 
>> 2) On SSE4.1, we should try not to emit an insertps if the shuffle
>> mask identifies a blend. At the moment the new lowering logic is very
>> aggressively emitting insertps instead of cheaper blendps.
>> 
>> Example:
>> ;;;
>> define <4 x float> @bar(<4 x float> %A, <4 x float> %B) {
>>  %1 = shufflevector <4 x float> %A, <4 x float> %B, <4 x i32> <i32 4,
>> i32 5, i32 2, i32 7>
>>  ret <4 x float> %1
>> }
>> ;;;
>> 
>> llc (-mcpu=corei7-avx):
>>  vblendps  $11, %xmm0, %xmm1, %xmm0   # xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
>> 
>> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
>>  vinsertps $-96, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1],xmm1[2],xmm0[3]
>> 
>> 
>> 3) When a shuffle performs an insert at index 0 we always generate an
>> insertps, while a movss would do a better job.
>> ;;;
>> define <4 x float> @baz(<4 x float> %A, <4 x float> %B) {
>>  %1 = shufflevector <4 x float> %A, <4 x float> %B, <4 x i32> <i32 4,
>> i32 1, i32 2, i32 3>
>>  ret <4 x float> %1
>> }
>> ;;;
>> 
>> llc (-mcpu=corei7-avx):
>>  vmovss %xmm1, %xmm0, %xmm0
>> 
>> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
>>  vinsertps $0, %xmm1, %xmm0, %xmm0 # xmm0 = xmm1[0],xmm0[1,2,3]
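
The three cases above all hinge on recognizing that a shuffle mask is really a blend. A minimal Python sketch of that check, for illustration only: it is not the code in the X86 backend, the function names are hypothetical, and the convention that bit i of the immediate selects the second shuffle input is an assumption chosen to match the vblendps lines above.

```python
def blend_immediate(mask, lanes=4):
    """Return the blend immediate for a two-input shuffle mask, or None.

    mask[i] indexes the concatenation of both inputs (0 .. 2*lanes-1), or is
    None for an undef lane. The mask is a blend when every lane i takes
    element i from the first input (index i) or the second (index i + lanes).
    Bit i of the result is set when lane i comes from the second input.
    """
    imm = 0
    for lane, index in enumerate(mask):
        if index is None or index == lane:
            continue  # undef, or taken in place from the first input
        if index == lane + lanes:
            imm |= 1 << lane  # taken in place from the second input
        else:
            return None  # element moves across lanes: not a single blend
    return imm


def is_low_element_insert(mask, lanes=4):
    """True when only lane 0 comes from the second input (the movss shape)."""
    return blend_immediate(mask, lanes) == 1
```

For the masks <0,5,2,7> and <4,5,2,7> this returns 10 and 11, the same immediates as in the vblendps lines produced by the default lowering, and <4,1,2,3> is flagged as a low-element insert.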
>> 
>> I hope this is useful. We would be happy to contribute patches to
>> improve some of the above cases, but we obviously know that this is
>> still a work in progress, so we don't want to introduce conflicts with
>> your work. Please let us know what you think.
>> 
>> We will keep looking at this and follow up with any further findings.
>> 
>> Thanks,
>> Andrea Di Biagio
>> SN Systems - Sony Computer Entertainment Inc.
>> 
>> On Mon, Sep 8, 2014 at 6:08 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>> Hi Chandler,
>>> 
>>> Forget about what I said.
>>> It seems I have some weird dependencies in my build system.
>>> My binaries are out of sync.
>>> 
>>> Let me sort that out; it is likely that the problem is already fixed, and I can
>>> resume the measurements.
>>> 
>>> Sorry for the noise.
>>> 
>>> Q.
>>> 
>>> On Sep 8, 2014, at 9:32 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>>> 
>>> 
>>> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>> 
>>> Sure,
>>> 
>>> Here is the command line:
>>> clang -cc1 -triple x86_64-apple-macosx -S -disable-free
>>> -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic
>>> -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu
>>> core-avx-i  -O3  -ferror-limit 19 -fmessage-length 114 -stack-protector 1
>>> -mstackrealign -fblocks  -fencode-extended-block-signature
>>> -fmax-type-align=16 -fdiagnostics-show-option -fcolor-diagnostics
>>> -vectorize-loops -vectorize-slp -mllvm
>>> -x86-experimental-vector-shuffle-lowering=true -o tmp.s -x cpp-output tmp.i
>>> 
>>> This was with trunk 215249.
>>> 
>>> I meant, r217281.
>>> 
>>> 
>>> Thanks,
>>> -Quentin
>>> 
>>> <tmp.i>
>>> 
>>> On Sep 6, 2014, at 4:27 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>>> 
>>> I've run the SingleSource test suite for core-avx-i and have no failures
>>> here so a preprocessed file + commandline would be very useful if this
>>> reproduces for you still.
>>> 
>>> On Sat, Sep 6, 2014 at 4:07 PM, Chandler Carruth <chandlerc at gmail.com>
>>> wrote:
>>>> 
>>>> I'm having trouble reproducing this. I'm trying to get LNT to actually
>>>> run, but manually compiling the given source file didn't reproduce it for
>>>> me.
>>>> 
>>>> It might have been fixed recently (although I'd be surprised if so), but
>>>> it would help to get the actual command line for which compiling this file
>>>> in the test suite failed.
>>>> 
>>>> -Chandler
>>>> 
>>>> On Fri, Sep 5, 2014 at 4:36 PM, Quentin Colombet <qcolombet at apple.com>
>>>> wrote:
>>>>> 
>>>>> Hi Chandler,
>>>>> 
>>>>> While doing the performance measurement on an Ivy Bridge, I ran into
>>>>> compile-time errors.
>>>>> 
>>>>> I saw a bunch of "cannot select" errors in the LLVM test suite with
>>>>> -march=core-avx-i.
>>>>> E.g., SingleSource/UnitTests/Vector/SSE/sse.isamax.c is failing at O3
>>>>> -march=core-avx-i with:
>>>>> fatal error: error in backend: Cannot select: 0x7f91b99a6420: v4i32 =
>>>>> bitcast 0x7f91b99b0e10 [ORD=3] [ID=27]
>>>>>  0x7f91b99b0e10: v4i64 = insert_subvector 0x7f91b99a7210,
>>>>> 0x7f91b99a6d68, 0x7f91b99ace70 [ORD=2] [ID=25]
>>>>>    0x7f91b99a7210: v4i64 = undef [ID=15]
>>>>>    0x7f91b99a6d68: v2i64 = scalar_to_vector 0x7f91b99ab840 [ORD=2]
>>>>> [ID=23]
>>>>>      0x7f91b99ab840: i64 = AssertZext 0x7f91b99acc60, 0x7f91b99ac738
>>>>> [ORD=2] [ID=20]
>>>>>        0x7f91b99acc60: i64,ch = CopyFromReg 0x7f91b8d52820,
>>>>> 0x7f91b99a3a10 [ORD=2] [ID=16]
>>>>>          0x7f91b99a3a10: i64 = Register %vreg68 [ID=1]
>>>>>    0x7f91b99ace70: i64 = Constant<0> [ID=3]
>>>>> In function: isamax0
>>>>> clang: error: clang frontend command failed with exit code 70 (use -v to
>>>>> see invocation)
>>>>> clang version 3.6.0 (215249)
>>>>> Target: x86_64-apple-darwin14.0.0
>>>>> 
>>>>> For some reason, I cannot reproduce the problem with the test case that
>>>>> clang gives me using -emit-llvm. Since the source is public, I guess you can
>>>>> try to reproduce on your side.
>>>>> Indeed, if you run the test-suite with -march=core-avx-i you’ll likely
>>>>> see all those failures.
>>>>> 
>>>>> Let me know if you cannot and I’ll try harder to produce a test case.
>>>>> 
>>>>> Note: This is the same failure all over the place, i.e., cannot select a
>>>>> bit cast from various types to v4i32 or v4i64.
>>>>> 
>>>>> Thanks,
>>>>> -Quentin
>>>>> 
>>>>> 
>>>>> On Sep 5, 2014, at 11:09 AM, Robert Lougher <rob.lougher at gmail.com> wrote:
>>>>> 
>>>>> Hi Chandler,
>>>>> 
>>>>> On 5 September 2014 17:38, Chandler Carruth <chandlerc at gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com>
>>>>> wrote:
>>>>> 
>>>>> 
>>>>> Unfortunately, another team, while doing internal testing has seen the
>>>>> new path generating illegal insertps masks.  A sample here:
>>>>> 
>>>>>   vinsertps    $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>>   vinsertps    $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>>   vinsertps    $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>>   vinsertps    $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>>   vinsertps    $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>>   vinsertps    $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>>> 
>>>>> We'll continue to look into this and do additional testing.
>>>>> 
>>>>> 
>>>>> 
>>>>> Interesting. Let me know if you get a test case. The insertps code path was
>>>>> added recently though and has been much less well tested. I'll start fuzz
>>>>> testing it and should hopefully uncover the bug.
>>>>> 
>>>>> 
>>>>> Here are two small test cases.  Hope they are of use.
>>>>> 
>>>>> Thanks,
>>>>> Rob.
>>>>> 
>>>>> ------
>>>>> define <4 x float> @test(<4 x float> %xyzw, <4 x float> %abcd) {
>>>>> %1 = extractelement <4 x float> %xyzw, i32 0
>>>>> %2 = insertelement <4 x float> undef, float %1, i32 0
>>>>> %3 = insertelement <4 x float> %2, float 0.000000e+00, i32 1
>>>>> %4 = shufflevector <4 x float> %3, <4 x float> %xyzw, <4 x i32> <i32
>>>>> 0, i32 1, i32 6, i32 undef>
>>>>> %5 = shufflevector <4 x float> %4, <4 x float> %abcd, <4 x i32> <i32
>>>>> 0, i32 1, i32 2, i32 4>
>>>>> ret <4 x float> %5
>>>>> }
>>>>> 
>>>>> define <4 x float> @test2(<4 x float> %xyzw, <4 x float> %abcd) {
>>>>> %1 = shufflevector <4 x float> %xyzw, <4 x float> %abcd, <4 x i32>
>>>>> <i32 0, i32 undef, i32 2, i32 4>
>>>>> %2 = shufflevector <4 x float> <float undef, float 0.000000e+00,
>>>>> float undef, float undef>, <4 x float> %1, <4 x i32> <i32 4, i32 1,
>>>>> i32 6, i32 7>
>>>>> ret <4 x float> %2
>>>>> }
>>>>> 
>>>>> 
>>>>> llc -march=x86-64 -mattr=+avx test.ll -o -
>>>>> 
>>>>> test:                                   # @test
>>>>>   vxorps    %xmm2, %xmm2, %xmm2
>>>>>   vmovss    %xmm0, %xmm2, %xmm2
>>>>>   vblendps    $4, %xmm0, %xmm2, %xmm0 # xmm0 = xmm2[0,1],xmm0[2],xmm2[3]
>>>>>   vinsertps    $48, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>>>>>   retl
>>>>> 
>>>>> test2:                                  # @test2
>>>>>   vinsertps    $48, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>>>>>   vxorps    %xmm1, %xmm1, %xmm1
>>>>>   vblendps    $13, %xmm0, %xmm1, %xmm0 # xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
>>>>>   retl
>>>>> 
>>>>> llc -march=x86-64 -mattr=+avx
>>>>> -x86-experimental-vector-shuffle-lowering test.ll -o -
>>>>> 
>>>>> test:                                   # @test
>>>>>   vinsertps    $270, %xmm0, %xmm0, %xmm2 # xmm2 = xmm0[0],zero,zero,zero
>>>>>   vinsertps    $416, %xmm0, %xmm2, %xmm0 # xmm0 = xmm2[0,1],xmm0[2],xmm2[3]
>>>>>   vinsertps    $304, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>>>>>   retl
>>>>> 
>>>>> test2:                                  # @test2
>>>>>   vinsertps    $304, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0,1,2],xmm1[0]
>>>>>   vxorps    %xmm1, %xmm1, %xmm1
>>>>>   vinsertps    $336, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[1],xmm0[2,3]
>>>>>   retl
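
To make the problem with these immediates concrete: insertps takes an 8-bit immediate, with bits 7:6 selecting the source element, bits 5:4 the destination element, and bits 3:0 a zero mask. A small Python sketch that decodes one and rejects values that cannot be encoded; `decode_insertps_imm` is a hypothetical helper, not something from LLVM.

```python
def decode_insertps_imm(imm):
    """Decode an insertps immediate, or raise if it cannot be an 8-bit value.

    Accepts both the unsigned spelling (0..255) and the signed spelling
    (-128..-1) that llc sometimes prints, e.g. $-96 for 0xA0.
    """
    if not -128 <= imm <= 255:
        raise ValueError(f"insertps immediate {imm} does not fit in 8 bits")
    imm &= 0xFF  # normalize the signed spelling to an unsigned byte
    return {
        "src_element": (imm >> 6) & 0x3,  # bits 7:6: element read from the source
        "dst_element": (imm >> 4) & 0x3,  # bits 5:4: element written in the destination
        "zero_mask": imm & 0xF,           # bits 3:0: destination elements forced to zero
    }
```

Under this decoding, $48 means "source element 0 into destination element 3", which agrees with the xmm0[0,1,2],xmm1[0] comment. The values 256, 270, 304, 336 and 416 in the listings are all rejected; each one is a value that fits in a byte plus bit 8 set (for example 304 = 256 + 48).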
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: base-perf-O3.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140909/268479be/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: base-perf-Os.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140909/268479be/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: new-perf-O3.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140909/268479be/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: new-perf-Os.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140909/268479be/attachment-0003.txt>