[PATCH] D24593: Standford/Bubble sort code restructure

Wed Sep 14 20:07:52 PDT 2016

mehdi_amini added inline comments.

================
Comment at: SingleSource/Benchmarks/Stanford/Bubblesort.c:159
@@ -156,1 +158,3 @@
+			sortlist[i] = sli;
+			sortlist[i + 1] = sli1;
 			i=i+1;
----------------
evstupac wrote:
> evstupac wrote:
> > mehdi_amini wrote:
> > > MatzeB wrote:
> > > > evstupac wrote:
> > > > > The flakiness is caused by unpredictable memory accesses to the array and code at short distance:
> > > > > 
> > > > > Loop:
> > > > > 
> > > > > ```
> > > > > if (a[i] > a[i + 1]) { // load a[i], a[i+1]
> > > > >   // store to a[i], a[i+1]
> > > > > }
> > > > > ```
> > > > > 
> > > > > Making the stores unconditional will simplify memory accesses and potentially the CFG.
> > > > > This will open the test for compiler optimizations (like unroll, scalar replacement, if conversion...).
> > > > > Generally, the idea is to convert the test from a memory test to a compiler test.
> > > > > This will open the test for compiler optimizations (like unroll, scalar replacement, if conversion...).
> > > > Generally, the idea is to convert the test from a memory test to a compiler test.
> > > > 
> > > > We should not change any of the existing benchmarks. It is the Stanford benchmark as is; if you change it, we will have a sudden change in performance in the LNT database, and we can no longer claim to be compiling the "Stanford" benchmark (it's the same reason that we cannot modify SPEC to be better suited for the compiler).
> > > > 
> > > > If anything you should add a new benchmark with the structure you desire.
> > > > 
> > > It seems to me that you're creating a "new" benchmark by doing that.
> > > It does not seem legit to change a benchmark because you find it hard to analyze from within the compiler.
> > > 
> > > (That does not fit my definition of flaky either)
> > > 
> > > Please define "flakiness".
> > 
> > Currently the code contains both branch mispredictions and memory stalls. Sometimes they compensate for each other, sometimes not; that depends on the array and code addresses. In https://reviews.llvm.org/D18158, depending on which unroll technique we use (epilog/prolog), both addresses changed. That causes a performance difference of up to 2x with an identical hot loop, which is outside the compiler's control.
> > we can no longer claim to be compiling the "Stanford" benchmark (it's the same reason that we cannot modify SPEC to be better suited for the compiler)
> 
> That is true.
> However, the SPEC benchmark is something different. There are regular updates, and the newest versions are checked to avoid any kind of flakiness on the latest architectures.
> 
> The Stanford benchmarks are old and there are no updates (correct me if I'm wrong, but even the running time of these benchmarks is ~0.01 sec or less). We are not checking spec92 performance on the latest architectures, but we are still checking the Stanford benchmarks.
> In that sense, yes, we are unable to influence the benchmarks.
> But maybe there is a chance to switch to renewed Stanford benchmarks in LLVM testing?
> 
> Where can we get newer Stanford benchmarks? Who is the owner to whom I can address my concerns?
> 
Can you provide more (detailed) explanations (or specific pointers) to an analysis that shows how this loop is sensitive to transformations in an unpredictable way for the compiler? This seems quite hand-wavy right now to me.
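
For readers following the thread, the two loop shapes being debated can be sketched roughly as below. This is only an illustration: the variable names `sli` and `sli1` come from the diff hunk above, while the function names, the loop bounds, and the rest of the loop body are assumptions and are not taken from Bubblesort.c.

```c
/* Hypothetical helper names; not the actual functions in Bubblesort.c. */

/* Original shape: the stores execute only when a swap is needed. */
static void bubble_conditional(int *a, int n) {
    for (int top = n - 1; top > 0; top--) {
        for (int i = 0; i < top; i++) {
            if (a[i] > a[i + 1]) {   /* load a[i], a[i+1] */
                int t = a[i];
                a[i] = a[i + 1];     /* conditional stores */
                a[i + 1] = t;
            }
        }
    }
}

/* Proposed shape: order the two values in scalars, then always store both. */
static void bubble_unconditional(int *a, int n) {
    for (int top = n - 1; top > 0; top--) {
        for (int i = 0; i < top; i++) {
            int sli = a[i];
            int sli1 = a[i + 1];
            if (sli > sli1) {        /* swap in scalars only */
                int t = sli;
                sli = sli1;
                sli1 = t;
            }
            a[i] = sli;              /* unconditional stores */
            a[i + 1] = sli1;
        }
    }
}
```

Both functions sort the array in ascending order and produce the same result; they differ only in whether the stores sit inside the branch, which is the point of contention in this review.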



https://reviews.llvm.org/D24593