[PATCH] D10991: [LNT] Reduce I/O execution time for Polybench
Kristof Beyls
kristof.beyls at arm.com
Tue Jul 7 10:32:17 PDT 2015
================
Comment at: SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c:46
@@ +45,3 @@
+ for (j = 0; j < n; j++)
+ print_element(A[i][j], j*8, printmat);
+ fputs(printmat, stderr);
----------------
rengolin wrote:
> kristof.beyls wrote:
> > rengolin wrote:
> > > kristof.beyls wrote:
> > > > Do I understand correctly that this code basically only prints out the values of the last row of the entire matrix (the offset is j*8)? I think we'd want whatever hash function implementation we end up with to still take all elements as input, to improve the chance of detecting a mis-compilation.
> > > > I think the hash function can be really simple - no need for anything complex or secure; but we should probably feed all matrix elements into the hash function. Maybe the straightforward solution here is to just print out the sum of all elements in a row, rather than each element in the row?
> > > >
> > > > Tobias may know these tests better: are we expecting bit-reproducible results for these tests? I'm guessing so unless DATA_PRINTF_MODIFIER in the original code was chosen so that it prints out with less precision?
> > > >
> > > >
> > > No.
> > >
> > > print_element receives a float value (4 bytes) and expands it into 8 nibbles (8 bytes). So, on every iteration, A[i][j] is printed starting at printmat[j*8]. In there, j*8 is only the initial position of a run of 8, not the *only* position printed.
> > Yes, I got that, but given that this is a 2-dimensional matrix, with i indicating the row and j indicating the column, only using j to index the printed out result means that every iteration of the i-loop overwrites the results in printmat written on the previous iteration, right? The malloc(n*8) also indicates there is only room to print a single row, not the entire matrix. Or maybe I'm still missing something?
> That's why the fputs below is inside the i loop. I'm printing one row at a time. This also saves a lot of memory and avoids thrashing the allocators, helps caching, etc.
>
> Since the runtime now is indistinguishable from when *not* printing anything, I think it's a good trade-off.
D'oh - I missed that.
Cool, so we're still producing roughly the same amount of output - but way more efficiently.
Provided these tests were already checking for bit-accurate results (I'm not sure - it probably depends on the DATA_PRINTF_MODIFIER), this looks good to me.
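
For readers following the thread, the row-at-a-time scheme discussed above can be sketched roughly as below. This is not the code from the patch: encode_element, encode_row and dump_matrix are hypothetical names, the choice of hex digits for the 8 nibbles is an assumption, and the extra byte for the NUL terminator that fputs requires is added here on top of the n*8 mentioned in the review.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Hypothetical stand-in for print_element: writes the 32 bits of `el`
   as 8 characters (one per nibble, most significant first) into
   out[pos] .. out[pos + 7]. Hex digits are an assumption. */
static void encode_element(float el, int pos, char *out)
{
    static const char digits[] = "0123456789abcdef";
    uint32_t bits;
    memcpy(&bits, &el, sizeof bits);   /* bit-exact copy, no rounding */
    for (int k = 0; k < 8; k++)
        out[pos + k] = digits[(bits >> (28 - 4 * k)) & 0xF];
}

/* Encodes one row of n floats; `row` must hold at least n*8 + 1 bytes. */
static void encode_row(const float *a, int n, char *row)
{
    for (int j = 0; j < n; j++)
        encode_element(a[j], j * 8, row);   /* j*8 is the start of each 8-char run */
    row[n * 8] = '\0';                      /* fputs needs a terminated string */
}

/* Prints a rows x cols matrix (row-major) one row per fputs call,
   reusing a single row-sized buffer. Returns 0 on success, -1 if the
   buffer cannot be allocated. */
static int dump_matrix(FILE *stream, const float *a, int rows, int cols)
{
    char *row = malloc((size_t)cols * 8 + 1);
    if (row == NULL)
        return -1;
    for (int i = 0; i < rows; i++) {
        encode_row(a + (size_t)i * cols, cols, row);
        fputs(row, stream);   /* buffer is overwritten on the next iteration */
    }
    free(row);
    return 0;
}
```

Because the buffer is flushed with fputs inside the i loop, every element still reaches the output even though only j indexes into the buffer; a call such as dump_matrix(stderr, &A[0][0], n, n) needs room for a single row rather than the whole matrix.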
Repository:
rL LLVM
http://reviews.llvm.org/D10991