[LLVMdev] TSVC/Equivalencing-dbl

Sat Oct 6 21:44:10 PDT 2012

Shivaram,

Thanks! I'm double-checking how the arrays are initialized; I'll follow up in the next day or so.

 -Hal

----- Original Message -----
> From: "Shivarama Rao" <Shivarama.Rao at amd.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Duncan Sands" <duncan.sands at gmail.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Saturday, October 6, 2012 11:04:20 PM
> Subject: RE: [LLVMdev] TSVC/Equivalencing-dbl
> 
> Hi Hal,
> 
> To get my understanding right: is this a test-case problem, or is
> there a problem with x86 code generation? I can spend some time
> looking into the problem.
> 
> Thanks,
> Shivaram
> 
> 
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu
> [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
> Sent: Saturday, October 06, 2012 1:57 AM
> To: Duncan Sands
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] TSVC/Equivalencing-dbl
> 
> 
> 
> ----- Original Message -----
> > From: "Duncan Sands" <duncan.sands at gmail.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvmdev at cs.uiuc.edu
> > Sent: Friday, October 5, 2012 2:50:06 PM
> > Subject: Re: TSVC/Equivalencing-dbl
> > 
> > Hi Hal,
> > 
> > On 05/10/12 20:32, Hal Finkel wrote:
> > > ----- Original Message -----
> > >> From: "Duncan Sands" <duncan.sands at gmail.com>
> > >> To: "Hal Finkel" <hfinkel at anl.gov>
> > >> Cc: llvmdev at cs.uiuc.edu
> > >> Sent: Friday, October 5, 2012 12:10:03 PM
> > >> Subject: Re: TSVC/Equivalencing-dbl
> > >>
> > >> Oops, I ran the testsuite wrong: read clang output for dragonegg
> > >> output.
> > >
> > > Okay, can you resummarize? Do you mean the following?
> > >
> > > gcc -O0:
> > > S1421         0.00                 16000
> > >
> > > gcc -O0 under valgrind:
> > > S1421         0.00                 17208.404325315
> > >
> > > clang:
> > > S1421         0.00                 17208.404325315
> > 
> > Exactly.  For "clang" this happens only when building the way the
> > testsuite does (i.e. with link-time optimization + llc). If you
> > directly run:
> >    clang tsc.c dummy.c -std=gnu99 -O3
> > then you get 16000.
> > 
> > >
> > > This is all on Darwin, right?
> > 
> > No, this is on x86-64 (ubuntu) linux.
> 
> OIC, interesting!
> 
> > 
> > >
> > > I would certainly tend to suspect an 80-bit-intermediate issue,
> > > but both gcc and clang give 16000 on PowerPC (which has no
> > > 80-bit intermediates).
> > 
> > Not sure what you are saying here.  The issue is that x86 internally
> > uses 80 bits for the 64-bit (double) type, so as long as everything
> > stays in registers you get more precision, but the moment you store
> > to memory only 64 bits are stored.  The fact that gcc and clang give
> > the same result on PowerPC confirms that the difference comes from
> > x86 carrying an extra 16 bits beyond the 64 you would expect.
> > 
> > > It could be a rounding issue, but would Darwin really have a
> > > different default rounding mode?
> > 
> > As I'm seeing this on linux, I guess not :)
> > 
> > >
> > > The computation being performed here is [in s1421() in tsc.inc]:
> > >                  for (int i = 0; i < LEN/2; i++) {
> > >                          b[i] = xx[i] + a[i];
> > >                  }
> > 
> > 
> > > So *if* we're adding up the same numbers in the same order, the
> > > answer should be the same everywhere ;)
> > 
> > No, why would it be the same everywhere?  If the whole thing is done
> > in registers, an x86 processor will maintain 80 bits of precision
> > even though these are 64-bit (double) types, while if values are
> > loaded from and stored to memory at every step then only 64 bits
> > will be used.
> > This can lead to very different results.
> 
> Right.
> 
> > 
> > > Can you put in some print statements and confirm?
> > 
> > Not sure what you want me to confirm, but anyway I now have half an
> > hour to look into this some more :)
> 
> For test s1421, we have:
>                 for (int i = 0; i < LEN/2; i++) {
>                         b[i] = xx[i] + a[i];
>                 }
> 
> In this case, xx is set to the second half of the b array. The a
> array is initialized to 1/(i+1)^2. The b array, however, does not
> seem to be explicitly initialized for this test. When all of the
> tests are run in order, it is initialized for the last test in the
> previous group, s353... so maybe I screwed this up in breaking apart
> the tests.
> 
> Thanks again,
> Hal
> 
> > 
> > Ciao, Duncan.
> > 
> > >
> > > Thanks again,
> > > Hal
> > >
> > >>
> > >
> > 
> > 
> 
> --
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory

-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory