<br><br><div class="gmail_quote"><div dir="ltr">On Thu, Aug 17, 2017, 11:43 PM Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu">jan.vesely@rutgers.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, 2017-08-17 at 21:53 -0500, Aaron Watry wrote:<br>

> On Wed, Aug 16, 2017 at 6:21 PM, Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu" target="_blank">jan.vesely@rutgers.edu</a>> wrote:<br>

> > On Sat, 2017-08-12 at 13:18 -0500, Aaron Watry wrote:<br>

> > > On Fri, Aug 11, 2017 at 8:59 PM, Jan Vesely via Libclc-dev<br>

> > > <<a href="mailto:libclc-dev@lists.llvm.org" target="_blank">libclc-dev@lists.llvm.org</a>> wrote:<br>

> > > > Signed-off-by: Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu" target="_blank">jan.vesely@rutgers.edu</a>><br>

> > > > ---<br>

> > > ><br>

> > > > __builtin_amdgcn_s_waitcnt path is compile tested only. I currently<br>

> > > > don't have a machine with GCN hw and LLVM > 4<br>

> > ><br>

> > > Tested on GCN 1.0 (PITCAIRN) with LLVM 6.0 svn (from sometime last week).<br>

> > ><br>

> > > I ran the local-memory/global-memory piglit tests, and the conformance<br>

> > > basic/test_basic vload*, vstore*, and barrier tests. No change in<br>

> > > pass-rate.<br>

> > ><br>

> > > The only ones that fail are vload_private with char/uchar/short/ushort<br>

> > > data types, but those failed before your series.<br>

> ><br>

> > do you remember if those failures are new in LLVM 6? I just posted new<br>

> > vload/vstore piglit tests, and those ran OK* on carrizo/iceland system<br>

> > using LLVM 5.<br>

> ><br>

> > *mostly. vload_half is missing from libclc, so those tests failed.<br>

> > Unlike Turks, which fails ~40% of them (LLVM 6).<br>

><br>

> I don't recall running this test specifically before the 5.0rc<br>

> branch-point, so I don't know if this is a new failure.  All of the<br>

> vload tests for local/global/constant seem ok, and vload private is<br>

> working for anything that has a 32-bit gentype or larger<br>

> (int/long/float/double). It's just the vloads of<br>

> char/uchar/short/ushort that are failing.  I've been looking at the<br>

> vload code, and at least the CL C code we've got looks correct.<br>

><br>

> The CTS test allows you to tweak the number of data points that are<br>

> tested in individual threads, and curiously, when the global size goes<br>

> from 15 to 16, things start failing (at least for the char2 test).<br>

<br>

If it's intermittent I'd suspect buffer manipulation rather than GPU<br>

execution, but I might be wrong.<br>

I mostly wanted to find out whether the new piglit tests hit the issue,<br>

or the situation is more complicated.<br>

Modified vstore tests managed to uncover a failure in vstore-{u,}char-<br>

private, but that's an assertion failure, which I assume is different from<br>

what you're seeing in CTS.<br></blockquote></div><div><br></div><div>Good point about the assertions; I don't know that I'm actually running a debug+assertions build right now.  That being said, the failures aren't transient.  The piglit test I sent consistently passes, and the CTS one passes when working on ≤ 15 threads. It always fails at 16 threads or more...  I'll see if I can track it down at some point.  With libclc/clover there are enough bugs and missing features to chase down that I might just go for some other low-hanging fruit first.</div><div><br></div><div>Unfortunately I didn't get a chance to run those new tests last night. I'll let you know how it goes when I get a chance to try them.</div><div><br></div><div>--Aaron</div><div><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

<a href="http://paul.rutgers.edu/~jv356/piglit/gcn-latest-3/problems.html" rel="noreferrer" target="_blank">http://paul.rutgers.edu/~jv356/piglit/gcn-latest-3/problems.html</a><br>

<br>

><br>

> I've diffed the clover_dump.cl and .ll files for both, and they're<br>

> identical.  I copied all of the test data into a piglit test, and the<br>

> same CL code and input/output data passes in piglit... The piglit CL<br>

> and .ll files are identical to what is executed by the CTS other than<br>

> the giant piglit comment section at the top related to buffer/test<br>

> setup.  Possibly a difference in API usage, or buffer alignments, or<br>

> something.<br>

><br>

> I'm not entirely sure that the vload code is to blame here, but it's<br>

> possible. It's working just fine for global/local/constant vload<br>

> (which doesn't rule out any weirdness with private memory in GCN<br>

> having possible alignment/register-size restrictions that I'm not<br>

> aware of).<br>

<br>

VI+ GCN parts support i16 instructions, and there was a similar bug<br>

before: <a href="https://reviews.llvm.org/D30281" rel="noreferrer" target="_blank">https://reviews.llvm.org/D30281</a><br>

<br>

> I think I'm going to spend a little more time working on<br>

> making sure that the create/write/read buffer pieces in clover (and<br>

> all of our synchronization pieces) are solid before I spend too much<br>

> more time here.<br>

<br>

thanks. clover on GCN is not really high on my list. my GCN machine is<br>

set up with ROCm.<br>

Moreover, upgrading to LLVM 5 regressed both pyrite (can't select<br>

device) and bfgminer (flood of acpi errors in dmesg).<br>

<br>

><br>

> Ooh, fun fact...  I just tried to run the test on my Barts (6850/NI),<br>

> and it hung the machine on the first test.  At least my GCN just fails<br>

> the test...<br>

<br>

this looks related to the ongoing clpeak-reported bug; the kernel<br>

driver is hit or miss when it comes to recovering a hung GPU.<br>

<br>

><br>

> Anyway, I'll be out of communication for a few days.  New kid arriving<br>

> tomorrow, so I'll be buried in parents/in-laws and lacking in sleep.<br>

<br>

congratz, and no worries, there's no rush when it comes to clover :)<br>

<br>

><br>

> If you want to take a look at the .cl/ll files I used from piglit, I'm<br>

> attaching them.<br>

<br>

thanks,<br>

Jan<br>

<br>

><br>

> --Aaron<br>

><br>

> ><br>

> > regards,<br>

> > Jan<br>

> ><br>

> > [SNIP]<br>

<br>

--<br>

Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu" target="_blank">jan.vesely@rutgers.edu</a>></blockquote></div>