[Openmp-dev] Debug assert trigger in OpenMP + MPI
Joachim Protze via Openmp-dev
openmp-dev at lists.llvm.org
Fri May 22 13:11:50 PDT 2020
I looked into this issue a bit, because I ran into the same problem with a
blocked Cholesky factorization code this week. Thanks for providing this
reproducer!
I think the bookkeeping of the task queues is broken, so that under certain
conditions the head/tail marker is not updated correctly.
In addition to the assertion, I regularly see stalls in the runtime.
I'm not convinced that TCW_4 & Co. have any effect in current builds.
Therefore, I suspect that the compiler might move the accesses to the
head/tail counters out of the locked region.
For testing purposes, I added KMP_MB() after acquiring and before
releasing the lock (kmp_tasking.diff). Or should this actually
become part of the locking functions themselves (kmp_lock.diff)?
I did not run performance tests on these changes, but the latter solution
fixed my stalls as well as the spurious assertion violations.
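
To make the second variant concrete, here is a minimal C11 sketch of a
lock-protected task queue whose lock functions contain the barriers
themselves. It is not the content of the attached diffs and not the
runtime's data structure: task_deque_t, deque_lock, deque_unlock,
deque_push and deque_pop are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) only stands in for the role
that KMP_MB() is meant to play here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEQUE_SIZE 256 /* hypothetical capacity; must be a power of two */

/* Hypothetical queue; head/tail are plain fields guarded by the lock. */
typedef struct {
    atomic_flag lock;      /* simple test-and-set lock */
    unsigned head;         /* index of the oldest queued task */
    unsigned tail;         /* index one past the newest queued task */
    unsigned ntasks;       /* number of queued tasks */
    int tasks[DEQUE_SIZE]; /* task ids stand in for task pointers */
} task_deque_t;

static void deque_init(task_deque_t *d) {
    atomic_flag_clear(&d->lock);
    d->head = d->tail = d->ntasks = 0;
}

/* Acquire with relaxed ordering, then a full fence: accesses to
 * head/tail that follow cannot be moved above the acquisition. */
static void deque_lock(task_deque_t *d) {
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_relaxed)) {
        /* spin until the flag was observed clear */
    }
    atomic_thread_fence(memory_order_seq_cst);
}

/* Full fence, then release: accesses to head/tail that precede
 * cannot be moved below the release. */
static void deque_unlock(task_deque_t *d) {
    atomic_thread_fence(memory_order_seq_cst);
    atomic_flag_clear_explicit(&d->lock, memory_order_relaxed);
}

/* Append a task at the tail; returns false if the queue is full. */
static bool deque_push(task_deque_t *d, int task) {
    bool ok = false;
    deque_lock(d);
    if (d->ntasks < DEQUE_SIZE) {
        d->tasks[d->tail] = task;
        d->tail = (d->tail + 1) & (DEQUE_SIZE - 1);
        d->ntasks++;
        ok = true;
    }
    deque_unlock(d);
    return ok;
}

/* Remove the oldest task from the head; returns false if empty. */
static bool deque_pop(task_deque_t *d, int *task) {
    bool ok = false;
    deque_lock(d);
    assert(d->ntasks <= DEQUE_SIZE); /* bookkeeping sanity check */
    if (d->ntasks > 0) {
        *task = d->tasks[d->head];
        d->head = (d->head + 1) & (DEQUE_SIZE - 1);
        d->ntasks--;
        ok = true;
    }
    deque_unlock(d);
    return ok;
}
```

With the fences inside deque_lock and deque_unlock, every caller gets
the ordering automatically; the alternative is to place a barrier at
each call site, which is easier to forget.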
Best
Joachim
On 21.05.20 at 18:38, Johannes Doerfert via Openmp-dev wrote:
>
> On 5/21/20 11:12 AM, Raúl Peñacoba Veigas via Openmp-dev wrote:
>> Hello again,
>>
>> I've managed to remove MPI from the equation. It seems to be a race
>> condition in the runtime.
>>
>> int main(int argc, char **argv)
>> {
>>     int TIMESTEPS = 10;
>>     int BLOCKS = 100;
>>
>>     int nranks = 4;
>>
>>     int DATA;
>>
>>     #pragma omp parallel
>>     #pragma omp single
>>     {
>>         for (int t = 0; t < TIMESTEPS; ++t) {
>>             for (int r = 0; r < nranks; ++r) {
>>                 for (int b = 0; b < BLOCKS; ++b) {
>>                     #pragma omp task depend(in: DATA)
>>                     { }
>>                 }
>>             }
>>
>>             #pragma omp task depend(inout: DATA)
>>             { }
>>         }
>>         #pragma omp taskwait
>>     }
>> }
>>
>> To run it, execute:
>>
>> clang -fopenmp t1.c -o t1
>>
>> for i in {1..5000}; do echo $i; OMP_NUM_THREADS=3 ./t1; done
>>
> Thanks for the reproducer! We might need to file a bug report for this one,
> but maybe someone will pick it up from here; let's wait a little while :)
>
>> Regards,
>> Raúl
>>
>> On 21/5/20 at 9:57, Raúl Peñacoba Veigas wrote:
>>> Hello everyone,
>>>
>>> While writing an OpenMP + MPI code, I triggered a debug assert in
>>> __kmp_task_start:
>>>
>>> KMP_DEBUG_ASSERT(taskdata->td_flags.tasktype == TASK_EXPLICIT);
>>>
>>> I attach a simpler code that does not do anything special, along with
>>> additional info.
>>>
>>> #include <mpi.h>
>>>
>>> #include <assert.h>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <string.h>
>>> #include <unistd.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int TIMESTEPS = 10;
>>>     int BLOCKS = 100;
>>>
>>>     MPI_Init(&argc, &argv);
>>>
>>>     int rank, nranks;
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     MPI_Comm_size(MPI_COMM_WORLD, &nranks);
>>>
>>>     int DATA;
>>>
>>>     #pragma omp parallel
>>>     #pragma omp single
>>>     {
>>>         for (int t = 0; t < TIMESTEPS; ++t) {
>>>             for (int r = 0; r < nranks; ++r) {
>>>                 for (int b = 0; b < BLOCKS; ++b) {
>>>                     #pragma omp task depend(in: DATA)
>>>                     { }
>>>                 }
>>>             }
>>>
>>>             #pragma omp task depend(inout: DATA)
>>>             { }
>>>         }
>>>         #pragma omp taskwait
>>>     }
>>>
>>>     MPI_Finalize();
>>>
>>> }
>>>
>>> llvm_project debug build, commit aafdeeade8d
>>> MPICH Version: 3.3a2
>>> MPICH Release date: Sun Nov 13 09:12:11 MST 2016
>>>
>>> $ MPICH_CC=clang mpicc -fopenmp t1.c -o t1
>>> $ for i in {1..100}; do mpiexec.hydra -n 4 ./t1; done
>>>
>>>
>>
>> http://bsc.es/disclaimer
>> _______________________________________________
>> Openmp-dev mailing list
>> Openmp-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kmp_tasking.diff
Type: text/x-patch
Size: 2812 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-dev/attachments/20200522/48155906/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kmp_lock.diff
Type: text/x-patch
Size: 862 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-dev/attachments/20200522/48155906/attachment-0001.bin>