[Openmp-dev] [fortran OpenMP hangs in the last subroutine] Race condition ?

John Mellor-Crummey via Openmp-dev openmp-dev at lists.llvm.org
Tue May 24 08:02:27 PDT 2016


My advice is to compile with -g so the binary has debugging symbols, attach a debugger to the hanging process (gdb --pid the_pid_of_your_process), run the command "info threads" to see where each of your threads is, and look at the call stack for each thread to see what it is doing. Either all threads are doing something that causes them to loop infinitely, or most threads are at a barrier while one or more threads are computing indefinitely or waiting for a condition that will never be satisfied.
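
As a rough sketch of those steps, the same inspection can be scripted so it is easy to repeat each time the program hangs. The file name hang.gdb and the function name dump_threads are only illustrative, and this assumes gdb is installed and you are allowed to attach to the process:

```shell
#!/bin/sh
# Illustrative sketch: hang.gdb and dump_threads are made-up names.

# Write the gdb commands to a file so the inspection is repeatable.
cat > hang.gdb <<'EOF'
set pagination off
info threads
thread apply all bt
EOF

# Attach to a running process by PID, list its threads, print every
# thread's call stack, then exit (batch mode exits after the commands).
dump_threads() {
    gdb -p "$1" -batch -x hang.gdb
}

# Usage, once the program appears to hang:
#   dump_threads the_pid_of_your_process
```

Running it two or three times a few seconds apart should show which threads stay parked at a barrier and which ones keep moving through the same loop.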

Using a debugger to examine the problem in action and then speculating about the root causes is much better than speculating blind.
--
John Mellor-Crummey         Professor
Dept of Computer Science    Rice University
email: johnmc at rice.edu      phone: 713-348-5179

> On May 24, 2016, at 9:41 AM, David Van Aelst via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
> 
> Hello from Toulouse, France, so please excuse my sometimes poor or hesitant English. I want to parallelize a Fortran tool that post-processes finite element stress results, and it does not yet work correctly. I already posted a message on this list under the same title: [fortran OpenMP hangs in the last subroutine] <https://groups.google.com/forum/#!searchin/comp.lang.fortran/fortran$20OpenMP$20hangs$20in$20the$20last$20subroutine|>. The serial version of the tool is OK, but a run lasts five days, and the machine should be able to reduce that to half a day with OpenMP.
> The problem is that the program enters an endless loop when run on several processors, even with as little as a twentieth of the full data set that it should eventually process. But it behaves perfectly well with an even smaller data set.
> I suspect a race condition on one of the variables used by that subroutine, because it never enters that endless loop at the same step within this last subroutine.
> So, before any assignment to a variable, I would like to check that it does not use a memory location that is already used elsewhere in a previous subroutine.
> Would it be better to scan every variable already in use with the following logical function, which checks whether two pointers have the same target: C_ASSOCIATED(C_PTR_1, C_PTR_2), with C_PTR_1 pointing at the variable to test and C_PTR_2 pointing, in a loop, at each previously used variable?
> Alternatively, I could write a C function that would receive a pointer to the new variable and return its actual memory location, obtained with the "&" unary C operator, so that I could later check that memory location against all those already in use.
> I hope this is clear enough for somebody to give me some advice.
> Thank you,
> David
> 
