[Openmp-dev] Local variables and yielding of untied tasks
Joseph Schuchart via Openmp-dev
openmp-dev at lists.llvm.org
Fri Jun 1 04:12:39 PDT 2018
Thanks for your quick reply! You're right that the printed value of
flag_one_cntr should not be expected to increment continuously; I guess
I was reading more into the output than was there.
I got confused with the Intel compiler, too. I was investigating another
effect I was observing and tried to reproduce it using the Clang
compiler. Eventually that turned out to be a non-issue though.
Should I file an issue for the missing capture of private variables?
On 05/30/2018 04:47 PM, Churbanov, Andrey wrote:
> Hi Joseph,
> This looks like a bug in Clang's implementation of untied tasks: it does not capture the locations of private variables, and thus the value of "task_id" gets lost at each taskyield (where the thread exits a task in order to re-enter its continuation later on).
> Regarding the missing printed values of the shared flags, I don't see any problem here. Your code allows flag_one to be incremented several times before each print, so the number of prints should be stable, but the values can vary depending on how many increments are executed before each print statement (an implementation is allowed to postpone any task at a taskyield and execute another task, which increments the flag and can itself be postponed, etc.).
> And what problems do you see with the Intel compiler? AFAIK it always captures the location of local variables in tasks.
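The effect described above can be illustrated without OpenMP at all. The following is only a conceptual sketch and not Clang's actual code generation: it assumes an untied task is re-entered through a function that switches on a "part" number, and it compares a local kept on the C stack with one stored in a per-task frame. All names (task_frame, task_entry_lost, task_entry_captured, run_task) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-task frame that survives across re-entries. */
struct task_frame {
    int part;     /* which continuation to run on the next entry */
    int task_id;  /* private variable captured in the frame */
};

/* Each entry function returns 1 if the task yielded, 0 if it finished. */

/* Variant that keeps task_id on the stack: the value set before the
 * yield is gone when the continuation is entered. */
static int task_entry_lost(struct task_frame *f, int *counter, int *seen) {
    int task_id = 0;          /* re-initialized on every entry */
    switch (f->part) {
    case 0:
        task_id = ++*counter; /* value read before the yield */
        f->part = 1;
        return 1;             /* "taskyield": leave the task */
    default:
        *seen = task_id;      /* observes 0, not the earlier value */
        return 0;
    }
}

/* Variant that stores task_id in the frame: the value survives. */
static int task_entry_captured(struct task_frame *f, int *counter, int *seen) {
    switch (f->part) {
    case 0:
        f->task_id = ++*counter;
        f->part = 1;
        return 1;
    default:
        *seen = f->task_id;
        return 0;
    }
}

/* Hypothetical driver: re-enters the task until it finishes and returns
 * the value the task observed after its yield. */
int run_task(int (*entry)(struct task_frame *, int *, int *), int *counter) {
    struct task_frame frame = {0, 0};
    int seen = -1;
    while (entry(&frame, counter, &seen)) {
        /* a real runtime could schedule another task here */
    }
    return seen;
}
```

Calling run_task with task_entry_lost yields 0 regardless of the counter, which matches the task_id 0 lines in the report, while task_entry_captured returns the value read before the yield.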
> -----Original Message-----
> From: Openmp-dev [mailto:openmp-dev-bounces at lists.llvm.org] On Behalf Of Joseph Schuchart via Openmp-dev
> Sent: Wednesday, May 30, 2018 4:50 PM
> To: openmp-dev <openmp-dev at lists.llvm.org>
> Subject: [Openmp-dev] Local variables and yielding of untied tasks
> I am experimenting with the behavior of taskyield under different OpenMP implementations and have written a test program that creates a set of untied tasks that each contain two taskyield regions. The program makes sure that only one thread is executing the relevant branch in the tasks by trapping all but thread 0 in a loop. Before the first yield, a shared variable is incremented and read into a local variable. After the first yield, the local variable is printed and a second taskyield is executed.
> Eventually the threads are released as soon as all tasks have reached the second yield. The complete code is attached.
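Since the attachment is not reproduced here, the following is a minimal sketch of a program along the lines described, not the original test code. It assumes the other threads are trapped in a spin loop outside the tasks, and the function name run_yield_test and the helper current_thread are hypothetical. It also compiles without OpenMP, in which case the pragmas are ignored and the tasks run inline.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: thread id, or 0 when built without OpenMP. */
static int current_thread(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Creates num_tasks untied tasks on thread 0, each with two taskyield
 * regions, and returns how many tasks reached the second yield. */
int run_yield_test(int num_tasks) {
    int flag_one_cntr = 0;
    int flag_two_cntr = 0;
    assert(num_tasks > 0);

    #pragma omp parallel shared(flag_one_cntr, flag_two_cntr)
    {
        if (current_thread() == 0) {
            for (int i = 0; i < num_tasks; ++i) {
                #pragma omp task untied shared(flag_one_cntr, flag_two_cntr)
                {
                    int task_id, one, two;

                    /* Increment the shared counter and keep a local copy. */
                    #pragma omp atomic capture
                    task_id = ++flag_one_cntr;

                    #pragma omp taskyield

                    /* The local copy must still hold its value here. */
                    #pragma omp atomic read
                    one = flag_one_cntr;
                    #pragma omp atomic capture
                    two = flag_two_cntr++;
                    printf("task_id %d : flag_one_cntr %d : flag_two_cntr %d\n",
                           task_id, one, two);

                    #pragma omp taskyield
                }
            }
            #pragma omp taskwait
        } else {
            /* Trap all other threads until every task has printed. */
            int done = 0;
            while (done < num_tasks) {
                #pragma omp atomic read
                done = flag_two_cntr;
            }
        }
    }
    return flag_two_cntr;
}
```

Calling run_yield_test(8) from main and building with -fopenmp should print eight lines. With a correct implementation, task_id is never 0, although the order and the flag values depend on how the runtime treats taskyield.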
> I am observing some strange behavior with Clang 6.0 (built from release tarball, libomp built from master): the local variable contains the value zero where it should contain a value similar to flag_one_cntr:
>  task_id 0 : flag_one_cntr 1 : flag_two_cntr 0
>  task_id 0 : flag_one_cntr 3 : flag_two_cntr 1
>  task_id 0 : flag_one_cntr 4 : flag_two_cntr 1
>  task_id 0 : flag_one_cntr 5 : flag_two_cntr 3
> (notice that there appears to be no output for flag_one_cntr=2 and 7)
> Interestingly, the behavior seems correct if either of the taskyield regions is commented out. Similarly, the behavior seems correct if the tasks are tied (in which case there are no tasks to yield to). Using tied tasks and disabling the task stealing constraint (KMP_TASK_STEALING_CONSTRAINT=0) seems to provide the expected behavior:
>  task_id 257 : flag_one_cntr 257 : flag_two_cntr 0
>  task_id 256 : flag_one_cntr 257 : flag_two_cntr 1
>  task_id 255 : flag_one_cntr 257 : flag_two_cntr 2
>  task_id 254 : flag_one_cntr 257 : flag_two_cntr 3
>  task_id 253 : flag_one_cntr 257 : flag_two_cntr 4
>  task_id 252 : flag_one_cntr 257 : flag_two_cntr 5
> (the tasks last put on the stack arrive at the printf first)
> Any idea what could be going wrong here and how to debug this further? I tried several other implementations (incl. Cray, GCC, OmpSs), which seem to work correctly (albeit with different taskyield behavior). I originally found this issue with the Intel 18.0.1 compiler.
> Any help would be much appreciated!
> Best regards,