[Openmp-dev] Local variables and yielding of untied tasks

Churbanov, Andrey via Openmp-dev openmp-dev at lists.llvm.org
Thu Jun 7 09:58:52 PDT 2018


> Just out of curiosity: is there a document describing the implementation
> of OpenMP tasks in Clang?
I am not aware of any documentation on the implementation internals.

> I have so far been assuming that local variables are situated on the stack
> of the task (i.e., the stack of the thread for Clang) and upon a taskyield
> the next task is simply put into the next stack frame on top of the yielding
> task. I don't quite understand why the local variable has to be explicitly captured.
For tied tasks, keeping private variables on the stack is preferred, but for untied tasks this is problematic.
I did not follow your idea of "the next stack frame on top of the yielding task".  The current implementation simply schedules the task continuation as a separate task and exits the yielding task, so the stack is lost after that, which makes previously initialized private variables inaccessible in the task continuation. If private variables are placed in heap memory and accessed indirectly, then different threads can work with the same locations when they execute different parts of the same untied task.
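
The continuation scheme described above can be illustrated with a small plain-C simulation. This is only a sketch under stated assumptions, not Clang's generated code or libomp's data structures: task_t, part_id, privates, task_entry and run_task are hypothetical names chosen for illustration. The task body is split at its yield point, and each part runs in a fresh stack frame.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical task descriptor; the names are illustrative, not libomp's. */
typedef struct {
    int part_id;    /* which part of the task body runs next */
    int *privates;  /* heap storage for captured private variables */
} task_t;

/*
 * Task body split at its yield point. Returns 1 if the task yielded and
 * must be re-entered, 0 if it finished. observed[0] receives the value of
 * the stack copy after the yield, observed[1] the value of the heap copy.
 */
static int task_entry(task_t *t, int value, int observed[2])
{
    /* Stack copy: re-created on every entry. A real frame would hold an
     * indeterminate value; it is zeroed here to keep the sketch defined. */
    int task_id = 0;

    switch (t->part_id) {
    case 0:                           /* code before the taskyield */
        task_id = value;
        t->privates[0] = value;       /* captured copy, reached indirectly */
        t->part_id = 1;               /* remember where to resume */
        return 1;                     /* exit the task; this frame is gone */
    default:                          /* continuation after the taskyield */
        observed[0] = task_id;        /* value from before the yield is lost */
        observed[1] = t->privates[0]; /* heap copy is still there */
        return 0;
    }
}

/* Toy scheduler: re-enters the task until it finishes. Returns 0 on success. */
int run_task(int value, int observed[2])
{
    assert(observed != NULL);
    task_t t = { 0, malloc(sizeof(int)) };
    if (t.privates == NULL)
        return -1;
    while (task_entry(&t, value, observed))
        ;   /* a real runtime could pick any thread for the next part */
    free(t.privates);
    return 0;
}
```

Calling run_task(7, obs) leaves obs[0] == 0 and obs[1] == 7: the stack copy does not survive the exit and re-entry, while the heap copy does.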

Regards,
Andrey

-----Original Message-----
From: Joseph Schuchart [mailto:schuchart at hlrs.de] 
Sent: Monday, June 4, 2018 10:40 AM
To: Churbanov, Andrey <Andrey.Churbanov at intel.com>
Cc: openmp-dev <openmp-dev at lists.llvm.org>
Subject: Re: [Openmp-dev] Local variables and yielding of untied tasks

Done: https://bugs.llvm.org/show_bug.cgi?id=37671

Just out of curiosity: is there a document describing the implementation of OpenMP tasks in Clang? I have so far been assuming that local variables are situated on the stack of the task (i.e., the stack of the thread for Clang) and upon a taskyield the next task is simply put into the next stack frame on top of the yielding task. I don't quite understand why the local variable has to be explicitly captured. I guess my understanding of the implementation is grossly oversimplified and I would be grateful for some insights (I have looked at the libomp code but have not attempted to dig into the compiler side of life).
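
For reference, the pattern in question can be reduced to a few lines of C. This is a minimal sketch and not the attached test program: count_lost_privates, saved and lost are hypothetical names, and the reference copy in saved[] exists only to detect whether the task-local variable changed across the yield.

```c
#include <stdlib.h>

/* Returns how many tasks saw a different task_id after the taskyield,
 * or -1 if the reference array could not be allocated. */
int count_lost_privates(int ntasks)
{
    int counter = 0;   /* shared: incremented once per task */
    int lost = 0;      /* shared: tasks whose local changed across the yield */
    int *saved = calloc((size_t)ntasks, sizeof *saved);
    if (saved == NULL)
        return -1;

    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < ntasks; i++) {
        #pragma omp task untied firstprivate(i) shared(counter, lost, saved)
        {
            int task_id;            /* local (private) to the task */
            #pragma omp atomic capture
            task_id = ++counter;
            saved[i] = task_id;     /* reference copy in heap memory */

            /* An untied task may be resumed by a different thread here. */
            #pragma omp taskyield

            if (task_id != saved[i]) {
                #pragma omp atomic
                lost++;
            }
        }
    }
    /* All tasks have completed at the end of the parallel region. */
    free(saved);
    return lost;
}
```

A conforming implementation is expected to return 0 here. Compiled without OpenMP support, the pragmas are ignored and the loop runs serially, which also yields 0.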

Thanks,
Joseph

On 06/01/2018 09:05 PM, Churbanov, Andrey wrote:
>> Should I file an issue for the missing capture of private variables?
> Yes, filing a Bugzilla would be good in order to track the issue.
> 
> For untied tasks, lexically visible private variables should be captured, similar to what is done for shared variables in the current implementation.  Though indirect accesses can be a bit slower, correctness should be more important, I think.
> 
> Regards,
> Andrey
> 
> -----Original Message-----
> From: Joseph Schuchart [mailto:schuchart at hlrs.de]
> Sent: Friday, June 1, 2018 2:13 PM
> To: Churbanov, Andrey <Andrey.Churbanov at intel.com>
> Cc: openmp-dev <openmp-dev at lists.llvm.org>
> Subject: Re: [Openmp-dev] Local variables and yielding of untied tasks
> 
> Andrey,
> 
> Thanks for your quick reply! You're right in that the printed value of flag_one_cntr should not be expected to continuously increment, I guess I was reading more into the output than was there.
> 
> I got confused with the Intel compiler, too. I was investigating another effect I was observing and tried to reproduce it using the Clang compiler. Eventually that turned out to be a non-issue though.
> 
> Should I file an issue for the missing capture of private variables?
> 
> Cheers,
> Joseph
> 
> On 05/30/2018 04:47 PM, Churbanov, Andrey wrote:
>> Hi Joseph,
>>
>> This looks like a bug in Clang's implementation of untied tasks: it does not capture the locations of private variables, and thus the value of "task_id" gets lost at each taskyield (where the thread exits a task in order to re-enter its continuation later on).
>>
>> Regarding the missing printed values of the shared flags, I don't see any problem here.  Your code allows flag_one to be incremented several times before each print, so the number of prints should be stable, but the values can vary depending on how many increments are executed before each print statement (the implementation is allowed to postpone any task at a taskyield and execute another task, which increments the flag and can itself be postponed, etc.).
>>
>> And what problems do you see with the Intel compiler?  AFAIK it always captures the location of local variables in tasks.
>>
>> Regards,
>> Andrey
>>
>> -----Original Message-----
>> From: Openmp-dev [mailto:openmp-dev-bounces at lists.llvm.org] On Behalf
>> Of Joseph Schuchart via Openmp-dev
>> Sent: Wednesday, May 30, 2018 4:50 PM
>> To: openmp-dev <openmp-dev at lists.llvm.org>
>> Subject: [Openmp-dev] Local variables and yielding of untied tasks
>>
>> I am experimenting with the behavior of taskyield under different OpenMP implementations and have written a test program that creates a set of untied tasks that each contain two taskyield regions. The program makes sure that only one thread is executing the relevant branch in the tasks by trapping all but thread 0 in a loop. Before the first yield, a shared variable is incremented and read into a local variable. After the first yield, the local variable is printed and a second taskyield is executed.
>> Eventually the threads are released as soon as all tasks have reached the second yield. The complete code is attached.
>>
>> I am observing some strange behavior with Clang 6.0 (built from release tarball, libomp built from master): the local variable contains the value zero where it should contain a value similar to flag_one_cntr:
>>
>> ```
>> [0] task_id 0 : flag_one_cntr 1 : flag_two_cntr 0
>> [0] task_id 0 : flag_one_cntr 3 : flag_two_cntr 1
>> [0] task_id 0 : flag_one_cntr 4 : flag_two_cntr 1
>> [0] task_id 0 : flag_one_cntr 5 : flag_two_cntr 3
>> ```
>> (notice that there appears to be no output for flag_one_cntr=2 and 7)
>>
>> Interestingly, the behavior seems correct if either of the taskyield
>> regions is commented out. Similarly, the behavior seems correct if the
>> tasks are tied (in which case there are no tasks to yield to). Using
>> tied tasks and disabling the task stealing constraint
>> (KMP_TASK_STEALING_CONSTRAINT=0) seems to provide the expected behavior:
>>
>> ```
>> [0] task_id 257 : flag_one_cntr 257 : flag_two_cntr 0
>> [0] task_id 256 : flag_one_cntr 257 : flag_two_cntr 1
>> [0] task_id 255 : flag_one_cntr 257 : flag_two_cntr 2
>> [0] task_id 254 : flag_one_cntr 257 : flag_two_cntr 3
>> [0] task_id 253 : flag_one_cntr 257 : flag_two_cntr 4
>> [0] task_id 252 : flag_one_cntr 257 : flag_two_cntr 5
>> ```
>> (the tasks last put on the stack arrive at the printf first)
>>
>> Any idea what could be going wrong here and how to debug this further?
>> I tried different other implementations (incl. Cray, GCC, OmpSs) which
>> seem to work correctly (albeit with different behavior of taskyield).
>> I originally found this issue with the Intel 18.0.1 compiler.
>>
>> Any help would be much appreciated!
>>
>> Best regards,
>> Joseph
>>
> 
> --------------------------------------------------------------------
> Joint Stock Company Intel A/O
> Registered legal address: Krylatsky Hills Business Park,
> 17 Krylatskaya Str., Bldg 4, Moscow 121614,
> Russian Federation
> 
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> 

