[Openmp-dev] Local variables and yielding of untied tasks
Joseph Schuchart via Openmp-dev
openmp-dev at lists.llvm.org
Mon Jun 4 00:40:05 PDT 2018
Just out of curiosity: is there a document describing the implementation
of OpenMP tasks in Clang? I have so far been assuming that local
variables are situated on the stack of the task (i.e., the stack of the
thread for Clang) and upon a taskyield the next task is simply put into
the next stack frame on top of the yielding task. I don't quite
understand why the local variable has to be explicitly captured. I guess
my understanding of the implementation is grossly oversimplified and I
would be grateful for some insights (I have looked at the libomp code
but have not attempted to dig into the compiler side of life).
On 06/01/2018 09:05 PM, Churbanov, Andrey wrote:
>> Should I file an issue for the missing capture of private variables?
> Yes, filing a Bugzilla would be good in order to track the issue.
> For untied tasks, lexically visible private variables should be captured similarly to what is done for shared variables in the current implementation. Although indirect accesses can be a bit slower, correctness should be more important, I think.
> -----Original Message-----
> From: Joseph Schuchart [mailto:schuchart at hlrs.de]
> Sent: Friday, June 1, 2018 2:13 PM
> To: Churbanov, Andrey <Andrey.Churbanov at intel.com>
> Cc: openmp-dev <openmp-dev at lists.llvm.org>
> Subject: Re: [Openmp-dev] Local variables and yielding of untied tasks
> Thanks for your quick reply! You're right that the printed value of flag_one_cntr should not be expected to increment continuously; I guess I was reading more into the output than was there.
> I got confused with the Intel compiler, too. I was investigating another effect I was observing and tried to reproduce it using the Clang compiler. Eventually that turned out to be a non-issue though.
> Should I file an issue for the missing capture of private variables?
> On 05/30/2018 04:47 PM, Churbanov, Andrey wrote:
>> Hi Joseph,
>> This looks like a bug in Clang's implementation of untied tasks: it does not capture the locations of private variables, and thus the value of "task_id" gets lost at each taskyield (where the thread exits a task in order to re-enter its continuation later on).
>> Regarding the missing printed values of the shared flags, I don't see any problem here. Your code allows flag_one to be incremented several times before each print, so the number of prints should be stable, but the values can vary depending on how many increments are executed before each print statement (an implementation is allowed to postpone any task at a taskyield and execute another task, which increments the flag and can itself be postponed, etc.).
>> And what problems do you see with the Intel compiler? AFAIK it always captures the location of local variables in tasks.
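The point about a thread exiting an untied task at a taskyield and re-entering its continuation later can be pictured with a hand-written analogy in plain C. This is only a sketch under assumptions: `task_record`, `run_task_part` and `run_two_tasks` are hypothetical names, and the layout is not libomp's or Clang's actual generated code. It shows why a value kept only in an ordinary local is gone on re-entry, while a value stored in the task record survives.

```c
/* Hypothetical task record; not libomp's actual data layout. */
struct task_record {
    int next_part;  /* which part of the task body runs on the next entry */
    int task_id;    /* captured copy of the task's private variable */
};

/* Runs one part of the task body.
 * Returns 1 if the task "yielded", 0 if it finished. */
int run_task_part(struct task_record *t, int *counter,
                  int *printed_captured, int *printed_stack)
{
    int stack_id = 0;  /* ordinary local: fresh on every entry */

    if (t->next_part == 0) {
        stack_id = ++*counter;   /* read the shared counter */
        t->task_id = stack_id;   /* capture it in the task record */
        t->next_part = 1;
        return 1;                /* "taskyield": return to the scheduler */
    }

    /* Continuation after the yield. */
    *printed_captured = t->task_id;  /* value survived in the record */
    *printed_stack = stack_id;       /* still 0: the old frame is gone */
    return 0;
}

/* Tiny stand-in for a scheduler: start both tasks, then resume both. */
void run_two_tasks(int captured[2], int from_stack[2])
{
    struct task_record tasks[2] = {{0, 0}, {0, 0}};
    int counter = 0;

    for (int pass = 0; pass < 2; ++pass)
        for (int i = 0; i < 2; ++i)
            run_task_part(&tasks[i], &counter,
                          &captured[i], &from_stack[i]);
}
```

In this analogy the uncaptured local reads as zero after the yield, while the copy in the task record keeps the value assigned before the yield.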
>> -----Original Message-----
>> From: Openmp-dev [mailto:openmp-dev-bounces at lists.llvm.org] On Behalf
>> Of Joseph Schuchart via Openmp-dev
>> Sent: Wednesday, May 30, 2018 4:50 PM
>> To: openmp-dev <openmp-dev at lists.llvm.org>
>> Subject: [Openmp-dev] Local variables and yielding of untied tasks
>> I am experimenting with the behavior of taskyield under different OpenMP implementations and have written a test program that creates a set of untied tasks that each contain two taskyield regions. The program makes sure that only one thread is executing the relevant branch in the tasks by trapping all but thread 0 in a loop. Before the first yield, a shared variable is incremented and read into a local variable. After the first yield, the local variable is printed and a second taskyield is executed.
>> Eventually the threads are released as soon as all tasks have reached the second yield. The complete code is attached.
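A much-reduced sketch of the pattern described above (not the attached program) might look like the following. It keeps only the essentials: untied tasks, a shared counter read into a task-local variable, and two taskyield regions. The trapping of threads other than thread 0 and the second counter are left out, and `NUM_TASKS` and `count_lost_task_ids` are hypothetical names chosen for illustration.

```c
#include <stdio.h>

#define NUM_TASKS 16  /* hypothetical task count, chosen for illustration */

/* Returns how many tasks saw an implausible task_id after the first yield. */
int count_lost_task_ids(void)
{
    int flag_one = 0;  /* shared counter, incremented before the first yield */
    int lost = 0;      /* number of tasks whose local value went missing */

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < NUM_TASKS; ++i) {
            #pragma omp task untied shared(flag_one, lost)
            {
                int task_id;  /* local to the task */

                /* Increment the shared counter and read it into the local. */
                #pragma omp atomic capture
                task_id = ++flag_one;

                #pragma omp taskyield

                /* The local must still hold the value read before the yield. */
                if (task_id < 1 || task_id > NUM_TASKS) {
                    #pragma omp atomic
                    ++lost;
                }
                printf("task_id %d\n", task_id);

                #pragma omp taskyield
            }
        }
    }
    return lost;
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC or Clang), a conforming implementation should return 0 here; without OpenMP the pragmas are ignored and the tasks simply run in sequence.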
>> I am observing some strange behavior with Clang 6.0 (built from release tarball, libomp built from master): the local variable contains the value zero where it should contain a value similar to flag_one_cntr:
>> ```
>> task_id 0 : flag_one_cntr 1 : flag_two_cntr 0
>> task_id 0 : flag_one_cntr 3 : flag_two_cntr 1
>> task_id 0 : flag_one_cntr 4 : flag_two_cntr 1
>> task_id 0 : flag_one_cntr 5 : flag_two_cntr 3
>> ```
>> (notice that there appears to be no output for flag_one_cntr=2 and 7)
>> Interestingly, the behavior seems correct if either of the taskyield
>> regions is commented out. Similarly, the behavior seems correct if the
>> tasks are tied (in which case there are no tasks to yield to). Using
>> tied tasks and disabling the task stealing constraint
>> (KMP_TASK_STEALING_CONSTRAINT=0) seems to provide the expected behavior:
>> ```
>> task_id 257 : flag_one_cntr 257 : flag_two_cntr 0
>> task_id 256 : flag_one_cntr 257 : flag_two_cntr 1
>> task_id 255 : flag_one_cntr 257 : flag_two_cntr 2
>> task_id 254 : flag_one_cntr 257 : flag_two_cntr 3
>> task_id 253 : flag_one_cntr 257 : flag_two_cntr 4
>> task_id 252 : flag_one_cntr 257 : flag_two_cntr 5
>> ```
>> (the tasks last put on the stack arrive at the printf first)
>> Any idea what could be going wrong here and how to debug this further?
>> I tried different other implementations (incl. Cray, GCC, OmpSs) which
>> seem to work correctly (albeit with different behavior of taskyield).
>> I originally found this issue with the Intel 18.0.1 compiler.
>> Any help would be much appreciated!
>> Best regards,