[Openmp-dev] Runtime error when executing multiple target regions within a target data region

Joachim Protze via Openmp-dev openmp-dev at lists.llvm.org
Fri Jul 13 08:32:43 PDT 2018


On 07/13/2018 01:34 PM, Churbanov, Andrey wrote:
>> The arrays a and b are not mapped in the target region
> I am not an expert in offloading constructs, but what I see from the spec is that the arrays
> should be mapped in all target regions because of the outer "target data" region.
> Correct me if I'm wrong here.
> 
>> Why does libomp have difficulties spawning teams after a while?
> I doubt we can create 256K threads, so the warning about this looks acceptable.
> We also have an internal limit on the number of teams created - the available number of procs.
> This can be overridden by the KMP_TEAMS_THREAD_LIMIT environment variable.
> Though it is unclear to me why those zillions of threads are needed. Even if we were able
> to create 256K threads on 48 procs, that is more than 5400 threads per proc -
> so the huge oversubscription should cause awful performance of the test.

As Jonas already added, I was trying to offload to a GPU. To saturate 
the pipeline, I was trying to split the work into small pieces.

I was not complaining about the message coming from the host runtime 
when falling back to the host. The problem is that the execution falls 
back to the host at all.
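
For the archives, this is roughly the pattern I am describing. It is 
only a minimal sketch; N, TEAMS and the array names a and b are 
placeholders here, the real code is in the attachment of my original 
mail:

  #include <stdio.h>
  #include <stdlib.h>

  #define N     100000   /* placeholder problem size */
  #define TEAMS 256      /* teams requested per target region */

  int main(void) {
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));

    /* Map the arrays once and keep them resident on the device ... */
    #pragma omp target data map(tofrom: a[0:N], b[0:N])
    {
      for (int it = 0; it < 10; ++it) {
        /* ... and launch one small target region per piece of work.
           a and b carry no map clause here; per OpenMP 4.5 they should
           be treated as zero-length array sections and reuse the data
           already present on the device. */
        #pragma omp target teams distribute parallel for num_teams(TEAMS)
        for (int i = 0; i < N; ++i)
          a[i] += b[i];

        printf("%d\n", it);
      }
    }

    free(a);
    free(b);
    return 0;
  }

Once the fallback is hit, the host-side limit Andrey mentions can be 
adjusted via KMP_TEAMS_THREAD_LIMIT, but that only affects the warning, 
not the fact that we end up on the host in the first place.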

> 
> And the syntax of the test looks a bit broken to me, because "distribute" or "parallel for"
> or "distribute parallel for" should be followed by a loop, while here it is followed by
> a compound statement.

You are right: for correct OpenMP code, I should remove the extra curly 
brackets around the for-loop.
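
Concretely, combining that syntax fix with the explicit zero-length map 
that Jonas suggests below, the inner construct would look roughly like 
this (same placeholder names as in the sketch above, not the actual 
attachment):

  /* The directive is followed directly by the for-loop (no extra
     braces), and the pointers are mapped explicitly as zero-length
     sections as a workaround until Clang handles the implicit rule. */
  #pragma omp target teams distribute parallel for \
          num_teams(TEAMS) map(a[0:0], b[0:0])
  for (int i = 0; i < N; ++i)
    a[i] += b[i];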

> 
> Regards,
> Andrey
> 
> -----Original Message-----
> From: Openmp-dev [mailto:openmp-dev-bounces at lists.llvm.org] On Behalf Of Jonas Hahnfeld via Openmp-dev
> Sent: Friday, July 13, 2018 1:25 PM
> To: Joachim Protze <protze.joachim at gmail.com>
> Cc: Openmp-dev <openmp-dev at lists.llvm.org>
> Subject: Re: [Openmp-dev] Runtime error when executing multiple target regions within a target data region
> 
> Hi Joachim,
> 
> from internal discussion I know that you are targeting NVPTX and not
> x86_64 (aka "host offloading") - which is important information in this case.
> 
> The arrays a and b are not mapped in the target region, so according to OpenMP 4.5, section 2.15.5 on page 215:
>> A variable that is of type pointer is treated as if it had appeared in
>> a map clause as a zero-length array section.
> 
> However, Clang currently doesn't seem to do that; manually adding map(a[0:0], b[0:0]) leads to the expected output:
> 256, 992, 0
> 0
> 256, 992, 0
> 1
> 256, 992, 0
> 2
> 256, 992, 0
> 3
> 256, 992, 0
> 4
> 256, 992, 0
> 5
> 256, 992, 0
> 6
> 256, 992, 0
> 7
> 256, 992, 0
> 8
> 256, 992, 0
> 9
> So the application really had been executing the fallback code on the host before.
> (Note that your commented combined directive works fine with the standalone data directives, because it actually has the map-clause.)
> 
> Follow-up questions:
> 1) Why doesn't the fallback printf as well?
> 2) Why does libomp have difficulties spawning teams after a while?
> 
> Cheers,
> Jonas
> 
> On 2018-07-13 11:50, Joachim Protze via Openmp-dev wrote:
>> Hi all,
>>
>> we experience strange errors when we try to launch multiple target
>> regions within a data region; see the attached code. The result when
>> using unstructured data mapping is similar. We are using clang built
>> from trunk this week.
>>
>> When we map the data for each iteration (as in line 23), the whole
>> code runs to completion. When we use a larger value for TEAMS, the
>> execution falls back to the host in an earlier iteration (for 1024 in
>> the second iteration instead of the 7th as shown below).
>>
>> So there seems to be an issue with the allocation of teams when the
>> data region stays open. Any ideas how this can be fixed?
>>
>> Best,
>> Joachim
>>
>>
>> Output when running the attached code (num_teams, thread_limit,
>> is_initial_device):
>>
>> 256, 992, 0
>> 0
>> 256, 992, 0
>> 1
>> 256, 992, 0
>> 2
>> 256, 992, 0
>> 3
>> 256, 992, 0
>> 4
>> 256, 992, 0
>> 5
>> 256, 992, 0
>> OMP: Warning #96: Cannot form a team with 256 threads, using 48
>> instead.
>> OMP: Hint Consider unsetting KMP_DEVICE_THREAD_LIMIT
>> (KMP_ALL_THREADS), KMP_TEAMS_THREAD_LIMIT, and OMP_THREAD_LIMIT (if
>> any are set).
>> 48, 2147483647, 1
>> 6
>> 48, 2147483647, 1
>> 7
>> 48, 2147483647, 1
>> 8
>> 48, 2147483647, 1
>> 9
>>


