[cfe-dev] [OpenMP]: wrong threadprivate storage name in task reduction

Raúl Peñacoba via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 8 04:47:13 PST 2018


Hi,

I did a little more research and realized that the patch I attached 
doesn't fix the problem completely; it only fixes where the data is 
stored. When a task reduction uses a VLA, or a class that uses omp_orig 
to initialize the private copies, the size of the VLA and the pointer to 
omp_orig are kept in storage obtained through 
__kmpc_threadprivate_cached().
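
To make the storage mismatch from my original report (quoted below) concrete, here is a small stand-alone model. It is only a sketch: slotName(), storeSize() and loadSize() are hypothetical helpers, the naming scheme is invented for illustration, and the thread_local map merely stands in for what __kmpc_threadprivate_cached() provides.

```cpp
#include <map>
#include <string>

// Hypothetical model: a helper slot is identified by a name built from a
// prefix and a source-location number. This is NOT clang's real scheme.
std::string slotName(const std::string &prefix, unsigned loc) {
  return prefix + "_" + std::to_string(loc);
}

// Stand-in for per-thread cached storage (assumption, not the runtime API).
thread_local std::map<std::string, long> slots;

// Writer side: the outlined task function stores the array size.
void storeSize(unsigned loc, long size) {
  slots[slotName("reduction_size", loc)] = size;
}

// Reader side: the reduction init/combine function looks the size up.
// Returns -1 when nothing was ever stored under that name.
long loadSize(unsigned loc) {
  auto it = slots.find(slotName("reduction_size", loc));
  return it == slots.end() ? -1 : it->second;
}
```

If the writer and the reader build the name from different locations, the reader simply misses the stored value, which is the kind of mismatch the patch works around by reusing one location.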

The remaining problem is that, in the outlined parallel function, each 
reduction has the lazy_priv flag set, so the initialization is done in 
__kmpc_task_reduction_get_th_data(), which is called from the outlined 
task function. That call happens _before_ the VLA size or omp_orig is 
stored, so the initialization is wrong in these cases.

I think that if the stores are emitted before the call to 
__kmpc_task_reduction_get_th_data(), it should work.
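
The ordering issue can be sketched with a tiny model. Everything here is an assumption made for illustration: vlaSizeSlot stands in for the threadprivate slot, and lazyInitPrivateCopy() stands in for the lazy initialization that the runtime performs; none of it is actual clang or runtime code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the threadprivate slot holding the VLA length.
static std::size_t vlaSizeSlot = 0;

// Stand-in for the lazy initializer: it sizes the private copy from
// whatever the slot holds at the moment it runs.
std::vector<int> lazyInitPrivateCopy() {
  return std::vector<int>(vlaSizeSlot, 0);
}

// Failing order: the private copy is initialized first and the size is
// stored afterwards, so the initializer sees an unset slot.
std::size_t initThenStore(std::size_t n) {
  vlaSizeSlot = 0; // slot not yet written for this task
  std::vector<int> priv = lazyInitPrivateCopy();
  vlaSizeSlot = n;
  return priv.size();
}

// Proposed order: store the size first, then let the initializer run.
std::size_t storeThenInit(std::size_t n) {
  vlaSizeSlot = n;
  return lazyInitPrivateCopy().size();
}
```

In this model initThenStore(8) produces an empty private copy, while storeThenInit(8) produces one with 8 elements.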

Anyway, this implementation of task reductions is not fully functional, 
as they are only allowed if the task is created in the scope of a 
taskgroup.
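
For reference, a minimal sketch of the pattern that is allowed, namely an in_reduction task created inside the taskgroup that carries the matching task_reduction, over an array section whose length is only known at run time. The function name countWithTaskReduction is made up for this mail, and this is not the exact test case I used.

```cpp
#include <vector>

// Adds 1 to each of `n` counters in `rounds` tasks, using a taskgroup
// reduction over an array section with a non-constant length.
std::vector<int> countWithTaskReduction(int n, int rounds) {
  std::vector<int> storage(n, 0);
  int *a = storage.data();
#pragma omp parallel
#pragma omp single
  {
#pragma omp taskgroup task_reduction(+ : a[0:n])
    {
      for (int r = 0; r < rounds; ++r) {
#pragma omp task in_reduction(+ : a[0:n])
        {
          // Inside the task, `a` refers to the task's private copy.
          for (int i = 0; i < n; ++i)
            a[i] += 1;
        }
      }
    }
  }
  return storage;
}
```

With a correct implementation every element should end up equal to rounds. When built without OpenMP support the pragmas are ignored and the loop runs sequentially with the same result.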

Sorry for the duplicate mail.

Regards,

Raúl

On 06/03/18 at 16:21, Alexey Bataev wrote:
>
> Hi, thanks for the report, I'll take a look.
>
> -------------
> Best regards,
> Alexey Bataev
> 06.03.2018 4:25, Raúl Peñacoba via cfe-dev wrote:
>> Hi,
>>
>> I've noticed that clang does not work properly when using task 
>> reductions with non-constant arrays and/or classes with omp_orig 
>> constructor initializers.
>>
>> As the reduction initialize/combine/finalize functions only receive 
>> references to whatever should be initialized, combined, etc., clang 
>> generates some additional storage to save the size of the reduction 
>> array and a pointer to the omp_orig. That storage is obtained 
>> through __kmpc_threadprivate_cached() calls in both the reduction 
>> functions and the outlined task function.
>>
>> The problem is that the storage used in the outlined task function 
>> does not match the storage used in the reduction functions. This 
>> happens in programs that have a task or taskloop with the 
>> in_reduction clause.
>>
>> To solve this I made an ugly patch that reuses the SourceLocation of 
>> CodeGenFunction::EmitOMPTaskgroupDirective() in 
>> CodeGenFunction::EmitOMPTaskBasedDirective() to match the storage 
>> and make it work. Anyway, I suppose that there is a better way to fix 
>> this.
>>
>> Index: CGStmtOpenMP.cpp
>> ===================================================================
>> --- CGStmtOpenMP.cpp    (revision 326685)
>> +++ CGStmtOpenMP.cpp    (working copy)
>> @@ -2713,6 +2713,8 @@
>> emitEmptyBoundParameters);
>>  }
>>
>> +static SourceLocation reductionLoc;
>> +
>>  void CodeGenFunction::EmitOMPTaskBasedDirective(
>>      const OMPExecutableDirective &S, const OpenMPDirectiveKind CapturedRegion,
>>      const RegionCodeGenTy &BodyGen, const TaskGenTy &TaskGen,
>> @@ -2952,7 +2954,7 @@
>>          // FIXME: This must removed once the runtime library is fixed.
>>          // Emit required threadprivate variables for
>>          // initilizer/combiner/finalizer.
>> -        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, S.getLocStart(),
>> +        CGF.CGM.getOpenMPRuntime().emitTaskReductionFixups(CGF, reductionLoc,
>>                                                             RedCG, Cnt);
>>        }
>>      }
>> @@ -3180,8 +3182,9 @@
>>            std::advance(IRHS, 1);
>>          }
>>        }
>> +      reductionLoc = S.getLocStart();
>>        llvm::Value *ReductionDesc =
>> -          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, S.getLocStart(),
>> +          CGF.CGM.getOpenMPRuntime().emitTaskReductionInit(CGF, reductionLoc,
>>                                                             LHSs, RHSs, Data);
>>        const auto *VD = cast<VarDecl>(cast<DeclRefExpr>(E)->getDecl());
>>        CGF.EmitVarDecl(*VD);
>>
>> Regards,
>>
>> Raúl
>>
>>
>> http://bsc.es/disclaimer
>>
>


