[PATCH] [OPENMP] Codegen for 'if' clause
Bataev, Alexey
a.bataev at hotmail.com
Wed Sep 17 22:17:42 PDT 2014
Hello John, Thanks you very much for the review! Here are my answers.
> I apologize for the outrageous delays; we’ve been a little busy.
Oh, don't worry. I was just afraid that you're completely done with the
clang because of swift. :-)
> The preconditions on this method aren’t clear. Does it *assume* that there’s
> only one clause or does it *check*? Right now, it seems to be assuming that
> it doesn’t have more than one, but checking that it doesn’t have zero, which
> is weird.
The precondition is quite simple: there must be exactly 0 or 1 clause of
the specified kind. This method returns the single clause if it exists.
Otherwise, if there is no such clause, it just returns nullptr.
> Elsewhere in the code, you’re conditionally checking whether the
> result is null. Is there a language rule that you can only have one if clause?
This method is required for clauses llike "if", "num_threads" etc. These
clauses may be specified only once per directive, but also they can be
skipped, if they are not required. So, directives may have 0 or 1
instance of such clauses.
> Just dereference the iterator here. Your assert can then increment the
> original iterator, and you don’t have to worry about copying the iterator in
> non-asserts builds.
Done, thanks.
> This subclass should go in a separate, OpenMP-specific header.
Ok, moved to CGOpenMPRuntime.h
> You seem to be inconsistent about the capitalization here, and on reflection,
> honestly, I think this abbreviation is completely illegible. And I’m not sure
> that calling it the “global thread ID” is actually meaningful when it’s clearly
> not global in any real sense.
>
> Let’s take everywhere you’ve got “Gtid” or “GTid”, including in the code
> already committed, and rename this to “ThreadID”. So this would be:
> const VarDecl *getThreadIDVariable() const { return ThreadIDVar; }
>
Ok, done.
> Also, this would be a great place to put a comment explaining what this
> variable actually is. For example, that it’s a variable of type ident_t*.
I'll modify the comment. The type of this variable is kmp_int32, i.e.
just int32.
> If this is an invariant of the CGOpenMPRegionInfo, shouldn’t it be asserted
> on in that structure, maybe in its constructor?
Moved to constructor.
> You have this logic repeated several times. It would be reasonable
> to have a
> LValue getThreadIDVariableLValue(CodeGenFunction &CGF);
> helper on the region info.
>
> Also, you seem to vary between using the type of the variable and assuming
> that ident_t == int32_t. It’s probably better to trust the type of the variable
> consistently.
>
Created this function, replaced the code by function call.
The type of ThreadID is int32_t, not ident_t. ident_t is a structure
that has info about debug location and some internal flags, ThreadID is
just a 32-bit integer. The variable is declared only in outlined
parallel region.
If the construct is not in parallel region (for example, there is a
standalone '#omp atomic' or other non-parallel directive), this variable
does not exist and we have to create a temp with type int32_t for this var.
> Please add a comment describing the intended behavior of this function and
> how its result will be used. The rule seems to be “if we’re inside a captured
> statement, use the region info’s thread-ID variable and ignore the thread ID
> we just computed; otherwise, stash that thread ID in a temporary and return
> the address of that”. That is a very strange rule. Describing *why* it does
> that would be invaluable to people trying to maintain this in the future.
>
Ok, I'll add the description.
> I’m not sure what rule you’re following for what goes in CGOpenMP vs.
> CGOpenMPRuntime. All of the code you’re emitting here is very
> runtime-specific. I think you could very easily abstract out a virtual
> interface for doing runtime-specific logic, GCXXABI-style. e.g. “emit
> a function for this captured statement”, “run this captured statement
> serially”, “run this captured statement in parallel”, etc. All of that could
> be in CGOpenMPRuntime.cpp, and you could keep the general logic for
> interpreting down statements in CGOpenMP.cpp.
I'll look at GCXXABI and will try do my best to improve CodeGen for
OpenMP, thanks.
> It would be better for this to not be templated and take a std::function.
Will do.
Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team
17.09.2014 12:36, John McCall пишет:
> On Jul 29, 2014, at 10:29 PM, Alexey Bataev <a.bataev at hotmail.com> wrote:
>> Hi doug.gregor, hfinkel, rjmccall, fraggamuffin, ejstotzer, rsmith,
>>
>> Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
>> If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
>> If condition evaluates to false, the code executes serial version of the code by executing the following code:
>>
>> global_thread_id.addr = alloca i32
>> store i32 global_thread_id, global_thread_id.addr
>> zero.addr = alloca i32
>> store i32 0, zero.addr
>> __kmpc_serialized_parallel(loc, global_thread_id);
>> microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/);
>> __kmpc_end_serialized_parallel(loc, global_thread_id);
>>
>> Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.
>>
>> Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
> I apologize for the outrageous delays; we’ve been a little busy.
>
> + /// \brief Gets single clause of the specified kind \a K associated with the
> + /// current directive iff there is only one clause of this kind.
>
> +const OMPClause *
> +OMPExecutableDirective::getSingleClause(OpenMPClauseKind K) const {
>
> The preconditions on this method aren’t clear. Does it *assume* that there’s
> only one clause or does it *check*? Right now, it seems to be assuming that
> it doesn’t have more than one, but checking that it doesn’t have zero, which
> is weird.
>
> Elsewhere in the code, you’re conditionally checking whether the
> result is null. Is there a language rule that you can only have one if clause?
>
> + auto ClauseFilter =
> + [=](const OMPClause *C) -> bool { return C->getClauseKind() == K; };
> + OMPExecutableDirective::filtered_clause_iterator<decltype(ClauseFilter)> I(
> + clauses(), ClauseFilter);
> +
> + if (I) {
> + auto PrevI = I;
>
> Just dereference the iterator here. Your assert can then increment the
> original iterator, and you don’t have to worry about copying the iterator in
> non-asserts builds.
>
> Index: lib/CodeGen/CodeGenFunction.h
> ===================================================================
> --- lib/CodeGen/CodeGenFunction.h
> +++ lib/CodeGen/CodeGenFunction.h
> @@ -224,6 +224,8 @@
> /// \brief Get the name of the capture helper.
> virtual StringRef getHelperName() const { return "__captured_stmt"; }
>
> + static bool classof(const CGCapturedStmtInfo *) { return true; }
> +
> private:
> /// \brief The kind of captured statement being generated.
> CapturedRegionKind Kind;
> @@ -238,6 +240,27 @@
> /// \brief Captured 'this' type.
> FieldDecl *CXXThisFieldDecl;
> };
> +
> + /// \brief API for captured statement code generation in OpenMP constructs.
> + class CGOpenMPRegionInfo : public CGCapturedStmtInfo {
>
> This subclass should go in a separate, OpenMP-specific header.
>
> + /// \brief Gets a variable or parameter for storing global thread id
> + /// inside OpenMP construct.
> + const VarDecl *getGTidVariable() const { return GTidVar; }
>
> You seem to be inconsistent about the capitalization here, and on reflection,
> honestly, I think this abbreviation is completely illegible. And I’m not sure
> that calling it the “global thread ID” is actually meaningful when it’s clearly
> not global in any real sense.
>
> Let’s take everywhere you’ve got “Gtid” or “GTid”, including in the code
> already committed, and rename this to “ThreadID”. So this would be:
> const VarDecl *getThreadIDVariable() const { return ThreadIDVar; }
>
> Also, this would be a great place to put a comment explaining what this
> variable actually is. For example, that it’s a variable of type ident_t*.
>
> + } else if (auto OMPRegionInfo =
> + dyn_cast_or_null<CodeGenFunction::CGOpenMPRegionInfo>(
> + CGF.CapturedStmtInfo)) {
> + assert(OMPRegionInfo->getGTidVariable() != nullptr &&
> + "No GTid variable for OpenMP region.”);
>
> If this is an invariant of the CGOpenMPRegionInfo, shouldn’t it be asserted
> on in that structure, maybe in its constructor?
>
> + auto GTidVar = OMPRegionInfo->getGTidVariable();
> + auto LVal = CGF.MakeNaturalAlignAddrLValue(
> + CGF.GetAddrOfLocalVar(GTidVar),
> + CGF.getContext().getPointerType(GTidVar->getType()));
>
> You have this logic repeated several times. It would be reasonable
> to have a
> LValue getThreadIDVariableLValue(CodeGenFunction &CGF);
> helper on the region info.
>
> Also, you seem to vary between using the type of the variable and assuming
> that ident_t == int32_t. It’s probably better to trust the type of the variable
> consistently.
>
> +static llvm::Value *EmitGTidAddress(CodeGenFunction &CGF, llvm::Value *GTid) {
>
> Please add a comment describing the intended behavior of this function and
> how its result will be used. The rule seems to be “if we’re inside a captured
> statement, use the region info’s thread-ID variable and ignore the thread ID
> we just computed; otherwise, stash that thread ID in a temporary and return
> the address of that”. That is a very strange rule. Describing *why* it does
> that would be invaluable to people trying to maintain this in the future.
>
> In general, this patch badly needs descriptive comments.
>
> static void EmitOMPSerialCall(CodeGenFunction &CGF,
> + const OMPParallelDirective &S,
> + llvm::Value *OutlinedFn,
> + llvm::Value *CapturedStruct) {
>
> I’m not sure what rule you’re following for what goes in CGOpenMP vs.
> CGOpenMPRuntime. All of the code you’re emitting here is very
> runtime-specific. I think you could very easily abstract out a virtual
> interface for doing runtime-specific logic, GCXXABI-style. e.g. “emit
> a function for this captured statement”, “run this captured statement
> serially”, “run this captured statement in parallel”, etc. All of that could
> be in CGOpenMPRuntime.cpp, and you could keep the general logic for
> interpreting down statements in CGOpenMP.cpp.
>
> +template <class CodeGenTy>
> +static void EmitConditionalCode(CodeGenFunction &CGF, const Expr *Cond,
> + CodeGenTy CodeGen) {
>
> It would be better for this to not be templated and take a std::function.
>
> John.
More information about the cfe-commits
mailing list