[Polly] Fix parallelism check for reductions
Johannes Doerfert
jdoerfert at codeaurora.org
Mon Jul 14 17:09:54 PDT 2014
r213019.
"Final comments" inlined.
>> // The node is parallel only if reductions are ignored or
>> privatized.
>>
>> Reductions don't make a loop parallel, no?
They can, or what would you say to:
for (i)
sum += A[i];
>> > /// @brief The type of the dependences.
>> > ///
>> > - /// Reduction dependences are seperated because they can be ignored
>> > during
>> > - /// the scheduling. This is the case since the order in which the
>> > reduction
>> > - /// statements are executed does not matter. However, if they are
>> > executed
>> > - /// in parallel we need to take additional measures (e.g.,
>> > privatization)
>> > - /// to ensure a correct result.
>> > + /// Reduction dependences are seperated from RAW/WAW/WAR
>> > + dependences because
>>
>> separated
Fixed.
>> > + /// we can ignore them during the scheduling. This is the case
>> > + since the order /// in which the reduction statements are executed
>> > + does not matter. However, /// if they are executed in parallel we
>> > + need to take additional measures /// (e.g, privatization) to ensure
>> > + a correct result. The (reverse) transitive /// closure of the
>> > + reduction dependences are used to check for parallel /// executed
reduction statements during code generation.
>> > + ///
>> > + /// FIXME: As soon as the code generation can handle reductions we
can get
>> > + /// rid of the current TYPE_RED dependences and rename the
TYPE_TC_RED
>> > + /// to TYPE_RED.
>>
>> I don't get this FIXME. What is needed in the code generation to remove
TC_RED? (You can just remove it, if you don't want to discuss this now).
I remove the fixme but the idea is that we won't need the current TYPE_RED
anymore.
>> > enum Type {
>> > // Write after read
>> > - TYPE_WAR = 0x1,
>> > + TYPE_WAR = 1 << 0,
>> >
>> > // Read after write
>> > - TYPE_RAW = 0x2,
>> > + TYPE_RAW = 1 << 1,
>> >
>> > // Write after write
>> > - TYPE_WAW = 0x4,
>> > + TYPE_WAW = 1 << 2,
>> >
>> > // Reduction dependences
>> > - TYPE_RED = 0x8,
>> > + TYPE_RED = 1 << 3,
>> > +
>> > + // Transitive closure of the reduction dependences (& the
>> > + reverse)
>>
>> Please describe how they look like, not how they are computed. Also, we
should mention here that thy are cyclic and possibly reversed.
Done in the comment above the enum.
>> > void Dependences::addPrivatizationDependences() {
>> > - isl_union_map *PrivRAW, *PrivWAW, *PrivWAR, *TransClosure;
>> > -
>> > - // The transitive closure might be over approximated but we only
>> > use it to
>> > - // compute the privatization dependences. Thus, overapproximation
>> > will lead
>> > - // "only" to more conservative privatization dependences.
>> > - // FIXME: Take precautions to ensure only forward dependences are
created.
>> > - TransClosure =
>> > isl_union_map_transitive_closure(isl_union_map_copy(RED), 0);
>> > + isl_union_map *PrivRAW, *PrivWAW, *PrivWAR;
>> > +
>> > + // The transitive closure might be over approximated, thus could
>> > + lead to // dependency cycles in the privatization dependences. To
>> > + make sure this // will not happen we remove all negative
>> > + dependences after we computed // the transitive closure.
>> > + TC_RED = isl_union_map_transitive_closure(isl_union_map_copy(RED),
>> > + 0);
>> > +
>> > + // FIXME: Apply the current schedule instead of assuming the
identity schedule
>> > + // here. The current approach is only valid as long as we
compute the
>> > + // dependences only with the initial (identity schedule). Any
other
>> > + // schedule could change "the direction of the backward
depenendes" we
>> > + // want to eliminate here.
>> > + isl_union_set *UDeltas =
>> > + isl_union_map_deltas(isl_union_map_copy(TC_RED));
>> > + isl_union_set *Universe =
>> > + isl_union_set_universe(isl_union_set_copy(UDeltas));
>> > + isl_union_set *Zero =
>> > + isl_union_set_empty(isl_union_set_get_space(Universe));
>> > + isl_union_set_foreach_set(Universe, fixSetToZero, &Zero);
>> > + isl_union_map *NonPositive = isl_union_set_lex_le_union_set(UDeltas,
>> > + Zero);
>>
>> Did you face any issues with applying the schedule? Or you just decided
to leave it for later.
I didn't look into it to be honest,... however I would like to change it
soon (I think it would help I want to introduce later on).
>> > +/// @brief Check if the current scheduling dimension is parallel
>> > +///
>> > +/// In case the dimension is parallel we also check if any reduction
>> > +/// dependences is broken when we exploit this parallelism. If so,
>> are
>> > +/// @p IsReductionParallel will be set to true. The reduction
dependences we use
>> is set
>> > +/// to check are actually the union of the transitive closure of the
initial
>> > +/// reduction dependences together with their reveresal. Even though
these
>> > +/// dependences connect all iterations with each other (thus they are
cyclic)
>> > +/// we can perform the parallelism check as we are only interested in
a zero
>> > +/// (or non-zero) dependence distance on the dimension in question.
>>
>> Otherwise looks good to me. Feel free to commit.
Nice.
Best regards,
Johannes
--
Johannes Doerfert
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation
More information about the llvm-commits
mailing list