[Polly] Fix parallelism check for reductions

Johannes Doerfert jdoerfert at codeaurora.org
Wed Jul 9 16:32:12 PDT 2014


Hey,

We talked about this today and while I think it is correct I would like a
last comment here.
This patch basically is combined out of the [Fix] we discussed earlier and
the initial (and I think "accepted") isl ast patch.
I also added comments to parts which might be confusing or need attention in
the future.

What do you guys think?

Best regards,
  Johannes 

--

Johannes Doerfert
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation


-----Original Message-----
From: Tobias Grosser [mailto:tobias at grosser.es] 
Sent: Thursday, June 26, 2014 10:38 PM
To: Johannes Doerfert; llvm-commits at cs.uiuc.edu
Cc: 'Sebastian Pop'; 'zino'
Subject: Re: [Polly] Fix parallelism check for reductions

On 27/06/2014 01:47, Johannes Doerfert wrote:
> Thanks for the comments on this one, I added some myself again.
>
> Summarized I think we still need to figure out what a correct and good 
> way of doing this (but I think we are close).

Yes we are close.

>>> On 26/06/2014 02:24, Doerfert, Johannes wrote:
>>>> Hey Tobi,
>>>>
>>>> This is a patch we need to discuss in detail.
>>>> I basically tried to fix the loop reversal and modulo problems we
> discussed
>>>> earlier.
>>>>
>>>> Do you see any problems whit this way of check for broken dependences?
>>>
>>> I have some comments, but feel free to ignore what you believe is
> unnecessary.
> I'm not sure what's the best way to do this yet. My proposed way (TC 
> and RTC
> dependences) perform pretty well so far (I cannot find a schedule 
> which would not be parallelized because of overaproximation and I'm 
> confident they will always detect the need for privatization).


>>>> +    // Transitive closure of the reduction dependences
>>>> +    TYPE_TC_RED = 1 << 4,
>>>> +
>>>> +    // Reversed transitive closure of the reduction dependences
>>>> +    TYPE_RTC_RED = 1 << 5,
>>>
>>> Why do you expose these dependences at all?  I personally would have
> stored the
>>> union of those two in TYPE_RED and would have removed the negative
> dependences
>>> from TYPE_RED any time getDependences is called (using the schedule 
>>> to
> determin
>>> what negative means).
> I store them for now because TYPE_RED and TYPE_*TC_RED do not serve 
> the same
> purpose:
>    TYPE_RED is still used to force sequential execution since the code 
> generation
>             is not ready.
>    TYPE_*TC_RED is used to determine if parallel execution needs to be 
> augmented
>                 by privatization.
>
>>>> +/// @brief Fix all dimension of @p Zero to 0 and add it to @p user 
>>>> +static int fixSetToZero(__isl_take isl_set *Zero, void *user) {
>>>> +  isl_union_set **User = (isl_union_set **)user;
>>>> +  for (unsigned i = 0; i < isl_set_dim(Zero, isl_dim_set); i++)
>>>> +    Zero = isl_set_fix_si(Zero, isl_dim_set, i, 0);
>>>> +  *User = isl_union_set_add_set(*User, Zero);
>>>> +  return 0;
>>>> +}
>>>> +
>>>>    /// @brief Compute the privatization dependences for a given
> dependency @p
>>>>    Map
>>>>    ///
>>>>    /// Privatization dependences are widened original dependences 
>>>> which originate @@ -132,14 +141,27 @@ void 
>>>> Dependences::collectInfo(Scop &S, isl_union_map **Read,
>>>>    ///       S1[i0] -> S2[] : i0 >= 0 and i0 <= 1023}
>>>>    ///   RED:
>>>>    ///     { S1[i0] -> S1[1 + i0] : i0 >= 0 and i0 <= 1022 }
>>>> +///
>>>> +/// Note: This function also computes the (reverse) transitive 
>>>> +closure
> of the
>>>> +///       reduction dependences.
>>>>    void Dependences::addPrivatizationDependences() {
>>>> -  isl_union_map *PrivRAW, *PrivWAW, *PrivWAR, *TransClosure;
>>>> +  isl_union_map *PrivRAW, *PrivWAW, *PrivWAR;
>>>>
>>>>      // The transitive closure might be over approximated but we 
>>>> only
> use it to
>>>>      // compute the privatization dependences. Thus, 
>>>> overapproximation will lead
>>>> -  // "only" to more conservative privatization dependences.
>>>> -  // FIXME: Take precautions to ensure only forward dependences 
>>>> are
> created.
>>>> -  TransClosure =
>>>> isl_union_map_transitive_closure(isl_union_map_copy(RED), 0);
>>>> +  // "only" to more conservative privatization dependences. 
>>>> + However, we take  // precautions to ensure only forward 
>>>> + dependences are
> created.
>>>> +  TC_RED = 
>>>> + isl_union_map_transitive_closure(isl_union_map_copy(RED),
>>>> + 0);
>>>> +
>>>> +  isl_union_set *UDeltas =
>>>> + isl_union_map_deltas(isl_union_map_copy(TC_RED));
>>>> +  isl_union_set *Universe =
>>>> + isl_union_set_universe(isl_union_set_copy(UDeltas));
>>>> +  isl_union_set *Zero =
>>>> + isl_union_set_empty(isl_union_set_get_space(Universe));
>>>> +  isl_union_set_foreach_set(Universe, fixSetToZero, &Zero);
>>>
>>> As the schedule space is a isl_set, but not a union_set you may be 
>>> able
> to avoid
>>> this helper function.
> Is it a isl_set? Maybe I am confused right now but when I run 
> test/Dependences/reduction_multiple_reductions.ll I get:
> isl_union_set_dump(UDeltas);
>    { [Stmt_if_else[i0] -> MemRef_prod[0]] : i0 >= 1 and i0 <= 511; 
> [Stmt_if_then[i0] -> MemRef_sum[0]] : i0 >= 1 and i0 <= 511 } 
> isl_union_set_dump(Zero);
>    { [Stmt_if_else[0] -> MemRef_prod[0]]; [Stmt_if_then[0] -> 
> MemRef_sum[0]] }
> and:
>    isl_set_from_union_set(Zero);
> crashes.
>
> Did I understand you wrong?

I mean the space into which the schedule maps is a single space. Those are
the dependences to which the schedule was not yet applied.

>>>> +  isl_union_map *NonPositive =
>>>> + isl_union_set_lex_le_union_set(UDeltas, Zero);
>>>
>>> You do not involve the schedule here? Only after applying the 
>>> schedule
> you can
>>> know if dependences are negative in time.
> You mentioned that before but I still don't see why. I will try to 
> come up with a formal reason why it is right, but I don't think it is 
> ever "incorrect".
>
> Correctness [Do we __always__ privatize when we have to?]:
>    The *TC_RED dependences basically connect all reduction iterations 
> with each
>    other making it imposible to schedule any two at the same time without
>    violating at least one of these dependences.
>
> Tell me what you think about that "argument".

Right. This is more about removing negative dependences here. Is this not
what the lex_le_set is for? Maybe I misunderstood, but what is a negative
dependency for you. Can you explain?

>>>> @@ -371,7 +393,7 @@ bool 
>>>> Dependences::isParallelDimension(__isl_take
>>>> isl_set *ScheduleSubset,
>>>>
>>>>      // FIXME: We can remove ignore reduction dependences in case 
>>>> we
> privatize
>>>>      the
>>>>      //        memory locations the reduction statements reduce into.
>>>> -  Deps = getDependences(TYPE_ALL | TYPE_RED);
>>>> +  Deps = getDependences(TYPE_RAW | TYPE_WAW | TYPE_WAR | 
>>>> + TYPE_RED);
>>>
>>> Why do you use TYPE_RED here and not the TC_RED dependences?
>>>
>>> (I know having two parallelism checks is a bad idea)
> Because I did only ensure the isl codegeneration (In IslAst.h) does 
> "the right thing" while we annotate the isl ast optimistically with 
> "pragma * reduction".
>
> The above code is used by the cloog codegen. I use TYPE_RED to ensure 
> sequential execution (as explained earlier).

Did you test this with CLooG. I am afraid it may also introduce parallelism.

>>>> -  isl_union_map *RedDeps = 
>>>> D->getDependences(Dependences::TYPE_RED);
>>>> +  isl_union_map *RedDeps =
>>>> +      D->getDependences(Dependences::TYPE_TC_RED | 
>>>> + Dependences::TYPE_RTC_RED);
>>>
>>> Does this work if we reverse the dependences?
>>>
>>>    "schedule" : "[n] -> { Stmt_S0[i0] -> scattering[0, -i0, 0]: i0 % 
>>> 2 =
> 0;
>>>    Stmt_S0[i0] -> scattering[2, -i0, 0]: i0 % 2 = 1 }"
> It does, but wasn't that also one of the examples?
>
> I added some more possible schedules which all look good to me.
>
> The test cases combine loop reversals, "modulo schedules" and multiple 
> dimensions (some of them do not carry reduction dependences).

Good. Thanks for checking.

Tobias

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Annotate-reduction-parallel-loops-in-the-IslAst-text.patch
Type: application/octet-stream
Size: 77378 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140709/e29f9155/attachment.obj>


More information about the llvm-commits mailing list