Reductions [patch 2 & 3]

jdoerfert at codeaurora.org jdoerfert at codeaurora.org
Sun Jun 15 14:45:23 PDT 2014


Comments inlined.

--

Johannes Doerfert
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

>>>>> +static cl::opt<bool, true>
>>>>> +XEnableReductions("polly-reductions",
>>>>> +                  cl::desc("Relax the reduction dependences before
>> scheduling"),
>>>>> +                  cl::Hidden, cl::location(polly::EnableReductions),
>>>>> +                  cl::init(false), cl::ZeroOrMore,
>>>>> +cl::cat(PollyCategory)); bool polly::EnableReductions = false;
>>>>
>>>> Should we enable this by default? Do we even need an option?
>> Without an updated code generation I would like to keep it under a flag.
>
> Why? This should not cause any correctness issues so enabling it gives
> us test coverage. Also, the schedule optimizer may already benefit from
> the increased freedom due to removing the reduction dependences.
It will affect correctness as soon as the vectorizer or parallelizer is
used, furthermore we need "privatization dependences" before we can safely
use it in a sequential environment. As soon as it is safe to enable it
(maybe only in sequential scenarios) I will.

>>>>> +  if (EnableReductions) {
>>>>> +    // Aggregate all possible reduction dependences (self
>>>>> dependences
>> on
>>>>> +    // reduction like statements). Then intersect them with the
>>>>> actual
>> RAW and
>>>>> +    // WAW dependences the get the actual reduction dependences.
>>>>
>>>> Can there be reduction WAW dependences? Should not all such
>>>> dependences
>> be RAW?
>> There are actually "more" WAW then RAW dependences.
>
> Yes, I just verified this myself and it seems both WAW and RAW
> dependences are computed.
See my last mail.

>  > Even for unary
>> reductions there will always be WAW dependences (but then no RAWs).
>> It should not be necessary to intersect them with RAW, my bad.
>
> In my previous mail I was thinking of intersecting them with the union
> of WAW and RAW dependences. Why is there no need to look at the RAW
> dependences at all?
I talked about it in the last mail. I looked at RAW only because I wanted
it to be restricted to binary reductions, I will from now work with both
(unary + binary) and look only at WAW dependences.

>>>> Also, we need to add dependences between S1 and all S2 as well as S2
>>>> and
>> all S3, no? Any idea how to do this?
>>>>
>>>> S1: A[0] = 0;
>>>>
>>>>       for (i)
>>>> S2:    A[0 += i;
>>>>
>>>> S3: A[0] *= 10;
>> Yes, I call/called them privatization dependences. I know how to do this
>> and
>> a patch is ready. However it is easier to do it on memory access level,
>> thus
>> it is included in the next group of patches.
>
> Why are they called privatization dependences? Even without
> privatization removing reduction dependences gives the scheduler more
> freedom. So they don't seem to be related to privatization.
Because they represent the dependences manual privatization would
introduce if performed at the outer most possible place (after the last
write and before the next access to the same location).

> Also, I think we should fix this in this very patch, as otherwise the
> patch is not self contained because the computed dependences are wrong.
Without a proper codegen they are wrong when parallel code generation is
enabled anyway, however I try to add them here too when I change the small
things.

>>>>>    void ScopStmt::checkForReduction() { diff --git
>>>>> a/test/Reductions/simplest_binary_only_reduction_like.ll
>>>>> b/test/Reductions/simplest_binary_only_reduction_like.ll
>>>>> new file mode 100644
>>>>> index 0000000..528d847
>>>>> --- /dev/null
>>>>> +++ b/test/Reductions/simplest_binary_only_reduction_like.ll
>>>>
>>>> Can we add those tests to test/Dependences?
>> I'm confused. I was going to (as you see already here) to reuse the
>> reduction tests for a couple of runs.
>> This would allow quite a bit of testing with less test cases.
>> Furthermore,
>> when we tailor for different kinds of
>> Reductions it is interesting to see not only the detection but also the
>> dependences and effects on the schedule.
>> I moved the first test cases to ScopInfo (as you said) and now I can
>> move
>> them to test/Dependences by duplicating them,
>> however I do not see the benefit of that.
>
> Yes, there are two competing goals. Not to duplicate test cases and to
> be able
> to run the test cases for a certain pass (e.g. just scop detection).
>
> LLVM in general and Polly itself have per-pass test cases. This
> introduces some redundancy in the test cases, but on the other side
> different passes normally require different test cases.
>
>> You have to tell me how you want to arrange these new test cases,
>> duplicate
>> them or use them only for one kind of test.
>> In the latter case, please also tell me how to decide for which.
>
> Please add per-pass test cases. For the dependence analysis changes,
> there is no need to duplicate all the ScopInfo tests that check e.g. the
> strict/relaxed floating point semantics. You can most likely copy one of
> the ScopInfo test cases to ensure we get the trivial reduction right.
Ok.

> And then this test case:
>
> S1: A[0] = 0;
>
>       for (i)
> S2:    A[0 += i;
>
> S3: A[0] *= 10;
I will.

>>>> Adding such a test case would be nice as well:
>>>>
>>>> S1: A[0] = 0;
>>>>
>>>>       for (i)
>>>> S2:    A[0 += i;
>>>>
>>>> S3: A[0] *= 10;
>> I was going to send the next set of patches (including one which adds
>> privatization dependences) as soon as we discussed (and accepted) these.
>> There will be a lot more test cases including ones similar to the one
>> above.
>
> As said above, if you have such a test case already please add it here
> (this is a two patch review, I am currently confused in which patch we
> are. Can you post the next versions of these patches in separate mails)
Ok.

>>>>>    class IslAstInfo : public ScopPass { diff --git
>>>>> a/include/polly/Dependences.h b/include/polly/Dependences.h index
>>>>> d44a905..00b18ee 100755
>>>>> --- a/include/polly/Dependences.h
>>>>> +++ b/include/polly/Dependences.h
>>>>> @@ -56,6 +56,9 @@ public:
>>>>>        // Write after write
>>>>>        TYPE_WAW = 0x4,
>>>>>
>>>>> +    // Reduction dependences
>>>>> +    TYPE_RED = 0x8,
>>>>> +
>>>>
>>>> This should be part of the previous patch, no?
>> There was no need to do that, this patch introduces the first "user" of
>> the
>> reduction dependences. Do I have to move it?
>
> If you prefer to keep it here, that's fine with me.
Ok.

>>>>>        // All dependences
>>>>>        TYPE_ALL = (TYPE_WAR | TYPE_RAW | TYPE_WAW)
>>>>
>>>> So reduction dependences are not part of all dependences?
>> Yes, so far.
>
>>>>   I can see that we do not want to look at them by default, but
>>>> calling
>> something all and not including them is surprising.
>> I think it still fits:
>> These are ALL dependences we have to fulfill during the scheduling and
>> we
>> have to consider for parallelism checks.
>> These are NOT ALL we need when we check if we need to privatize (and
>> because
>> privatization dependences are currently missing).
>
> I think it would be good to find some kind of classification of this
> following in some way your comments. However, we need to be careful to
> not talk about specific use cases, but to identify properties that make
> the classification clear.
>
> I don't have any good suggestions here. If you don't have any either,
> maybe just adding a comment explaining why ALL is not always ALL is
> enough to get this patch committed.
I will.

>>>>> @@ -151,22 +156,17 @@ static void freeIslAstUser(void *Ptr) {
>>>>>    // dimension if it is a subset of a map with equal values for the
>> current
>>>>>    // dimension.
>>>>>    static bool astScheduleDimIsParallel(__isl_keep isl_ast_build
>>>>> *Build,
>>>>> -                                     Dependences *D) {
>>>>> -  isl_union_map *Schedule, *Deps;
>>>>> +                                     __isl_take isl_union_map *Deps)
>>>>> + {  isl_union_map *Schedule;
>>>>>      isl_map *ScheduleDeps, *Test;
>>>>>      isl_space *ScheduleSpace;
>>>>>      unsigned Dimension, IsParallel;
>>>>>
>>>>> -  if (!D->hasValidDependences()) {
>>>>> -    return false;
>>>>> -  }
>>>>> -
>>>>>      Schedule = isl_ast_build_get_schedule(Build);
>>>>>      ScheduleSpace = isl_ast_build_get_schedule_space(Build);
>>>>>
>>>>>      Dimension = isl_space_dim(ScheduleSpace, isl_dim_out) - 1;
>>>>>
>>>>> -  Deps = D->getDependences(Dependences::TYPE_ALL);
>>>>>      Deps = isl_union_map_apply_range(Deps,
>> isl_union_map_copy(Schedule));
>>>>>      Deps = isl_union_map_apply_domain(Deps, Schedule);
>>>>>
>>>>> @@ -192,6 +192,20 @@ static bool astScheduleDimIsParallel(__isl_keep
>> isl_ast_build *Build,
>>>>>      return IsParallel;
>>>>>    }
>>>>>
>>>>> +static bool astScheduleDimIsParallel(__isl_keep isl_ast_build
>>>>> *Build,
>>>>> +                                     Dependences *D, bool
>>>>> +&IsReductionParallel) {
>>>>
>>>> Maybe a brief comment on top of the function.
>> I'll copy the other one :)
>
> Make sure to not add unnecessary redundancy.
Ok.

>>>>> +  if (!D->hasValidDependences())
>>>>> +    return false;
>>>>> +
>>>>> +  isl_union_map *Deps = D->getDependences(Dependences::TYPE_ALL);
>>>>> +  if (!astScheduleDimIsParallel(Build, Deps))
>>>>> +    return false;
>>>>> +  isl_union_map *RedDeps = D->getDependences(Dependences::TYPE_RED);
>>>>> +  if (!astScheduleDimIsParallel(Build, RedDeps))
>>>>> +    IsReductionParallel = true;
>>>>> +  return true;
>>>>> +}
>>>>
>>>> Did you think of using C++11 tuples for the two return values? (We
>>>> need
>> to check if the visual studio version LLVM targets supports them).
>> I don't know about C++11 tuples, but where do you want to use them
>> anyway?
>
> You are returning multiple values in astScheduleDimIsParallel(). Using
> tuples may make the code easier to understand:
>
> http://blog.paphus.com/blog/2012/07/25/tuple-and-tie/
>
> I don't have a strong opinion here. Please judge yourself if it is useful.
I will look into it.

>>>>> diff --git a/test/Reductions/simplest_binary_reduction_w_constant.ll
>>>>> b/test/Reductions/simplest_binary_reduction_w_constant.ll
>>>>> index c59e84f..722f7b3 100644
>>>>> --- a/test/Reductions/simplest_binary_reduction_w_constant.ll
>>>>> +++ b/test/Reductions/simplest_binary_reduction_w_constant.ll
>>>>
>>>> Make these test cases in the ast generator. We may duplicate certain
>> codes, but on the other side we can have minimal test cases for the
>> different use cases. Here we may e.g. want to later add more complex
>> loop
>> nests of scheduling choices which are not relevant to ScopInfo.
>> These partially answers my comment about test cases above, but I would
>> like
>> to know if I duplicate all or only certain test cases (and if so which).
>
> Only certain test cases. Precisely you copy or create test cases that
> test the features you are currently committing. One test case per item
> you want to test.
Ok.




More information about the llvm-commits mailing list