[llvm] 30463bc - [SLP]Do not count perfect diamond matches for gathers several times.

Alexey.Bataev via llvm-commits llvm-commits at lists.llvm.org
Thu May 20 10:25:40 PDT 2021


Hi, thanks for the report. Yes, I will revert it and investigate the
cause of the problem.

-------------
Best regards,
Alexey Bataev

On 5/20/2021 12:03 PM, Alexander Kornienko wrote:
> We see performance regressions after this patch. A number of
> benchmarks regressed by more than 10%. One example is flops-6.c
> from the LLVM test-suite. An isolated test based on that benchmark:
>
> $ cat flops-6.c
> extern int printf (const char *__restrict __format, ...);
> double T[36];
> double sa,sb,sc,sd,one,two;
> double four,piref;
> double scale;
> double A1 = -0.1666666666671334;
> double A2 = 0.833333333809067E-2;
> double A3 = 0.198412715551283E-3;
> double A4 = 0.27557589750762E-5;
> double A5 = 0.2507059876207E-7;
> double A6 = 0.164105986683E-9;
> double B1 = -0.4999999999982;
> double B2 = 0.4166666664651E-1;
> double B3 = -0.1388888805755E-2;
> double B4 = 0.24801428034E-4;
> double B5 = -0.2754213324E-6;
> double B6 = 0.20189405E-8;
> int main()
> {
>    double s,u,v,w,x;
>    long loops;
>    register long i, m, n;
>    printf("\n");
>    printf("   FLOPS C Program (Double Precision), V2.0 18 Dec 1992\n\n");
>    loops = 15625;
>    piref = 3.14159265358979324;
>    one = 1.0;
>    two = 2.0;
>    four = 4.0;
>    scale = one;
>    printf("   Module     Error        RunTime      MFLOPS\n");
>    printf("                            (usec)\n");
>    m = loops*10000;
>    x = piref / ( four * (double)m );
>    s = 0.0;
>    v = 0.0;
>    for( i = 1 ; i <= m-1 ; i++ )
>    {
>    u = (double)i * x;
>    w = u * u;
>    v = u * ((((((A6*w+A5)*w+A4)*w+A3)*w+A2)*w+A1)*w+one);
>    s = s + v*(w*(w*(w*(w*(w*(B6*w+B5)+B4)+B3)+B2)+B1)+one);
>    }
>    u = piref / four;
>    w = u * u;
>    sa = u*((((((A6*w+A5)*w+A4)*w+A3)*w+A2)*w+A1)*w+one);
>    sb = w*(w*(w*(w*(w*(B6*w+B5)+B4)+B3)+B2)+B1)+one;
>    sa = sa * sb;
>    sa = x * ( sa + two * s ) / two;
>    sb = 0.25;
>    sc = sa - sb;
>    printf("     6   %13.4lf  %10.4lf  %10.4lf\n",
>           sc* 1e-30,
>           0* 1e-30 ,
>           0* 1e-30);
>    return 0;
> }
> $ clang-base -O3 -maes -m64 -mcx16 -msse4.2 -mpclmul '-mprefer-vector-width=128' flops-6.c -o flops-6-base
> $ clang-new -O3 -maes -m64 -mcx16 -msse4.2 -mpclmul '-mprefer-vector-width=128' flops-6.c -o flops-6-new
> $ for i in $(seq 5) ; do time ./flops-6-base ; done
>      6          0.0000      0.0000      0.0000
>
> real    0m0.705s
> user    0m0.700s
> sys     0m0.004s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.706s
> user    0m0.704s
> sys     0m0.001s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.706s
> user    0m0.705s
> sys     0m0.001s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.706s
> user    0m0.704s
> sys     0m0.001s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.707s
> user    0m0.705s
> sys     0m0.001s
> $ for i in $(seq 5) ; do time ./flops-6-new ; done
>      6          0.0000      0.0000      0.0000
>
> real    0m0.899s
> user    0m0.898s
> sys     0m0.000s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.899s
> user    0m0.898s
> sys     0m0.000s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.900s
> user    0m0.899s
> sys     0m0.000s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.899s
> user    0m0.898s
> sys     0m0.000s
>      6          0.0000      0.0000      0.0000
>
> real    0m0.899s
> user    0m0.898s
> sys     0m0.000s
>
> Can you take a look at this and maybe revert in the meantime?
>
> Thanks!
>
> -- Alex
>
> On Mon, May 10, 2021 at 4:10 PM Alexey Bataev via llvm-commits
> <llvm-commits at lists.llvm.org> wrote:
>
>
>     Author: Alexey Bataev
>     Date: 2021-05-10T07:08:07-07:00
>     New Revision: 30463bc3f1839e8a238be4c137e2356f3cca2771
>
>     URL: https://github.com/llvm/llvm-project/commit/30463bc3f1839e8a238be4c137e2356f3cca2771
>     DIFF: https://github.com/llvm/llvm-project/commit/30463bc3f1839e8a238be4c137e2356f3cca2771.diff
>
>     LOG: [SLP]Do not count perfect diamond matches for gathers several
>     times.
>
>     Need to remove the old code for avoiding double counting of the gather
>     nodes with perfect diamond matches within the tree after we started
>     detecting perfect/shuffled matching in the previous patch D100495. We
>     may skip the cost for such nodes completely.
>
>     Differential Revision: https://reviews.llvm.org/D102023
>
>     Added:
>
>
>     Modified:
>         llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
>         llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll
>
>     Removed:
>
>
>
>     ################################################################################
>     diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
>     index 22e090fd1d7c..e656b189c779 100644
>     --- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
>     +++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
>     @@ -4233,27 +4233,6 @@ InstructionCost BoUpSLP::getTreeCost() {
>        for (unsigned I = 0, E = VectorizableTree.size(); I < E; ++I) {
>          TreeEntry &TE = *VectorizableTree[I].get();
>
>     -    // We create duplicate tree entries for gather sequences that have multiple
>     -    // uses. However, we should not compute the cost of duplicate sequences.
>     -    // For example, if we have a build vector (i.e., insertelement sequence)
>     -    // that is used by more than one vector instruction, we only need to
>     -    // compute the cost of the insertelement instructions once. The redundant
>     -    // instructions will be eliminated by CSE.
>     -    //
>     -    // We should consider not creating duplicate tree entries for gather
>     -    // sequences, and instead add additional edges to the tree representing
>     -    // their uses. Since such an approach results in fewer total entries,
>     -    // existing heuristics based on tree size may yield different results.
>     -    //
>     -    if (TE.State == TreeEntry::NeedToGather &&
>     -        std::any_of(std::next(VectorizableTree.begin(), I + 1),
>     -                    VectorizableTree.end(),
>     -                    [TE](const std::unique_ptr<TreeEntry> &EntryPtr) {
>     -                      return EntryPtr->State == TreeEntry::NeedToGather &&
>     -                             EntryPtr->isSame(TE.Scalars);
>     -                    }))
>     -      continue;
>     -
>          InstructionCost C = getEntryCost(&TE);
>          Cost += C;
>          LLVM_DEBUG(dbgs() << "SLP: Adding cost " << C
>
>     diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll
>     index 31c63d31f4df..57db62ace206 100644
>     --- a/llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll
>     +++ b/llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll
>     @@ -10,7 +10,7 @@ target triple = "aarch64--linux-gnu"
>      ; REMARK-LABEL: Function: gather_multiple_use
>      ; REMARK:       Args:
>      ; REMARK-NEXT:    - String: 'Vectorized horizontal reduction with cost '
>     -; REMARK-NEXT:    - Cost: '-16'
>     +; REMARK-NEXT:    - Cost: '-7'
>      ;
>      ; REMARK-NOT: Function: gather_load
>
>
>
>
>     _______________________________________________
>     llvm-commits mailing list
>     llvm-commits at lists.llvm.org
>     https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
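
For readers following the diff above, the check that the patch removes from
BoUpSLP::getTreeCost() can be pictured with a small standalone model. This is
only a sketch under stated assumptions: ToyEntry, costSkippingDuplicates and
costOfAllEntries are hypothetical names, the scalar IDs and costs are made up,
and nothing here is LLVM code or an explanation of the reported regression. It
only shows how skipping a gather entry that has a later identical gather entry
changes the summed cost compared with summing every entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tree entry (hypothetical; not BoUpSLP::TreeEntry).
struct ToyEntry {
  bool NeedToGather;        // true for gather (build vector) entries
  std::vector<int> Scalars; // made-up IDs standing in for scalar values
  int Cost;                 // made-up per-entry cost
};

// Models the removed check: a gather entry is skipped when a later gather
// entry has the same scalars, so each duplicated sequence is counted once.
int costSkippingDuplicates(const std::vector<ToyEntry> &Tree) {
  int Cost = 0;
  for (std::size_t I = 0, E = Tree.size(); I < E; ++I) {
    const ToyEntry &TE = Tree[I];
    if (TE.NeedToGather &&
        std::any_of(Tree.begin() + I + 1, Tree.end(),
                    [&TE](const ToyEntry &Other) {
                      return Other.NeedToGather &&
                             Other.Scalars == TE.Scalars;
                    }))
      continue;
    Cost += TE.Cost;
  }
  return Cost;
}

// Models the loop after the patch: every entry's cost is added, and any
// discount for duplicates has to come from the per-entry cost itself.
int costOfAllEntries(const std::vector<ToyEntry> &Tree) {
  int Cost = 0;
  for (const ToyEntry &TE : Tree)
    Cost += TE.Cost;
  return Cost;
}
```

With one vectorized entry of cost -5 and two identical gather entries of cost
2 each, the first function counts the gather once and the second counts it
twice. In the real patch the per-entry cost is expected to account for perfect
diamond matches, per the commit message; the toy costs do not model that.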