[PATCH] D101844: [MicroBenchmarks] Add initial loop vectorization benchmarks.

Fri May 14 15:54:45 PDT 2021

Meinersbur added a comment.

I tried out the patch myself. It consistently completed in about 37s, i.e. about one second per benchmark.

Number of benchmarks: 36.
Google Benchmark MinTime default: 0.5s
Google Benchmark maximum walltime: 2.5s
That is, the total runtime ranges between 18s and 1.5 minutes.
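
A minimal sketch of the arithmetic behind those bounds. The function names are hypothetical, and it assumes every benchmark takes at least MinTime and at most the maximum walltime:

```cpp
// Back-of-the-envelope bounds for the total runtime of the benchmark suite.
// Assumption: each benchmark runs for at least MinTime and at most the
// maximum walltime quoted above. Names are illustrative only.
constexpr double minTotalSeconds(int NumBenchmarks, double MinTimeSeconds) {
  return NumBenchmarks * MinTimeSeconds;
}

constexpr double maxTotalSeconds(int NumBenchmarks, double MaxWallSeconds) {
  return NumBenchmarks * MaxWallSeconds;
}

// 36 benchmarks x 0.5s = 18s lower bound.
static_assert(minTotalSeconds(36, 0.5) == 18.0, "lower bound is 18s");
// 36 benchmarks x 2.5s = 90s = 1.5 minutes upper bound.
static_assert(maxTotalSeconds(36, 2.5) == 90.0, "upper bound is 90s");
```

The observed 37s falls inside this range.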

Judging from the reported minutes-long benchmark time, it seems that a single iteration took longer than the default 2.5s maximum wall-clock time, in which case Google Benchmark cannot gather stable statistics. This seems to be a good argument for reducing N, which would give Google Benchmark more leeway for statistics. In contrast to e.g. LoopInterchange, N does not need to be large enough for the cache hierarchy to matter. Even on my system (Intel x86_64), it only runs about 100 iterations per benchmark, which seems low. Note that autovec and novec are approximately equally fast. `-ffast-math` finishes without error.

There is precedent: MemFunctions runs an even larger number of microbenchmarks without being guarded by `TEST_SUITE_BENCHMARKING_ONLY`, so the guard might not be necessary here either.

================
Comment at: MicroBenchmarks/LoopVectorization/MathFunctions.cpp:39-41
+  std::unique_ptr<T[]> A(new T[N]);
+  std::unique_ptr<T[]> B(new T[N]);
+  std::unique_ptr<T[]> C(new T[N]);
----------------
```
/home/meinersbur/src/llvm-test-suite/MicroBenchmarks/LoopVectorization/MathFunctions.cpp:39:8: error: no member named 'unique_ptr' in namespace 'std'
  std::unique_ptr<T[]> A(new T[N]);
  ~~~~~^
```
(with libstdc++)

`#include <memory>` should be added.
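
A minimal, self-contained sketch of the allocation pattern with the missing header in place. `makeZeroedArray` is a hypothetical helper used only for illustration and is not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <memory> // Declares std::unique_ptr; do not rely on transitive includes.

// Hypothetical helper (not in the patch): allocates N value-initialized
// elements, mirroring the `std::unique_ptr<T[]> A(new T[N]);` lines above.
template <typename T>
std::unique_ptr<T[]> makeZeroedArray(std::size_t N) {
  assert(N > 0 && "expected a non-empty array");
  // `new T[N]()` value-initializes, so arithmetic types start at zero.
  return std::unique_ptr<T[]>(new T[N]());
}
```

Some standard library implementations pull in `<memory>` through other headers, which would explain why the code can build in one setup and fail in another (here, with libstdc++).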

Repository:
  rT test-suite

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101844/new/

https://reviews.llvm.org/D101844