[libc-commits] [PATCH] D111554: [libc] automemcpy README and main include file

Clement Courbet via Phabricator via libc-commits libc-commits at lists.llvm.org
Tue Oct 12 02:50:06 PDT 2021


courbet added inline comments.


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:1
+//===-- Pod structs to describe a memory function----------------*- C++ -*-===//
+//
----------------
This is supposed to include the filename: https://llvm.org/docs/CodingStandards.html#file-headers


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:23
+
+#define COMPARABLE_AND_HASHABLE(T, ...)                                        \
+  inline auto asTuple() const { return std::tie(__VA_ARGS__); }                \
----------------
Do you also want to add a static_assert for PODness ?


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:26
+  bool operator==(const T &O) const { return asTuple() == O.asTuple(); }       \
+  bool operator<(const T &O) const { return asTuple() < O.asTuple(); }         \
+  struct Hasher {                                                              \
----------------
Can you comment on what the comparison is used for ?


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:33
+
+static constexpr int kMaxSize = INT_MAX;
+
----------------
doc ?


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:35
+
+enum class AlignArg { _1, _2, ARRAY_SIZE };
+
----------------
doc ?


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:52
+// if(size == 1) return Handle<1>();
+struct Contiguous {
+  SizeSpan Span;
----------------
Should the naming be `ContiguousStrategy`, `OverlapStrategy`, ... ?

I'm seeing this as describing a strategy to be applied to a given size range.


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:58-59
+
+// This struct represents a range of sizes over which to use overlapping
+// strategies. An overlapping strategy of size N handles all sizes from N to
+// 2xN. Thespan may represent several contiguous overlaps.
----------------
"an overlapping strategy" ?


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:60
+// strategies. An overlapping strategy of size N handles all sizes from N to
+// 2xN. Thespan may represent several contiguous overlaps.
+// e.g. with Span = {16, 128};
----------------
"The span"


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:80
+
+// Same as Loop but starts by aligning a buffer on Alignement.
+struct AlignedLoop {
----------------
How ? (I mean which strategy)


================
Comment at: libc/benchmarks/automemcpy/FunctionDescriptor.h:117-118
+// every details but is enough to uniquely identify the implementation. The
+// fields are ordered by priority. i.e. If several spans handle a particular
+// size, the first one handles it.
+struct FunctionDescriptor {
----------------
Should that be an error ? The Z3 model should not be generating these given the constraints, right ?


================
Comment at: libc/benchmarks/automemcpy/README.md:2
+This folder contains an implementation of [automemcpy: A framework for automatic generation of fundamental memory operations](https://research.google/pubs/pub50338/).
+
+It uses the [Z3 theorem prover](https://github.com/Z3Prover/z3) to enumerate a subset of valid memory function implementations. These implementations are then materialized as C++ code and can be [benchmarked](../) against various [size distributions](../distributions). This process helps the design of efficient implementations for a particular environnement (size distribution, processor or custom compilation options).
----------------
Maybe make it clear that this is not built be default:

```
This is not enabled by default, as it is mostly useful when working on tuning the library implementation. To build it, use `LIBC_BUILD_AUTOMEMCPY=ON`.
```


================
Comment at: libc/benchmarks/automemcpy/README.md:3
+
+It uses the [Z3 theorem prover](https://github.com/Z3Prover/z3) to enumerate a subset of valid memory function implementations. These implementations are then materialized as C++ code and can be [benchmarked](../) against various [size distributions](../distributions). This process helps the design of efficient implementations for a particular environnement (size distribution, processor or custom compilation options).
+
----------------
Did you mean to give a real link here ?


================
Comment at: libc/benchmarks/automemcpy/README.md:41
+    - runs `Z3` and materializes valid memory functions as C++ code, a message will display its ondisk location.
+    - the source code is then compiled using the native host optimizations.
+ 2. `automemcpy`
----------------
 can you explain more about this ?  (in particular, that the way to express this depends on the target)


================
Comment at: libc/benchmarks/automemcpy/README.md:72
+
+Other options might be useful, use `--help` for more informations.
+
----------------
information


================
Comment at: libc/benchmarks/automemcpy/README.md:76
+
+Analysis if performed by running `automemcpy_result_analyser` on one or more json result files.
+
----------------
is



================
Comment at: libc/benchmarks/automemcpy/README.md:83
+What it does:
+  1. Gathers all throughput values for each function / distribution pair and pick the median one.\
+  This allows picking a representative value over many runs of the benchmark. Please make sure all the runs happen under similar circumstances.
----------------
picks


================
Comment at: libc/benchmarks/automemcpy/README.md:86
+
+  2. For each distribution, look at the span of throughputs for functions of the same type (e.g. For distribution A, memcpy throughput spans from 2GiB/s to 5GiB/s).
+
----------------
`A`


================
Comment at: libc/benchmarks/automemcpy/README.md:88
+
+  3. For each distribution, give a normalized score to each function (e.g. For distribution A, function M scores 0.65).\
+  This score is then turned into a grade `EXCELLENT`, `VERY_GOOD`, `GOOD`, `PASSABLE`, `INADEQUATE`, `MEDIOCRE`, `BAD` - so that each distribution categorizes how function perform according to them.
----------------
`A`, `M`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111554/new/

https://reviews.llvm.org/D111554



More information about the libc-commits mailing list