[libcxx-commits] [PATCH] D121485: [libc++] Fix performance inconsistency between map/set copy-assignment and copy-constructor
Louis Dionne via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Fri May 5 12:46:51 PDT 2023
ldionne added inline comments.
================
Comment at: libcxx/benchmarks/map.bench.cpp:175
+ void run(benchmark::State& State) const {
+ auto Data = makeTestingSets(MapSize, Mode::Hit, Shuffle::None, 1);
+ auto& Map = Data.Maps.front();
----------------
EricWF wrote:
> var-const wrote:
> > Question: why are we using a different naming convention here (with capitalized variable `Name`s)?
> Probably because I started it because I was feeling funky. Not sure it matters much.
Yeah it's weird but I am staying consistent with the rest of the file.
================
Comment at: libcxx/benchmarks/map.bench.cpp:177
+ auto& Map = Data.Maps.front();
+ while (State.KeepRunningBatch(MapSize)) {
+#ifndef VALIDATE
----------------
EricWF wrote:
> I don't think this is the correct usage of `KeepRunningBatch`.
This is what I see in GoogleBenchmark:
```
// Returns true iff the benchmark should run n more iterations.
// REQUIRES: 'n' > 0.
// NOTE: A benchmark must not return from the test until KeepRunningBatch()
// has returned false.
// NOTE: KeepRunningBatch() may overshoot by up to 'n' iterations.
//
// Intended usage:
// while (state.KeepRunningBatch(1000)) {
// // process 1000 elements
// }
bool KeepRunningBatch(IterationCount n);
```
You're right that it's a bit weird since we're only "processing" one element here (aka we're doing the assignment of a single map). However, if I look at how all these benchmarks were written, I think the intent is that we pretend that copying each element in the map is "one iteration". I guess that's done so that the output of Google Benchmark can show the time taken on average for each element given a map with N elements in it. That's what we do in all the benchmarks in this file.
Anyway, I switched all the constructor-related benchmarks to use the pattern used in `ConstructorMove`, which gave me the most sensical and the most consistent results.
================
Comment at: libcxx/benchmarks/map.bench.cpp:179
+#ifndef VALIDATE
+ std::map<uint64_t, int64_t> M;
+ M = Map;
----------------
EricWF wrote:
> What about the case where the map is already populated? And with different sizes?
Added a benchmark.
================
Comment at: libcxx/include/__tree:1098
__tree(const __tree& __t);
- __tree& operator=(const __tree& __t);
template <class _ForwardIterator>
----------------
EricWF wrote:
> Mordante wrote:
> > var-const wrote:
> > > From the patch description: "In the future, we could instead reintroduce a copy assignment operator in
> > > tree and copy the whole tree without performing any comparison". Would it be worthwhile adding a TODO to that effect here?
> > +1 for the TODO.
> On another note: a copy assignment operator gets implicitly declared by the compiler. For readability lets make that implicit declaration explicit.
See https://github.com/llvm/llvm-project/issues/62571
================
Comment at: libcxx/include/__tree:1098
__tree(const __tree& __t);
- __tree& operator=(const __tree& __t);
template <class _ForwardIterator>
void __assign_unique(_ForwardIterator __first, _ForwardIterator __last);
----------------
ldionne wrote:
> EricWF wrote:
> > Mordante wrote:
> > > var-const wrote:
> > > > From the patch description: "In the future, we could instead reintroduce a copy assignment operator in
> > > > tree and copy the whole tree without performing any comparison". Would it be worthwhile adding a TODO to that effect here?
> > > +1 for the TODO.
> > On another note: a copy assignment operator gets implicitly declared by the compiler. For readability lets make that implicit declaration explicit.
> See https://github.com/llvm/llvm-project/issues/62571
Since there's a move constructor, it won't get declared. But I explicitly marked it as deleted.
================
Comment at: libcxx/include/set:594
{
- __tree_ = __s.__tree_;
+ if (this != _VSTD::addressof(__s)) {
+ __tree_.clear();
----------------
Mordante wrote:
> var-const wrote:
> > Question: it's `_VSTD` and not `std` for consistency within the file, right?
> These changes are quite unexpected when reading the patch description. Can you update the description?
> (Same for the multi(map|set).)
This is taken from the `__tree::operator=` code we were using before, so there are no changes really.
> Question: it's `_VSTD` and not `std` for consistency within the file, right?
Hmm, I'll use `std` anyway.
================
Comment at: libcxx/include/set:598
+ __tree_.__copy_assign_alloc(__s.__tree_);
+ insert(__s.begin(), __s.end());
+ }
----------------
EricWF wrote:
> What's the performance like if you call `__assign_unique(__s.begin(), __s.end())` instead?
It's better in case there's already elements in the tree, so I changed it to that instead. Good catch!
================
Comment at: libcxx/test/libcxx/containers/associative/perf.assignment_vs_construction.pass.cpp:3
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
----------------
EricWF wrote:
> var-const wrote:
> > A couple of questions:
> > - can/should we use Google Benchmark here?
> > - is this file compiled with optimizations enabled?
> Yes, this should be google benchmark.
I basically removed this test. The goal was to have a regression test that would catch when copy assignment and copy construction are not within a reasonable margin of each other, which we don't achieve with our current benchmarks. But I think it's fine to instead have just the benchmarks under `libcxx/benchmarks`.
================
Comment at: libcxx/test/libcxx/containers/associative/perf.assignment_vs_construction.pass.cpp:71
+ auto percentage = (std::labs((ssize_t)ctor_avg - (ssize_t)assignment_avg) / (double)ctor_avg) * 100.0;
+ assert(percentage < 10.0 && "Copy assignment and copy constructor have more than 10% difference");
+}
----------------
Mordante wrote:
> When this is the reason to not do it in Google Benchmark I think it's worthwhile to consider how we can test for timing regressions in the benchmarks. (This would even be good to look into regardless of this patch.)
Yes, I agree, however let's not block this patch on figuring out how to do performance regression testing.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D121485/new/
https://reviews.llvm.org/D121485
More information about the libcxx-commits
mailing list