[libcxx-commits] [PATCH] D121074: [libc++] Implement std::boyer_moore{, _horspool}_searcher
Arthur O'Dwyer via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Sun Mar 6 11:50:17 PST 2022
Quuxplusone requested changes to this revision.
Quuxplusone added a comment.
This revision now requires changes to proceed.
Thanks for working on this! This will be super useful — not really to working programmers ;) but in terms of checking boxes for C++17 conformance and finally getting to remove the experimental versions!
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:34-39
+template <class _Key,
+ class _Value,
+ class _Hash,
+ class _BinaryPredicate,
+ bool /*useArray*/>
+class _BMSkipTable;
----------------
Personally I'd prefer to see the first specialization here done as the primary template, and then we just have one specialization for `_BMSkipTable<..., false>`. I see it's less symmetric, but it eliminates a forward declaration (and ~7 LoC).
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:56
+ _LIBCPP_HIDE_FROM_ABI
+ _BMSkipTable(size_t __sz, value_type __default_value, _Hash __hash, _BinaryPredicate __pred)
+ : __default_value_(__default_value),
----------------
Likewise throughout. "All your ctors should be explicit unless they're used for implicit conversions" (or unless the Standard API mandates it :()
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:86
+ _LIBCPP_HIDE_FROM_ABI _BMSkipTable(size_t, value_type __default_value, _Hash, _BinaryPredicate) {
+ std::fill_n(__table_.begin(), __table_.size(), __default_value);
+ }
----------------
`#include <__algorithm/fill_n.h>`
Also, please prefer `__table_.data()` here. `begin+end` and `data+size` always go together as pairs, respectively. (It's also slightly cheaper to compile `data` than `begin` because raw pointers, but that's not the reason it raised the red flag for me.)
Actually, whoa, I thought this was our `<vector>` dependency, but it's not! This specialization should just use a plain old C array `value_type __table_[256];` and cut out all the metaprogramming. This header shouldn't need to depend on `<array>` at all.
You might `static_assert(numeric_limits<value_type>::max() < 256)` just to be on the absolutely safe side.
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:122
+ __pred_(__pred),
+ __pattern_length_(std::distance(__first, __last)),
+ __skip_table_(make_shared<__skip_table_type>(__pattern_length_, -1, __hash, __pred_)),
----------------
Let's use `__last - __first` and save some instantiations. These are random-access iterators.
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:124
+ __skip_table_(make_shared<__skip_table_type>(__pattern_length_, -1, __hash, __pred_)),
+ __suffix_(make_shared<vector<difference_type>>(__pattern_length_ + 1)) {
+ difference_type __i = 0;
----------------
First reaction: Why depend on shared_ptr? Second reaction: Why two dereferences (one for the shared_ptr and another for the vector)? At the very least this should use `make_shared<difference_type[]>(__pattern_length_ + 1)`.
But it would be awesome if we could eliminate the shared_ptr too, somehow. I assume it's in there because std::search wants to copy these things but we don't want the copy to be expensive.
Could we just make std::search //not// copy these things? And then it wouldn't matter if the copy was expensive, and we could just use `std::vector<D>` instead of `std::shared_ptr<D[]>`. (The copy ctor still needs to exist, IIUC, but it doesn't have to be fast, if we can claim it's the user's fault if they ever call it.)
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:186
+ void __compute_bm_prefix(_Iterator __first, _Iterator __last, _BinaryPredicate __pred, _Container& __prefix) {
+ const size_t __count = std::distance(__first, __last);
+
----------------
As above, `__last - __first` — this is instantiated only with `RAI` and `reverse_iterator<RAI>`, both of which are random-access.
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:192
+ for (size_t __i = 1; __i != __count; ++__i) {
+ while(__k > 0 && !__pred(__first[__k], __first[__i]))
+ __k = __prefix[__k - 1];
----------------
================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:205-206
+
+ if (__count <= 0)
+ return;
+
----------------
Nit: Should be `if (__first == __last)` or `if (__count == 0)`, I'd think. size_t can't be negative.
================
Comment at: libcxx/include/optional:165-167
+#include <__functional/hash.h>
+#include <__functional/invoke.h>
+#include <__memory/construct_at.h>
----------------
This, and the diff in the <optional> test, looks unrelated to boyer_moore_searcher. Please do this in a separate PR (which I'll rubberstamp if CI is green).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D121074/new/
https://reviews.llvm.org/D121074
More information about the libcxx-commits
mailing list