[libcxx-commits] [PATCH] D121074: [libc++] Implement std::boyer_moore{, _horspool}_searcher

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Apr 21 04:59:59 PDT 2022

philnik marked 8 inline comments as done.
philnik added inline comments.

Comment at: libcxx/include/__functional/boyer_moore_searcher.h:34-39
+template <class _Key,
+          class _Value,
+          class _Hash,
+          class _BinaryPredicate,
+          bool /*useArray*/>
+class _BMSkipTable;
Quuxplusone wrote:
> Personally I'd prefer to see the first specialization here done as the primary template, and then we just have one specialization for `_BMSkipTable<..., false>`. I see it's less symmetric, but it eliminates a forward declaration (and ~7 LoC).
Is that purely style or is there any technical reason to change it?

Comment at: libcxx/include/__functional/boyer_moore_searcher.h:86
+  _LIBCPP_HIDE_FROM_ABI _BMSkipTable(size_t, value_type __default_value, _Hash, _BinaryPredicate) {
+    std::fill_n(__table_.begin(), __table_.size(), __default_value);
+  }
Quuxplusone wrote:
> `#include <__algorithm/fill_n.h>`
> Also, please prefer `__table_.data()` here. `begin+end` and `data+size` always go together as pairs, respectively. (It's also slightly cheaper to compile `data` than `begin` because raw pointers, but that's not the reason it raised the red flag for me.)
> Actually, whoa, I thought this was our `<vector>` dependency, but it's not! This specialization should just use a plain old C array `value_type __table_[256];` and cut out all the metaprogramming. This header shouldn't need to depend on `<array>` at all.
> You might `static_assert(numeric_limits<value_type>::max() < 256)` just to be on the absolutely safe side.
I think it's easier to read when using an `array`.

Comment at: libcxx/include/__functional/boyer_moore_searcher.h:124
+      __skip_table_(make_shared<__skip_table_type>(__pattern_length_, -1, __hash, __pred_)),
+      __suffix_(make_shared<vector<difference_type>>(__pattern_length_ + 1)) {
+    difference_type __i = 0;
Quuxplusone wrote:
> First reaction: Why depend on shared_ptr? Second reaction: Why two dereferences (one for the shared_ptr and another for the vector)? At the very least this should use `make_shared<difference_type[]>(__pattern_length_ + 1)`.
> But it would be awesome if we could eliminate the shared_ptr too, somehow. I assume it's in there because std::search wants to copy these things but we don't want the copy to be expensive.
> Could we just make std::search //not// copy these things? And then it wouldn't matter if the copy was expensive, and we could just use `std::vector<D>` instead of `std::shared_ptr<D[]>`. (The copy ctor still needs to exist, IIUC, but it doesn't have to be fast, if we can claim it's the user's fault if they ever call it.)
`std::search` doesn't copy this. It just gets a const ref and returns the search result. I'm not sure why you wouldn't just call the search directly, but hey. I'd argue that using `std::shared_ptr<D[]>` might be the better alternative, since after construction the elements will never change. And AFAIK it would be cheaper to copy and as cheap to move.

  rG LLVM Github Monorepo



More information about the libcxx-commits mailing list