[libcxx-commits] [PATCH] D121074: [libc++] Implement std::boyer_moore{, _horspool}_searcher

Nikolas Klauser via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Apr 21 04:59:59 PDT 2022


philnik marked 8 inline comments as done.
philnik added inline comments.


================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:34-39
+template <class _Key,
+          class _Value,
+          class _Hash,
+          class _BinaryPredicate,
+          bool /*useArray*/>
+class _BMSkipTable;
----------------
Quuxplusone wrote:
> Personally I'd prefer to see the first specialization here done as the primary template, and then we just have one specialization for `_BMSkipTable<..., false>`. I see it's less symmetric, but it eliminates a forward declaration (and ~7 LoC).
Is that purely style or is there any technical reason to change it?


================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:86
+  _LIBCPP_HIDE_FROM_ABI _BMSkipTable(size_t, value_type __default_value, _Hash, _BinaryPredicate) {
+    std::fill_n(__table_.begin(), __table_.size(), __default_value);
+  }
----------------
Quuxplusone wrote:
> `#include <__algorithm/fill_n.h>`
> Also, please prefer `__table_.data()` here. `begin+end` and `data+size` always go together as pairs, respectively. (It's also slightly cheaper to compile `data` than `begin` because raw pointers, but that's not the reason it raised the red flag for me.)
> 
> Actually, whoa, I thought this was our `<vector>` dependency, but it's not! This specialization should just use a plain old C array `value_type __table_[256];` and cut out all the metaprogramming. This header shouldn't need to depend on `<array>` at all.
> You might `static_assert(numeric_limits<value_type>::max() < 256)` just to be on the absolutely safe side.
I think it's easier to read when using an `array`.


================
Comment at: libcxx/include/__functional/boyer_moore_searcher.h:124
+      __skip_table_(make_shared<__skip_table_type>(__pattern_length_, -1, __hash, __pred_)),
+      __suffix_(make_shared<vector<difference_type>>(__pattern_length_ + 1)) {
+    difference_type __i = 0;
----------------
Quuxplusone wrote:
> First reaction: Why depend on shared_ptr? Second reaction: Why two dereferences (one for the shared_ptr and another for the vector)? At the very least this should use `make_shared<difference_type[]>(__pattern_length_ + 1)`.
> 
> But it would be awesome if we could eliminate the shared_ptr too, somehow. I assume it's in there because std::search wants to copy these things but we don't want the copy to be expensive.
> 
> Could we just make std::search //not// copy these things? And then it wouldn't matter if the copy was expensive, and we could just use `std::vector<D>` instead of `std::shared_ptr<D[]>`. (The copy ctor still needs to exist, IIUC, but it doesn't have to be fast, if we can claim it's the user's fault if they ever call it.)
`std::search` doesn't copy this. It just gets a const ref and returns the search result. I'm not sure why you wouldn't just call the search directly, but hey. I'd argue that using `std::shared_ptr<D[]>` might be the better alternative, since after construction the elements will never change. And AFAIK it would be cheaper to copy and as cheap to move.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121074/new/

https://reviews.llvm.org/D121074



More information about the libcxx-commits mailing list