[PATCH] D91837: ADT: Fix reference invalidation in SmallVector APIs that pass in a value

Duncan P. N. Exon Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 22 18:52:38 PST 2020


dexonsmith added a comment.

Thanks for the review!

A few responses inline; I'll try to update the patch tomorrow.

Regarding number of patches, this is already split into multiple patches in my local tree:

1. Fix invalidation for `push_back` and `insert` (one), the latter of which sometimes calls `push_back`.
2. Fix invalidation for `append` and `insert` (N), the latter of which sometimes calls `append`.
3. Fix invalidation for `resize`.

(The various helpers get added as needed.)

Happy to split it now that the performance has been validated. I'll probably find time to rebase and post tomorrow -- let me know if there's something else you'd like split out.

In D91837#2468608 <https://reviews.llvm.org/D91837#2468608>, @dblaikie wrote:

> Oh, one other question: How's all this implementation strategy compared to std::vector? Be nice to be relying on prior art where possible.

Libc++'s `std::vector` uses a more general, split buffer approach that fixes all the invalidation, quite similar to https://reviews.llvm.org/D87326. Unfortunately that was reverted due to compile-time and code size regressions. My thinking is that we can try to land https://reviews.llvm.org/D87326 again after this patch (series) to fix the remaining invalidation (and that the regressions may be less bad / acceptable since the common cases in this patch are handled in a more lightweight manner).



================
Comment at: llvm/include/llvm/ADT/SmallVector.h:141
+  bool isReferenceToStorage(const void *V) const {
+    return V >= this->begin() && V < this->end();
+  }
----------------
dblaikie wrote:
> Absurd C++ pedantry, but I believe op< on pointers to unrelated objects is unspecified in C++? ("If two pointers are not specified to compare greater or compare equal, the result of the comparison is unspecified. An unspecified result may be nondeterministic, and need not be consistent even for multiple evaluations of the same expression with the same operands in the same execution of the program" - https://en.cppreference.com/w/cpp/language/operator_comparison) but std::less is specified to handle this case: "A specialization of std::less for any pointer type yields a strict total order, even if the built-in operator< does not." - https://en.cppreference.com/w/cpp/utility/functional/less)
> 
> Because C++ is fun like that.
Hah; I wondered about this, but couldn't find it in the standard, so ended up leaving it as it was in the prep patch. I'll fix in a prep commit.


================
Comment at: llvm/include/llvm/ADT/SmallVector.h:334
+      ReferencesStorage = true;
+      Index = &Elt - this->begin();
+    }
----------------
dblaikie wrote:
> Is there any case where 'Elt' could be contained within a subobject of an element of the SmallVector, while not being an element itself? I guess not, because T can't contain (directly) another T - but asking just in case I've missed a possibility (if this were true, then an index would be inadequate for persisting 'Elt' across the grow operation).
Correct, not possible. As a result this code wouldn't handle `V.emplace_back(subobject_reference)`, but it's not used for that at this point. (We could probably handle `emplace_back` somehow with this technique, but I'm not sure it would still be lightweight / worth it.)


================
Comment at: llvm/include/llvm/ADT/SmallVector.h:454-467
+  T *reserveForAndGetAddress(const T &Elt, size_t N = 1) {
+    size_t NewSize = this->size() + N;
+    if (LLVM_LIKELY(NewSize <= this->capacity()))
+      return const_cast<T *>(&Elt);
+
+    bool ReferencesStorage = false;
+    int64_t Index = -1;
----------------
dblaikie wrote:
> Is this text identical between the trivially copyable and non-trivially copyable specializations? If so, can it be shared at all? I guess it'd need to be in a CRTP base placed between these specializations and SmallVectorTemplateCommon, and possibly not worth the added complexity?
Yes, it's identical. I can take a look to see how bad it is with CRTP.


================
Comment at: llvm/include/llvm/ADT/SmallVector.h:677-686
+  template <
+      class ArgType,
+      std::enable_if_t<
+          std::is_same<std::remove_const_t<std::remove_reference_t<ArgType>>,
+                       T>::value &&
+              !TakesParamByValue,
+          bool> = false>
----------------
dblaikie wrote:
> Perhaps all this infrastructure and SFINAE might merit a separate patch? (breaking this patch up into separate changes for adding reference validity for each independent operation might be nice in general, but especially this part/set of changes, if they can be separate - it'd also make some of the other changes a bit clearer, once the reserveForAndGetAddress is added, subsequent changes would be "port to use that, and add test")
> 
> But I know that can be a hassle to make, manage the reviews, etc, so not vital by any means.
Very happy to split this up; just wanted to start with the holistic patch to get numbers (it's already split up locally).


================
Comment at: llvm/unittests/ADT/SmallVectorTest.cpp:1083-1085
+  // Append enough more elements that V will grow again. This time the old
+  // storage will have been freed at time of access and sanitizers can catch
+  // the use-after-free.
----------------
dblaikie wrote:
> By this you mean "sanitizers would catch use-after-free if this support were missing/broken"? Otherwise this reads as though it might be a sanitizer-only death test, specifically checking that misuse fails/is diagnosed, which it isn't.
Correct; not a death test, but I expect the sanitizers to catch the problem if the implementation is wrong.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91837/new/

https://reviews.llvm.org/D91837



More information about the llvm-commits mailing list