[PATCH] D87326: [ADT] Fix reference invalidation when self referencing a SmallVector

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Sep 12 17:06:38 PDT 2020


dblaikie added a comment.

In D87326#2269455 <https://reviews.llvm.org/D87326#2269455>, @njames93 wrote:

> In D87326#2269286 <https://reviews.llvm.org/D87326#2269286>, @dblaikie wrote:
>
>> Hmm - I think maybe what I had trouble with was understanding how much this shuold generalize. Should this handle subranges? (if you try to append half of a vector onto itself) and if so, how? Anyone happen to know what's guaranteed by the standard (perhaps only push_back of single elements - that's easier to handle)
>
> From what I can see insert and append fully support when the range to insert is enclosed by the vector, however assign makes no such promise as it would potentially require allocating more storage even if the container itself doesn't need to grow

Makes sense.

>>> I haven't added specific test cases for this, but all ADT Tests did run without a hitch under asan
>>
>> Did they run without a hitch under asan before the change too? I imagine so.
>
> They did but that was only because the test cases don't grow multiple times from what I can see

Hmm - I thought this was about self-referential inserts/push_back, etc? Was/is there also bugs related to multiple growth? Could you show a small test case that'd fail asan with the current in-tree code related to multiple growth?

>> So probably worth adding specific tests for overlapping - ones that would've failed under asan before this change, but pass with it.
>>
>> The test updates you have done in the patch so far - they're intended to make the test coverage a bit more robust to have more confidence over the refactoring? Could those test changes be committed first/separately - I assume they're intended to pass before and after the patch? So might be good as a preliminary/separate patch before this one.
>
> The code to make the tests more robust fail without the modifications to small vector as they expose issues for insert moving an already deleted item.

Ah, I see now the new code avoids extra moves - the old code would grow the container, then insert the elements - which could cause an extra shift of elements after the insertion point to then make space for the to-be-inserted elements. Great to have that fixed too, but I think there should be at least some test coverage for the self-insertion otherwise we could keep the efficiency gains but accidentally break the self-insertion(self-push back, etc) by doing the same operations, but in the wrong order (eg: moving over to the new buffer first, then inserting the new elements, rather than new elements first, then old elements).

In D87326#2269796 <https://reviews.llvm.org/D87326#2269796>, @MaskRay wrote:

> Yeah, seems that SmallVector::push_back should be fixed to allow `a.push_back(a[0])`. SmallVector::insert has been fixed in rL134554 <https://reviews.llvm.org/rL134554>

Ah, right, sort of a related but maybe slightly different problem - when not resizing the underlying buffer, elements are moved, then the new one is inserted so the reference may've been invalidated. Probably best to handle that in a separate/follow-up commit - any idea if the C++ standard guarantees this for std::vector, for instance? And if so, how does libc++ implement this guarantee?



================
Comment at: llvm/include/llvm/ADT/SmallVector.h:715-717
+      for (auto I = this->begin(), E = this->end(); I != E; ++I, ++in_start) {
+        *I = *in_start;
+      }
----------------
Could this use a range-based for loop? (I realize that'd mean moving the increment into the loop - so totally OK if you reckon that'd be less readable)

Can also drop the {} on single-line constructs like this. (& the if == size above)


================
Comment at: llvm/unittests/ADT/SmallVectorTest.cpp:35
   static int numCopyAssignmentCalls;
+  enum ObjectState { OS_Invalid = 0, OS_Constructed, OS_MovedOut };
 
----------------
Maybe `Destroyed` rather than `Invalid` (& I think `MovedFrom` is maybe the more common phrasing than `MovedOut`)

Could use an enum class so you don't need to have the `OS_` prefix, instead `ObjectState::Constructed`, etc?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87326/new/

https://reviews.llvm.org/D87326



More information about the llvm-commits mailing list