SmallVector::insert() fix

Howard Hinnant hhinnant at apple.com
Fri Jun 7 10:06:06 PDT 2013


At what point does it make more sense to just use std::vector with a stack-based allocator?  You get the same speed, and the only bugs you have to worry about are in the allocator.  std::vector is pounded on by a far bigger audience than llvm::SmallVector is.  Using C++11 template aliases you could even implement llvm::SmallVector<T, N> as a std::vector<T, llvm::stack_allocator<T, N>>

Fwiw, David's 4 step recipe below is correct and is what the libc++ std::vector does.  To do this in an exception-safe manner (I know, llvm doesn't care about exceptions), the libc++ uses another data structure I call __split_buffer which allows the data to start in the middle of the buffer.  And when the data starts at the beginning of a __split_buffer, vector and __split_buffer can transfer data buffers among one another.  I.e. (1) constructs a __split_buffer with the proper capacity, (2) copies into the middle of the __split_buffer, (4) copies beyond this point into the __split_buffer, and when (3) finishes, the __split_buffer has data starting at the beginning of its buffer.

Howard

On Jun 7, 2013, at 12:46 PM, David Blaikie <dblaikie at gmail.com> wrote:

> Bumping again - since someone just worked around this bug in a
> SmallVector usage & I filed PR16253 which this patch aims to address.
> 
> Off-hand it seems like recording the pointer index, growing the store,
> then copying the element isn't the ideal solution (for one, it doesn't
> work for indirect situations - imagine, say, a non-move-aware tree
> structure & you have a SmallVector of tree nodes you are inserting a
> lower branch into a higher one - you may still invalidate the branch
> you're on even though it's not a direct member of the SmallVector
> you're inserting into). What we should be doing is breaking out the
> resize operation into smaller functions ((optionally) allocate the
> necessary space, copy elements in a range, etc) and using those to
> implement:
> 
> 1) reserve space, but don't copy or invalidate the old buffer
> 2) copy the new element into place
> 3) copy (or move) the elements preceding the insertion point
> 4) copy (or move) the elements proceeding the insertion point
> 
> On Mon, Apr 29, 2013 at 8:30 AM, Jean-Luc Duprat <jduprat at apple.com> wrote:
>> Thanks David.
>> 
>> JL
>> 
>> 
>> 
>> On Apr 28, 2013, at 07:58 , David Blaikie <dblaikie at gmail.com> wrote:
>> 
>> Bumping this for me or someone to look at.
>> 
>> On Apr 3, 2013 1:42 AM, "Jean-Luc Duprat" <jduprat at apple.com> wrote:
>>> 
>>> I realize I may not have been very clear in proposing this patches for
>>> review by tacking on to a previous conversation, so let me start again.
>>> I am leaving the details of the relevant previous llvm-commit conversation
>>> below.
>>> 
>>> iterator SmallVectorImpl::insert(iterator I, const T &Elt) and friends
>>> face two problems when the underlying storage needs to be re-allocated:
>>> 1- The iterator which points into that storage needs to remain valid
>>> through the operation
>>> 2 - If the element to be inserted in the vector is already in the vector,
>>> say v.insert(v.begin(), v[3]), then the reference needs to remain valid
>>> across the grow() operation.
>>> 
>>> Case 1 was detected and fixed in July 2011.  I just ran into case 2,
>>> triggered by a unit test.
>>> The problem has always been there, but I was playing with non-standard
>>> memory allocations that brought this to light.
>>> We're just lucky most of the time…
>>> 
>>> The conversation I was referring to below between Howard Hinnant and Dan
>>> Gohman concludes that std::vector need to specifically handle this
>>> situation.
>>> I see no reason why SmallVector would have different semantics here.
>>> 
>>> The patch attached ensures that the above use case of
>>> SmallVector::insert() works as expected, without relying on luck.
>>> Please review this patch and provide feedback.
>>> 
>>> 
>>> 
>>> 
>>> JL
>>> 
>>> 
>>> 
>>> On Apr 1, 2013, at 23:01 , Jean-Luc Duprat <jduprat at apple.com> wrote:
>>> 
>>>> This is an updated patch that fix SmallVector::insert() in all its
>>>> implementations.
>>>> This bug was being triggered for some allocation sizes on Linux by the
>>>> test suite, although not for the default allocs in the LLVM tree.
>>>> The problem is real, it was just not very visible…
>>>> 
>>>> Thank you for your feedback,
>>>> 
>>>> JL
>>>> 
>>>> 
>>>> 
>>>> <smv.patch>
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Apr 1, 2013, at 11:27 , Jean-Luc Duprat <jduprat at apple.com> wrote:
>>>> 
>>>>> I just ran into a similar issue with LLVM's SmallVector in case that
>>>>> was very similar to the one discussed in the standard library.
>>>>> I light of Howard's comment, SmallVector should preserve the validity
>>>>> of both iterators/reference in this case:
>>>>> v.insert(v.begin(), v[3]);
>>>>> if it is to behave similarly to std::vector.
>>>>> 
>>>>> I propose the attached fix to SmallVector.
>>>>> 
>>>>> <SmallVector.patch>
>>>>> 
>>>>> 
>>>>> JL
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mar 27, 2013, at 13:47 , Howard Hinnant <hhinnant at apple.com> wrote:
>>>>> 
>>>>>> On Mar 27, 2013, at 4:34 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Mar 27, 2013 at 12:16 PM, Howard Hinnant <hhinnant at apple.com>
>>>>>>> wrote:
>>>>>>> On Mar 27, 2013, at 2:44 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>>>>>>> 
>>>>>>>> Author: djg
>>>>>>>> Date: Wed Mar 27 13:44:56 2013
>>>>>>>> New Revision: 178166
>>>>>>>> 
>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=178166&view=rev
>>>>>>>> Log:
>>>>>>>> Avoid undefined behavior from passing a std::vector's own contents
>>>>>>>> in as an argument to push_back.
>>>>>>> 
>>>>>>> The original code is not undefined behavior.
>>>>>>> 
>>>>>>> My interpretation was that the push_back could cause reallocation
>>>>>>> which could invalidate the reference to the old element before it is read.
>>>>>>> Is there a guarantee that this won't happen?
>>>>>> 
>>>>>> There is not an allowance for the vendor to let it happen.  Such an
>>>>>> allowance would look like the one for insert(p, i, j):
>>>>>> 
>>>>>>> pre: i and j are not iterators into a.
>>>>>> 
>>>>>> In practice, the new buffer has to be allocated and populated before
>>>>>> the old buffer is deallocated, so there is no hardship on the vendor anyway.
>>>>>> 
>>>>>> The vendor does have to be careful with vector::insert(iterator, const
>>>>>> T&), as self-referencing can be visible there.  But again in that case the
>>>>>> standard puts the onus on the vendor, not on the client to just make things
>>>>>> work.
>>>>>> 
>>>>>> Note that if you had moved from the source that would be another
>>>>>> matter as 17.6.4.9 [res.on.arguments]/p1/b3 says:
>>>>>> 
>>>>>>> f a function argument binds to an rvalue reference parameter, the
>>>>>>> implementation may assume that this parameter is a unique reference to this
>>>>>>> argument.
>>>>>> 
>>>>>> So, for example:
>>>>>> 
>>>>>> v.insert(v.begin(), std::move(v[3]));
>>>>>> 
>>>>>> would be asking for trouble, whereas
>>>>>> 
>>>>>> v.insert(v.begin(), v[3]);
>>>>>> 
>>>>>> is ok.
>>>>>> 
>>>>>> Howard
>>>>>> 
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list