SmallVector::insert() fix

David Blaikie dblaikie at gmail.com
Fri Jun 7 10:11:02 PDT 2013


On Fri, Jun 7, 2013 at 10:06 AM, Howard Hinnant <hhinnant at apple.com> wrote:
> At what point does it make more sense to just use std::vector with a stack-based allocator?  You get the same speed, and the only bugs you have to worry about are in the allocator.  std::vector is pounded on by a far bigger audience than llvm::SmallVector is.

I'm generally in favor of this - hadn't really occurred to me until
you said it, though.

>  Using C++11 template aliases you could even implement llvm::SmallVector<T, N> as a std::vector<T, llvm::stack_allocator<T, N>>

This would be slightly problematic - is there anything we could do for
C++98 compatibility that would keep the terse naming we already have?

>
> Fwiw, David's 4 step recipe below is correct and is what the libc++ std::vector does.  To do this in an exception-safe manner (I know, llvm doesn't care about exceptions), the libc++ uses another data structure I call __split_buffer which allows the data to start in the middle of the buffer.  And when the data starts at the beginning of a __split_buffer, vector and __split_buffer can transfer data buffers among one another.  I.e. (1) constructs a __split_buffer with the proper capacity, (2) copies into the middle of the __split_buffer, (4) copies beyond this point into the __split_buffer, and when (3) finishes, the __split_buffer has data starting at the beginning of its buffer.
>
> Howard
>
> On Jun 7, 2013, at 12:46 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> Bumping again - since someone just worked around this bug in a
>> SmallVector usage & I filed PR16253 which this patch aims to address.
>>
>> Off-hand it seems like recording the pointer index, growing the store,
>> then copying the element isn't the ideal solution (for one, it doesn't
>> work for indirect situations - imagine, say, a non-move-aware tree
>> structure & you have a SmallVector of tree nodes you are inserting a
>> lower branch into a higher one - you may still invalidate the branch
>> you're on even though it's not a direct member of the SmallVector
>> you're inserting into). What we should be doing is breaking out the
>> resize operation into smaller functions ((optionally) allocate the
>> necessary space, copy elements in a range, etc) and using those to
>> implement:
>>
>> 1) reserve space, but don't copy or invalidate the old buffer
>> 2) copy the new element into place
>> 3) copy (or move) the elements preceding the insertion point
>> 4) copy (or move) the elements proceeding the insertion point
>>
>> On Mon, Apr 29, 2013 at 8:30 AM, Jean-Luc Duprat <jduprat at apple.com> wrote:
>>> Thanks David.
>>>
>>> JL
>>>
>>>
>>>
>>> On Apr 28, 2013, at 07:58 , David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>> Bumping this for me or someone to look at.
>>>
>>> On Apr 3, 2013 1:42 AM, "Jean-Luc Duprat" <jduprat at apple.com> wrote:
>>>>
>>>> I realize I may not have been very clear in proposing this patches for
>>>> review by tacking on to a previous conversation, so let me start again.
>>>> I am leaving the details of the relevant previous llvm-commit conversation
>>>> below.
>>>>
>>>> iterator SmallVectorImpl::insert(iterator I, const T &Elt) and friends
>>>> face two problems when the underlying storage needs to be re-allocated:
>>>> 1- The iterator which points into that storage needs to remain valid
>>>> through the operation
>>>> 2 - If the element to be inserted in the vector is already in the vector,
>>>> say v.insert(v.begin(), v[3]), then the reference needs to remain valid
>>>> across the grow() operation.
>>>>
>>>> Case 1 was detected and fixed in July 2011.  I just ran into case 2,
>>>> triggered by a unit test.
>>>> The problem has always been there, but I was playing with non-standard
>>>> memory allocations that brought this to light.
>>>> We're just lucky most of the time…
>>>>
>>>> The conversation I was referring to below between Howard Hinnant and Dan
>>>> Gohman concludes that std::vector need to specifically handle this
>>>> situation.
>>>> I see no reason why SmallVector would have different semantics here.
>>>>
>>>> The patch attached ensures that the above use case of
>>>> SmallVector::insert() works as expected, without relying on luck.
>>>> Please review this patch and provide feedback.
>>>>
>>>>
>>>>
>>>>
>>>> JL
>>>>
>>>>
>>>>
>>>> On Apr 1, 2013, at 23:01 , Jean-Luc Duprat <jduprat at apple.com> wrote:
>>>>
>>>>> This is an updated patch that fix SmallVector::insert() in all its
>>>>> implementations.
>>>>> This bug was being triggered for some allocation sizes on Linux by the
>>>>> test suite, although not for the default allocs in the LLVM tree.
>>>>> The problem is real, it was just not very visible…
>>>>>
>>>>> Thank you for your feedback,
>>>>>
>>>>> JL
>>>>>
>>>>>
>>>>>
>>>>> <smv.patch>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Apr 1, 2013, at 11:27 , Jean-Luc Duprat <jduprat at apple.com> wrote:
>>>>>
>>>>>> I just ran into a similar issue with LLVM's SmallVector in case that
>>>>>> was very similar to the one discussed in the standard library.
>>>>>> I light of Howard's comment, SmallVector should preserve the validity
>>>>>> of both iterators/reference in this case:
>>>>>> v.insert(v.begin(), v[3]);
>>>>>> if it is to behave similarly to std::vector.
>>>>>>
>>>>>> I propose the attached fix to SmallVector.
>>>>>>
>>>>>> <SmallVector.patch>
>>>>>>
>>>>>>
>>>>>> JL
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 27, 2013, at 13:47 , Howard Hinnant <hhinnant at apple.com> wrote:
>>>>>>
>>>>>>> On Mar 27, 2013, at 4:34 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 27, 2013 at 12:16 PM, Howard Hinnant <hhinnant at apple.com>
>>>>>>>> wrote:
>>>>>>>> On Mar 27, 2013, at 2:44 PM, Dan Gohman <dan433584 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Author: djg
>>>>>>>>> Date: Wed Mar 27 13:44:56 2013
>>>>>>>>> New Revision: 178166
>>>>>>>>>
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=178166&view=rev
>>>>>>>>> Log:
>>>>>>>>> Avoid undefined behavior from passing a std::vector's own contents
>>>>>>>>> in as an argument to push_back.
>>>>>>>>
>>>>>>>> The original code is not undefined behavior.
>>>>>>>>
>>>>>>>> My interpretation was that the push_back could cause reallocation
>>>>>>>> which could invalidate the reference to the old element before it is read.
>>>>>>>> Is there a guarantee that this won't happen?
>>>>>>>
>>>>>>> There is not an allowance for the vendor to let it happen.  Such an
>>>>>>> allowance would look like the one for insert(p, i, j):
>>>>>>>
>>>>>>>> pre: i and j are not iterators into a.
>>>>>>>
>>>>>>> In practice, the new buffer has to be allocated and populated before
>>>>>>> the old buffer is deallocated, so there is no hardship on the vendor anyway.
>>>>>>>
>>>>>>> The vendor does have to be careful with vector::insert(iterator, const
>>>>>>> T&), as self-referencing can be visible there.  But again in that case the
>>>>>>> standard puts the onus on the vendor, not on the client to just make things
>>>>>>> work.
>>>>>>>
>>>>>>> Note that if you had moved from the source that would be another
>>>>>>> matter as 17.6.4.9 [res.on.arguments]/p1/b3 says:
>>>>>>>
>>>>>>>> f a function argument binds to an rvalue reference parameter, the
>>>>>>>> implementation may assume that this parameter is a unique reference to this
>>>>>>>> argument.
>>>>>>>
>>>>>>> So, for example:
>>>>>>>
>>>>>>> v.insert(v.begin(), std::move(v[3]));
>>>>>>>
>>>>>>> would be asking for trouble, whereas
>>>>>>>
>>>>>>> v.insert(v.begin(), v[3]);
>>>>>>>
>>>>>>> is ok.
>>>>>>>
>>>>>>> Howard
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>




More information about the llvm-commits mailing list