[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.

Tue Dec 1 14:33:07 PST 2020

I feel that this thread has three different discussions going on, which should be separated out:

1) Should we forbid std::vector and use SmallVector<T,0> always?
2) What should the default argument of N be?
3) What should these types be called?

Unfortunately, these are all interrelated.

> On Nov 30, 2020, at 6:04 PM, Sean Silva <chisophugis at gmail.com> wrote:
> I'm okay with either 1 or 0 inlined elements for huge value types.
> 
> I can buy the argument both ways. It basically comes down to the invariants we want to provide:
> 
> - invariant for "minimum 0 inlined elements" is "sizeof(SmallVector<T>) <= 64"
>   - this in theory provides stronger invariants related to excessive memory usage from the inlined elements
> - invariant for "minimum 1 inlined elements" is "at least one inlined element, possibly more if we can fit it in 64 bytes"
>   - this avoids confusion of SmallVector<T> not having inlined elements

I think there are two different consistent design points depending on naming:

1) In a design like today where we have two names for “inlined vector” and “not inlined vector”, then I think it is important for “InlinedVector<T>” to have at least 1 element.  Defaulting to out of line when using the inline vector name betrays intention: it would be better to generate a compile time error or warning if the element size is too long, because the compiler engineer should use the out of line name.

2) In a design where we have "one name", then it makes more sense to have the default argument type be the “do what I mean” size, which defaults to zero if T is too large.

I’m pretty skeptical of #2 as an approach, because inline and out of line design points are pretty different when sizeof(T) is 4 or 8, and when sizeof(thevector) matters, e.g. in 2D cases.

> I don't think either choice really boxes us out of any future change or even matters much. So why don't we all rally around the 1 inlined element minimum case? Given the name "SmallVector" (which is a bad name, but folks will read it as "vector with inlined elements") it seems least surprising for now.
> 
> And in practice huge value types are very rare, so honestly I don't think that the invariant provided by "minimum 0" really buys us much in practice when viewed holistically. (also, both cases prevent SmallVector<SmallVector<T>> from multiplicatively exploding in size which makes me happy).
> 
> I actually was mildly leaning to the "minimum 0" side, but after writing the above I'm now leaning towards "minimum 1".

Yes, I agree with this.

> There's a larger discussion to be had here which I don't want to block this change on: Some folks (including me) believe that SmallVector has drifted from "used for small number of elements" and is in practice more like "LLVM's better vector" (including using 0 inline elements to replace std::vector) and the latter interpretation makes the "minimum 0" case make more sense. But we haven't codified the "use SmallVector / a new llvm::Vector unless you know you need std::vector" policy, and once we do then switching to the "minimum 0" case is pretty trivial. I'm very happy to have that discussion, but it seems incongruous to block the "default N" change on having that discussion.

I have found this discussion to be useful and interesting, I hadn’t thought about it this much at least.

Overall, I’d recommend a path like this:

1) We decide what to do about the default argument.  I agree with Sean that SmallVector<T> should default to 1 at the minimum, and produce an error or warning of T.  This makes sense given the bias towards a name that implies inline semantics.

2) We decide whether we want to ban std::vector in the LLVM code base by convention.  If so, I think that we should have a *separate* name for the out of line case, e.g. llvm::Vector<T>, which would be a good dual to llvm::SmallVector<T>.  If this is the end point (Duncan seems to think it should be) then we should make this part of the coding standard and move the code base towards it over time.

3) Finally, we decide if we want to rename SmallVector to something like InlineVector or IVector.

-Chris

> 
> -- Sean Silva
> 
> On Fri, Nov 27, 2020 at 8:45 PM Chris Lattner <clattner at nondot.org <mailto:clattner at nondot.org>> wrote:
> Sorry for falling off the map on this thread:
> 
> On Nov 17, 2020, at 1:42 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>> Thoughts/suggestions:
>>> - Adding the default seems very reasonable to me, and I think that 64 bytes is a good default.  I think you should change the behavior so that SmallVector<LargeThing> defaults to a single inline element instead of zero though.  Perhaps generate a static_assert when it is crazy large.
>> 
>> Out of curiosity: Why a single rather than zero?
> 
> My rationale for this is basically that SmallVector is typically used for the case when you want to avoid an out-of-line allocation for a small number of elements, this was the reason it was created.  While there is some performance benefits of SmallVector<T,0> over std::vector<> they are almost trivial.  I don’t see why we’d recommend SmallVector<T,0> over vector to get those.  If we were in favor of banning std::vector, then I think the reason would be for consistency across the codebase and to get things like pop_back_val().
> 
> One other way to handle this is to make there be a static_assert for the case where the size is inferred to zero.
> 
>> On Nov 25, 2020, at 10:55 PM, Michael Kruse via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>> Am Mi., 25. Nov. 2020 um 17:52 Uhr schrieb David Blaikie via llvm-dev
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>:
>>> I'm still not sure why "at least 1" is an important goal. I'd be
>>> curious to hear from Chris/others why that's especially valuable.
>> 
>> If N is 0, then it's not a small-size optimized vector, in contrast to
>> what the name implies.
> 
> 
> Right, exactly.
> 
>> On Nov 17, 2020, at 2:15 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com <mailto:dexonsmith at apple.com>> wrote:
>> I'd be a bit concerned about this for two reasons:
>> 
>> - Adding a large inlined element by-default may reinforce one of the problems with SmallVector, that people accidentally put big vectors in places that shouldn't have them. If we're going to make accidentally large inlined vectors more ergonomic, I think we ought to make 0-element versions more ergonomic too.
> 
> I don’t understand your rationale here: the issue in question comes from when T is large not when the count is large (since we’re talking about the inferred count case).
> 
>> - The `static_assert` is going to fail differently on different platforms, since `T`'s size will tend to be platform-dependent for large `T`. I'm not sure it'll be predictable enough.
> 
> 
> I agree this is a problem.  One way to do that is to make the assert only trip on sizeof(void*)==8 hosts or something to make it more “incomplete but (more) stable”.
> 
> -Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201201/b0f279ac/attachment.html>