[PATCH] D33104: [BitVector] Implement find_[first/last]_[set/unset]_in

Daniel Berlin via llvm-commits llvm-commits at lists.llvm.org
Wed May 17 08:57:45 PDT 2017


Yeah, I was about to suggest this.
We could specialize Optional for POD types so it doesn't spend time trying
to new a buffer to store them :)


On Wed, May 17, 2017 at 8:50 AM, Zachary Turner <zturner at google.com> wrote:

> At the risk of asking a stupid question, how about returning
> Optional<unsigned> (in a subsequent patch of course)?
>
> On Wed, May 17, 2017 at 8:36 AM Chandler Carruth via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
>> chandlerc accepted this revision.
>> chandlerc added a comment.
>> This revision is now accepted and ready to land.
>>
>> LGTM with the assert changes.
>>
>>
>>
>> ================
>> Comment at: llvm/include/llvm/ADT/BitVector.h:151
>> +  /// [Begin, End).  Returns -1 if all bits in the range are unset.
>> +  int find_first_in(unsigned Begin, unsigned End) const {
>> +    if (Begin >= End)
>> ----------------
>> zturner wrote:
>> > chandlerc wrote:
>> > > Given that we return an int, I would prefer that the arguments also
>> be int (and the intermediate variables within).
>> > The only reason we return an int is because we have to use a sentinel
>> value to indicate "no bits found".  The methods that these originated from
>> `find_next`, `find_prev`, etc have always taken `unsigned` arguments even
>> before I started tinkering on `BitVector`.  Do you still think it's a good
>> idea to change all of them?
>> I mean, my view is that we're doing a lot of arithmetic on these things
>> including potentially subtraction. So generally, unless there is a problem
>> with using a signed integer, I would prefer that.
>>
>> If we return `int`, then we've given up on having more than `2^31 - 1`
>> bits in one of these containers.
>>
>> That said, a bitvector is one of the few datastructures where it seems
>> entirely conceivable that we'll actually end up with 500mb vector of bits
>> and an `int` index won't work. But an `unsigned` won't either, so I suspect
>> that isn't the deciding factor and if we care we need a 64-bit type.
>>
>>
>> So I guess my feeling is that we should do one of two things here:
>> a) Switch the return types to an unsigned type and use something like
>> `npos`, or
>> b) Switch the argument types to a signed type and continue to use `-1`.
>>
>> (Whether we use 32-bits or 64-bits should be orthogonal as I see no
>> reason why for a *bit* count 2gb vs 4gb would be "enough".)
>>
>> However, looking at the code, I agree that the patch as-is remains
>> consistent, and you shouldn't do either (a) or (b) in *this* patch. But I'd
>> somewhat like to consider doing one of them in a subsequent patch.
>>
>> My personal, strong preference is to use (b) because I find asserts
>> easier to write for signed types in the face of arithmetic and we have
>> strong usage of UBSan to ensure we don't develop bugs here. I also have a
>> somewhat powerful dislike of `npos`. But I think either (a) or (b) would be
>> an improvement. Anyways, sorry for the digression.
>>
>>
>> ================
>> Comment at: llvm/include/llvm/ADT/BitVector.h:152-153
>> +  int find_first_in(unsigned Begin, unsigned End) const {
>> +    if (Begin >= End)
>>        return -1;
>>
>> ----------------
>> zturner wrote:
>> > chandlerc wrote:
>> > > Why isn't this an assert?
>> > I suppose I should assert if `Begin > End`, but `Begin == End` should
>> probably still just return -1 I think.
>> Ah, yes, that makes sense.
>>
>>
>> https://reviews.llvm.org/D33104
>>
>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170517/41d11ca4/attachment.html>


More information about the llvm-commits mailing list