[PATCH] D33104: [BitVector] Implement find_[first/last]_[set/unset]_in

Zachary Turner via llvm-commits llvm-commits at lists.llvm.org
Wed May 17 10:05:54 PDT 2017


I mean, if we're gonna go through the trouble of changing the return value
or the arguments to *anything*, then we already have to deal with the
potential consequences of fixing up every single call site.  So it's hard
to avoid changing the interface without rocking the boat.  And if we're
going to rock the boat anyway, might as well do it in the way that everyone
prefers.

On Wed, May 17, 2017 at 10:04 AM Chandler Carruth <chandlerc at google.com>
wrote:

> On Wed, May 17, 2017 at 9:50 AM Zachary Turner via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> At the risk of asking a stupid question, how about returning
>> Optional<unsigned> (in a subsequent patch of course)?
>>
>
> It seems a shame to have a separate bool rather than using the fact that
> '-1' (or 'npos' equivalently) is a sentinel value indicating failure.
>
> But I can't imagine it matters a lot for these routines.
>
> However, I don't think it does much to address the reasons I generally
> prefer signed arithmetic here. If your goal is just to not rock the boat
> and make everything unsigned, I don't guess I care much between `npos` and
> `Optional<unsigned>`.
>
> -Chandler
>
>
>> On Wed, May 17, 2017 at 8:36 AM Chandler Carruth via Phabricator <
>> reviews at reviews.llvm.org> wrote:
>>
>>> chandlerc accepted this revision.
>>> chandlerc added a comment.
>>> This revision is now accepted and ready to land.
>>>
>>> LGTM with the assert changes.
>>>
>>>
>>>
>>> ================
>>> Comment at: llvm/include/llvm/ADT/BitVector.h:151
>>> +  /// [Begin, End).  Returns -1 if all bits in the range are unset.
>>> +  int find_first_in(unsigned Begin, unsigned End) const {
>>> +    if (Begin >= End)
>>> ----------------
>>> zturner wrote:
>>> > chandlerc wrote:
>>> > > Given that we return an int, I would prefer that the arguments also
>>> be int (and the intermediate variables within).
>>> > The only reason we return an int is because we have to use a sentinel
>>> value to indicate "no bits found".  The methods that these originated from
>>> `find_next`, `find_prev`, etc have always taken `unsigned` arguments even
>>> before I started tinkering on `BitVector`.  Do you still think it's a good
>>> idea to change all of them?
>>> I mean, my view is that we're doing a lot of arithmetic on these things
>>> including potentially subtraction. So generally, unless there is a problem
>>> with using a signed integer, I would prefer that.
>>>
>>> If we return `int`, then we've given up on having more than `2^31 - 1`
>>> bits in one of these containers.
>>>
>>> That said, a bitvector is one of the few datastructures where it seems
>>> entirely conceivable that we'll actually end up with 500mb vector of bits
>>> and an `int` index won't work. But an `unsigned` won't either, so I suspect
>>> that isn't the deciding factor and if we care we need a 64-bit type.
>>>
>>>
>>> So I guess my feeling is that we should do one of two things here:
>>> a) Switch the return types to an unsigned type and use something like
>>> `npos`, or
>>> b) Switch the argument types to a signed type and continue to use `-1`.
>>>
>>> (Whether we use 32-bits or 64-bits should be orthogonal as I see no
>>> reason why for a *bit* count 2gb vs 4gb would be "enough".)
>>>
>>> However, looking at the code, I agree that the patch as-is remains
>>> consistent, and you shouldn't do either (a) or (b) in *this* patch. But I'd
>>> somewhat like to consider doing one of them in a subsequent patch.
>>>
>>> My personal, strong preference is to use (b) because I find asserts
>>> easier to write for signed types in the face of arithmetic and we have
>>> strong usage of UBSan to ensure we don't develop bugs here. I also have a
>>> somewhat powerful dislike of `npos`. But I think either (a) or (b) would be
>>> an improvement. Anyways, sorry for the digression.
>>>
>>>
>>> ================
>>> Comment at: llvm/include/llvm/ADT/BitVector.h:152-153
>>> +  int find_first_in(unsigned Begin, unsigned End) const {
>>> +    if (Begin >= End)
>>>        return -1;
>>>
>>> ----------------
>>> zturner wrote:
>>> > chandlerc wrote:
>>> > > Why isn't this an assert?
>>> > I suppose I should assert if `Begin > End`, but `Begin == End` should
>>> probably still just return -1 I think.
>>> Ah, yes, that makes sense.
>>>
>>>
>>> https://reviews.llvm.org/D33104
>>>
>>>
>>>
>>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170517/e92cc727/attachment.html>


More information about the llvm-commits mailing list