[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."

Sat Jun 8 10:17:43 PDT 2019

Hi,

The LLVM coding style does not specify anything about the use of
signed/unsigned integers, and the codebase is inconsistent (though the
majority of loops today use an unsigned index).

I'd like to suggest that we specify a preference for `int` when possible and
use `unsigned` only for bitmasks and when you intend to rely on wrapping
behavior; see: https://reviews.llvm.org/D63049

A lot has been written about this; good references are
[unsigned: A Guideline for Better Code](https://www.youtube.com/watch?v=wvtFGa6XJDU)
and
[Garbage In, Garbage Out: Arguing about Undefined Behavior...](https://www.youtube.com/watch?v=yG1OZ69H_-o),
as well as this panel discussion:
- https://www.youtube.com/watch?v=Puio5dly9N8#t=12m12s
- https://www.youtube.com/watch?v=Puio5dly9N8#t=42m40s

Other coding guidelines already embrace this:

- http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-signed
- http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-unsigned
- https://google.github.io/styleguide/cppguide.html#Integer_Types

It is rare that overflowing (and wrapping) an unsigned integer won't
trigger a program bug when the overflow was not intentionally handled.
Using signed arithmetic means that you can actually trap on over/underflow
and catch these bugs (when using fuzzing for instance).
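
To make the contrast concrete, here is a minimal sketch. The helper names
`addWrapping`, `addChecked` and `demoOverflow` are made up for illustration
and are not LLVM APIs. With a plain signed `A + B`, building with
`-fsanitize=signed-integer-overflow` or `-ftrapv` can report the overflow at
run time; the sketch instead uses `__builtin_add_overflow` (a GCC/Clang
builtin, so this assumes one of those compilers) to show the same detection
without executing undefined behavior.

```cpp
#include <cassert>
#include <limits>

// Unsigned addition wraps modulo 2^N. This is well defined, so the result
// is silently wrong whenever the wrap was not intended.
unsigned addWrapping(unsigned A, unsigned B) { return A + B; }

// Hypothetical checked signed addition. Returns false on overflow instead
// of performing the undefined operation. __builtin_add_overflow is a
// GCC/Clang builtin, not standard C++.
bool addChecked(int A, int B, int &Result) {
  return !__builtin_add_overflow(A, B, &Result);
}

// Small demonstration of both behaviors.
void demoOverflow() {
  // UINT_MAX + 1 wraps to 0 without any diagnostic.
  assert(addWrapping(std::numeric_limits<unsigned>::max(), 1u) == 0u);

  int Sum = 0;
  assert(addChecked(1, 2, Sum) && Sum == 3);
  // INT_MAX + 1 does not fit in an int, so the overflow is reported.
  assert(!addChecked(std::numeric_limits<int>::max(), 1, Sum));
}
```

The unsigned case gives a tool nothing to flag, because wrapping is the
specified behavior; the signed case is the one where an overflow can be
treated as a bug and caught.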

Chandler explained this nicely in his CppCon 2016 talk "Garbage In, Garbage
Out: Arguing about Undefined Behavior...". I encourage you to watch the full
talk, but here is one relevant example: https://youtu.be/yG1OZ69H_-o?t=2006
He also mentions how Google experimented with this internally:
https://youtu.be/yG1OZ69H_-o?t=2249

Unsigned integers also have a discontinuity immediately to the left of zero.
Suppose A, B and C are small positive integers close to zero, say all less
than a hundred or so. Then given:
A + C > B
and knowing elementary school algebra, one can rewrite that as:
A > B - C
But C might be greater than B, and the subtraction would produce some huge
number. This happens even when working with seemingly harmless numbers like
A=2, B=2, and C=3.
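
As a rough illustration, the sketch below evaluates the subtraction form
`A > B - C` once with `unsigned` operands and once with `int` operands, using
the values A=2, B=2, C=3 from above. The function names are hypothetical and
exist only for this example.

```cpp
#include <cassert>

// Hypothetical helper: the comparison in unsigned arithmetic.
// When C > B, B - C wraps around to a value near UINT_MAX.
bool exceedsUnsigned(unsigned A, unsigned B, unsigned C) { return A > B - C; }

// Hypothetical helper: the same comparison in signed arithmetic.
// When C > B, B - C is simply a small negative number.
bool exceedsSigned(int A, int B, int C) { return A > B - C; }

// Checks the example values A=2, B=2, C=3.
void checkExample() {
  // Signed: 2 > (2 - 3) is 2 > -1, which holds as algebra predicts.
  assert(exceedsSigned(2, 2, 3));
  // Unsigned: 2u - 3u wraps to UINT_MAX, so 2 > UINT_MAX does not hold.
  assert(!exceedsUnsigned(2, 2, 3));
}
```

The two functions have identical bodies; only the operand type differs, and
that alone flips the result.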

-- 
Mehdi