[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."

Mehdi AMINI via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 12 08:50:37 PDT 2019


On Wed, Jun 12, 2019 at 8:05 AM Krzysztof Parzyszek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> > The underlying problem is that the C family of languages mixes two
> orthogonal properties: value range and overflow behavior. There is no
> unsigned type with undefined wraparound. So the question becomes: What
> property is more important to reflect? Do we want to catch unintended
> wraparound behavior using a sanitizer/make optimizations based on it?
>
> That's a valid argument, but I suspect that the vast majority of loops
> using an unsigned induction variable, start at an explicit 0 and go up by
> 1.  Such loops cannot overflow, so there is nothing to catch there.
>

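This isn't entirely true: the code can still contain a comparison such as
`if (idx - 1 < N) { ... }`, which gives a different result from the signed
version when `idx` is unsigned and zero, because `idx - 1` wraps around to
the maximum value instead of becoming -1.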
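
A minimal sketch of that comparison, to make the difference concrete. The
function names are hypothetical and are not taken from the LLVM codebase;
the only thing that differs between the two functions is the signedness of
the parameters.

```cpp
// Hypothetical helpers (not from LLVM): the same check, `idx - 1 < n`,
// written once with unsigned and once with int.

constexpr bool prevBelowLimitUnsigned(unsigned idx, unsigned n) {
  // For idx == 0, idx - 1 wraps around to the largest unsigned value,
  // so the comparison is false for every n.
  return idx - 1 < n;
}

constexpr bool prevBelowLimitSigned(int idx, int n) {
  // For idx == 0, idx - 1 is -1, which is below any non-negative n.
  return idx - 1 < n;
}
```

With `idx == 0` and `n == 10`, the unsigned version returns false and the
signed version returns true, even though the loop that produced `idx` may
well start at an explicit 0 and go up by 1.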

Just like Michael mentioned: there are two aspects of unsigned, and the
wrap-around behavior is the one causing bugs. After being bitten by
unsigned wrap-around twice a year, and because debugging these isn't fun
(and correctness is important and hard), my code reading (like that of the
others I cited initially) has adjusted to be suspicious and careful around
unsigned: if I see unsigned, I need to check more invariants when I read the
code.
Obviously this is a tradeoff: losing the "this can't be negative" property
enforced in the type system is sad, but I haven't encountered any bugs or
issues caused by using int in place of unsigned (whereas I have encountered
bugs caused by using unsigned in place of int).
(Any bug caused by "negative indexing of a container" would be just as much
of a bug with unsigned; please watch Chandler's talk.)
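
A small sketch of the "negative indexing" point, using hypothetical helper
names and assuming an index computed as `pos - back`. When `back` exceeds
`pos`, the result is out of bounds with either type; the unsigned version
just shows up as a very large index instead of a negative one.

```cpp
// Hypothetical helpers (not from LLVM): compute an index by stepping
// backwards from a position, once with int and once with unsigned.

constexpr int indexSigned(int pos, int back) { return pos - back; }

constexpr unsigned indexUnsigned(unsigned pos, unsigned back) {
  // Wraps around when back > pos.
  return pos - back;
}

// Bounds checks for a container holding `size` elements.
constexpr bool inBoundsSigned(int idx, int size) {
  return idx >= 0 && idx < size;
}

constexpr bool inBoundsUnsigned(unsigned idx, unsigned size) {
  return idx < size;
}
```

For `pos == 2`, `back == 3`, and a container of 4 elements, both indices
fail their bounds check: the signed one is -1, the unsigned one is the
largest unsigned value.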

-- 
Mehdi





>
>
> --
> Krzysztof Parzyszek  kparzysz at quicinc.com   LLVM compiler development
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Michael
> Kruse via llvm-dev
> Sent: Tuesday, June 11, 2019 2:26 PM
> To: Zachary Turner <zturner at roblox.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Aaron Ballman <
> aaron.ballman at gmail.com>
> Subject: [EXT] Re: [llvm-dev] [RFC] Coding Standards: "prefer `int` for
> regular arithmetic, use `unsigned` only for bitmask and when you intend to
> rely on wrapping behavior."
>
> On Tue, Jun 11, 2019 at 11:45 AM Zachary Turner via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > I'm personally against changing everything to signed integers.  To me,
> this is an example of making code strictly less readable and more confusing
> in order to fight deficiencies in the language standard.  I get the problem
> that it's solving, but I view this as mostly a theoretical problem, whereas
> being able to read the code and have it make sense is a practical problem
> that we must face on a daily basis.  If you change everything to signed
> integers, you may catch a real problem with it a couple of times a year.
> And by "real problem" here, I'm talking about a miscompile or an actual bug
> that surfaces in production somewhere, rather than a "yes, it seems
> theoretically possible for this to overflow".
>
> Doesn't that already make it worth it?
>
>
> > On the other hand, a large number of people need to work in this
> codebase every day, and multiplied over the same time period, my belief is
> that having the code make sense and be simple has a higher net value.
> >
> > It simply doesn't make sense (conceptually) to use a signed type for
> domains that are inherently unsigned, like the size of an object.  IMO, we
> should revisit this if and when the deficiencies in the C++ Standard are
> addressed.
>
> The underlying problem is that the C family of languages mixes two
> orthogonal properties: value range and overflow behavior. There is no
> unsigned type with undefined wraparound. So the question becomes: What
> property is more important to reflect? Do we want to catch unintended
> wraparound behavior using a sanitizer/make optimizations based on it?
> Do we need the additional range provided by an unsigned type? As Chandler
> says in one of his talks linked earlier: "If you need more bits, use more
> bits" (such as int64_t).
>
> Michael
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>