[LLVMdev] Uninitialized variable - question

Kuperstein, Michael M michael.m.kuperstein at intel.com
Sun Nov 25 08:02:47 PST 2012

Just adding my 2 cents: to the best of my understanding, C99 also makes this behavior explicitly undefined.

From the standard (I think it's the draft C99-TC2):
Sec "If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate."
Appendix J.2:  "The behavior is undefined in the following circumstances: […] The value of an object with automatic storage duration is used while it is indeterminate (6.2.4, 6.7.8, 6.8)."
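
To make the two quoted clauses concrete, here is a minimal C sketch. The function name sum_filled and the buffer size are made up for illustration and do not come from the thread. Each element of the local array is indeterminate until it is assigned, and the code only reads elements it has already written, so it stays clear of the J.2 case.

```c
#include <assert.h>

/* Hypothetical helper, not from the thread: sums the first n entries of a
   local buffer, writing each entry before it is read. */
int sum_filled(int n)
{
    int buf[4];                 /* automatic storage, no initializer:
                                   every element starts out indeterminate */
    int total = 0;              /* explicitly initialized, so determinate */

    assert(n >= 0 && n <= 4);
    for (int i = 0; i < n; ++i)
        buf[i] = i + 1;         /* this element is determinate from here on */

    /* Reading buf[n] .. buf[3] here would use a value while it is still
       indeterminate, so the loop stops at n. */
    for (int i = 0; i < n; ++i)
        total += buf[i];
    return total;
}
```

Calling sum_filled(3) adds 1 + 2 + 3; the fourth element is never assigned and never read.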

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nick Lewycky
Sent: Sunday, November 25, 2012 12:46
To: john skaller
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Uninitialized variable - question

On 11/24/2012 04:14 AM, john skaller wrote:
> On 24/11/2012, at 10:21 PM, Nick Lewycky wrote:
>> Passing an uninitialized value as a function argument is undefined behaviour on the spot, regardless of what the callee does (even if it never references that argument).
> Cite reference? No? Then you're guessing ;)

This is a rule in C++ that I'm not sure also applies to C. The applicable text from N3376 is:

   "When a function is called, each parameter shall be initialized with its corresponding argument." [expr.call]/4

The initialization performs an lvalue-to-rvalue conversion, which is what ultimately triggers the explicit UB:

   "If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior." [conv.lval]/1

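A small sketch of the situation under discussion, written in the common subset of C and C++ (the quoted rules are the C++ ones; whether C says the same is exactly what the thread is unsure about). The names ignores_argument and call_with_initialized_argument are hypothetical. The code shows only the well-defined form and marks in a comment where the problematic variant would differ.

```c
#include <assert.h>

/* Hypothetical callee that never looks at its parameter. */
static int ignores_argument(int unused)
{
    (void)unused;               /* silence unused-parameter warnings */
    return 1;
}

int call_with_initialized_argument(void)
{
    int i = 0;                  /* initialized before the call */

    /* With a plain `int i;` instead, initializing the parameter would
       still have to read i, and under the C++ wording quoted above that
       read is undefined even though the callee ignores the value. */
    return ignores_argument(i);
}
```

The point of the sketch is that the read happens at the call site, as part of initializing the parameter, so nothing the callee does or does not do can avoid it.
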
>> That aside, there is no way that 'i' has the same value, since it has no value.
> This is definitely NOT correct in ISO C. It has an unspecified value, 
> and in C99 that may be a "trap value".
> You state the rules generically but both C99 and C++ have special 
> rules for unsigned char, where use of an uninitialised value is 
> definitely not undefined.

Not to put words in your mouth, but I think you might have been trying to refer to this:

   "For unsigned character types, all possible bit patterns of the value representation represent numbers." [basic.fundamental]

If so, I don't think that means quite what you think it means. It does not change the effect of being uninitialized (or having an "indeterminate value" in C++ parlance); rather, it precludes the possibility of trap bits. Another way to look at it is that an unsigned char is safe to use to examine any byte (pointer aliasing rules are dealt with elsewhere).

And again, I believe that C has effectively the same rules for unsigned char that C++ does, though I haven't a copy of the C standard handy to verify this (I'm on vacation).
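
As an illustration of "safe to use to examine any byte", here is a minimal C sketch. The helper name count_nonzero_bytes is an assumption made for this example. It walks the object representation of an arbitrary object through a pointer to unsigned char, which works because every bit pattern of an unsigned char represents a number. The sketch does not address whether the object being examined may itself be uninitialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the nonzero bytes in the object
   representation of *obj by reading it through unsigned char. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = (const unsigned char *)obj;
    size_t count = 0;

    assert(obj != NULL);
    for (size_t k = 0; k < size; ++k) {
        if (bytes[k] != 0)      /* any bit pattern is a valid unsigned char */
            ++count;
    }
    return count;
}
```

For example, passing the address of an int holding 1 together with sizeof(int) finds one nonzero byte on either byte order.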

>> I should mention that the above is for C++, and I don't have a copy of any of the standards handy, but I expect the rules to be the same for C and C++ here.
> It's VERY unwise to make such assumptions regarding conformance issues 
> since C and C++ have completely distinct conformance models.
> They also treat uninitialised variables distinctly: the rules in C++ 
> were constructed independently of ISO C rules, particularly as in C++ 
> there are classes with constructors etc to consider, and generalised 
> rules covering such cases as well as scalars and aggregates are likely 
> to be distinct and have different consequences in their details.
> One must be aware that Standards are imperfect documents and often 
> specifications in one place are incomplete or even wrong, unless some 
> other place is also considered. You need to be a Standards guru to 
> really know where to find all the relevant clauses.
> Even then, as I pointed out in my prior post on this topic, the 
> Standard itself can be inconsistent, or fail to achieve a normative 
> requirement despite the intent of the committee. This is the case with 
> integer representation rules in C99: it looks reasonable but is 
> actually non-normative gibberish. However the rules do have an impact, 
> and they have a very unfortunate impact in over-constraining integer 
> representations.
> In particular if, by specification of your vendor, you have a full 
> twos complement representation of "int" all possible values of an 
> uninitialised int variable are valid ints and the behaviour of all 
> mathematical operations and copying is then specified by the usual 
> rules: it's undefined only if there is overflow, division by zero, or whatever.
> On a 64 bit machine like x86_64 the usual representations of integers 
> are full, and therefore copying and other operations are well defined 
> (allowing for undefined behaviour on division by zero etc).
> In particular:
> 	int x[2]; int y[2];
> 	x[0]=y[0];
> 	x[1]=y[1];
> is a perfectly valid way to copy a possibly incompletely filled array 
> provided int has a full representation.
> Historically, C was a real mess, with the most traditional copying of 
> arrays of chars aliasing other values being undefined.
> C++ did NOT follow C here. It invented its own, more consistent, set
> of rules. Not sure about C++11 though.

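For comparison with the element-by-element copy in the quoted snippet, here is a hedged C sketch of the byte-wise alternative. The helper names copy_ints and copy_partially_filled are hypothetical. It assumes, as is commonly understood but not established in this thread, that memcpy may copy bytes whose value is indeterminate because it works on the object representation as characters. Only the element that was actually written is read back as an int.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies n ints as raw bytes. */
void copy_ints(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * sizeof *src);
}

/* Fills only the first slot of a two-element array, copies both slots,
   and returns the one element that was actually written. */
int copy_partially_filled(void)
{
    int y[2];
    int x[2];

    y[0] = 7;                   /* y[1] is deliberately left unset */
    copy_ints(x, y, 2);
    return x[0];                /* only the initialized element is read */
}
```

This form does not depend on whether int has a "full" representation, which is the condition the quoted argument relies on.
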
Fair enough! The intent of my argument was to point out that -- surely in C++, and I think in C as well -- the fact that uninitialized values may have different values each time you look at them is made possible not because the standard says so, but because it fails to say what value you will observe. It's undefined behaviour not explicitly (as in my lvalue-to-rvalue conversion text above), but rather by the absence of any text for us to quote.
