[LLVMdev] Uninitialized variable - question

Nick Lewycky nicholas at mxc.ca
Sun Nov 25 02:46:13 PST 2012


On 11/24/2012 04:14 AM, john skaller wrote:
>
> On 24/11/2012, at 10:21 PM, Nick Lewycky wrote:
>
>>
>> Passing an uninitialized value as a function argument is undefined behaviour on the spot, regardless of what the callee does (even if it never references that argument).
>
> Cite reference? No? Then you're guessing ;)

This is a rule in C++ that I'm not sure also applies to C. The 
applicable text from N3376 is:

   "When a function is called, each parameter shall be initialized with 
its corresponding argument." [expr.call]/4

That initialization performs an lvalue-to-rvalue conversion on the 
argument, which is what ultimately triggers explicit UB:

   "If the object to which the glvalue refers is not an object of type T 
and is not an object of a type derived from T, or if the object is 
uninitialized, a program that necessitates this conversion has undefined 
behavior." [conv.lval]/1
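
To make that concrete, here is a minimal sketch of the situation under 
discussion. The function names are hypothetical and chosen only for 
illustration. The well-defined call is live code; the problematic call 
is left commented out, since under my reading of the text quoted above 
it would be undefined even though the callee never reads its parameter.

```cpp
// Hypothetical callee that never reads its parameter.
int ignores_argument(int) { return 0; }

// Well defined: 'i' is initialized before the lvalue-to-rvalue
// conversion that initializes the parameter.
int call_with_initialized_argument() {
    int i = 0;
    return ignores_argument(i);
}

// Under the reading of [expr.call]/4 and [conv.lval]/1 given above,
// this would be undefined behaviour, so it is left commented out:
//
// int call_with_uninitialized_argument() {
//     int i;                       // indeterminate value
//     return ignores_argument(i);  // conversion of uninitialized 'i'
// }
```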

>> That aside, there is no way that 'i' has the same value, since it has no value.
>
> This is definitely NOT correct in ISO C. It has an unspecified value,
> and in C99 that may be a "trap value".
>
> You state the rules generically but both C99 and C++ have special
> rules for unsigned char, where use of an uninitialised value
> is definitely not undefined.

Not to put words in your mouth, but I think you might have been trying 
to refer to this:

   "For unsigned character types, all possible bit patterns of the value 
representation represent numbers." [basic.fundamental]

? If so, I don't think that means quite what you think it means. It does 
not change the effect of being uninitialized (or having "indeterminate 
value" in C++ parlance); rather, it precludes the possibility of trap 
representations for that type. Another way to look at it is that an 
unsigned char is safe to use to examine any byte (pointer aliasing rules 
are dealt with elsewhere).
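
As a rough sketch of what "examine any byte" means in practice, the 
helper below walks the object representation of an initialized object 
through an unsigned char pointer. The name count_nonzero_bytes is 
hypothetical, and the sketch deliberately only inspects initialized 
storage, so it says nothing about reading indeterminate values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the nonzero bytes in the object representation of the n-byte
// object at p. Reading through unsigned char is permitted for any
// object type, and every bit pattern is a valid unsigned char value.
std::size_t count_nonzero_bytes(const void* p, std::size_t n) {
    assert(p != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}
```

For example, calling it on a std::uint32_t holding 0x01020304 visits 
four bytes, none of which is zero, whatever the byte order of the 
machine.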

And again, I believe that C has effectively the same rules for unsigned 
char that C++ does, though I haven't a copy of the C standard handy to 
verify this (I'm on vacation).

>> I should mention that the above is for C++, and I don't have a copy of any of the standards handy, but I expect the rules to be the same for C and C++ here.
>
> It's VERY unwise to make such assumptions regarding conformance
> issues since C and C++ have completely distinct conformance models.
> They also treat uninitialised variables distinctly: the rules in C++ were
> constructed independently of ISO C rules, particularly as in C++ there
> are classes with constructors etc to consider, and generalised rules
> covering such cases as well as scalars and aggregates are likely
> to be distinct and have different consequences in their details.
>
> One must be aware that Standards are imperfect documents and often
> specifications in one place are incomplete or even wrong, unless
> some other place is also considered. You need to be a Standards guru
> to really know where to find all the relevant clauses.
>
> Even then, as I pointed out in my prior post on this topic, the Standard
> itself can be inconsistent, or fail to achieve a normative requirement
> despite the intent of the committee. This is the case with integer
> representation rules in C99: it looks reasonable but is actually
> non-normative gibberish. However the rules do have an impact,
> and they have a very unfortunate impact in over-constraining
> integer representations.
>
> In particular if, by specification of your vendor, you have a full
> twos complement representation of "int" all possible values of an uninitialised
> int variable are valid ints and the behaviour of all mathematical operations
> and copying is then specified by the usual rules: it's undefined only if there
> is overflow, division by zero, or whatever.
>
> On a 64 bit machine like x86_64 the usual representations of
> integers are full, and therefore copying and other operations
> are well defined (allowing for undefined behaviour on division
> by zero etc).
>
> In particular:
>
> 	int x[2]; int y[2];
> 	x[0]=y[0];
> 	x[1]=y[1];
>
> is a perfectly valid way to copy a possibly incompletely
> filled array provided int has a full representation.
>
> Historically, C was a real mess, with the most traditional
> copying of arrays of chars aliasing other values being undefined.
>
> C++ did NOT follow C here. It invented its own, more consistent, set
> of rules. Not sure about C++11 though.

Fair enough! The intent of my argument was to point out that, certainly 
in C++ and I think in C as well, an uninitialized variable may appear to 
have a different value each time you look at it not because the standard 
says so, but because it fails to say what value you will observe. In 
that respect it is undefined behaviour not explicitly (as in my 
lvalue-to-rvalue conversion text above), but rather by the absence of 
any text for us to quote.

Nick


