[LLVMdev] Is address space 1 reserved?
Herbie Robinson
HerbieRobinson at verizon.net
Sat Jan 10 11:17:41 PST 2015
On 1/9/15 5:06 AM, David Chisnall wrote:
> On 9 Jan 2015, at 00:52, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>> C requires that (void*)0 generates a pointer that does not compare equal to any valid pointer. It does not require that (void*)foo, where foo is an int of value 0 but not an integer constant expression, give the same value,
>> Does this mean constant propagation can change program semantics?
> Yes, that's one of the issues, if you do not enforce this guarantee for all pointers that are derived from integers that have a numerical value of 0. A strict reading of the C standard means that:
>
> void *null = 1-1; // Null pointer, 1-1 is an ICE
> int zero = 0;
> void *c = zero; // Not guaranteed to be null, zero is not an ICE. Will be null (almost?) everywhere, so programmers expect this to work.
> _Bool d = zero == (int)c; // Not guaranteed to be true, but will be (almost?) everywhere so programmers expect it to work.
> _Bool e = 0 == (int)null; // Guaranteed to be true
>
> Trivial constant propagation means that c will be a null pointer, but without it then it may be a pointer to some valid object (although whether you're actually allowed to construct a pointer like this is implementation defined).
>
> Some of my colleagues are working on a parameterisable formal specification for C, covering what the standard says, what compilers implement, and what programmers expect. There's a distressingly large amount that isn't in the intersection of these three.
>
> David
>
You have hit upon the heart of the matter. There is a huge base of
C/C++ code out there that assumes that an integer zero converts to a
null pointer, irrespective of how that zero gets declared or computed.

A long time ago, Stratus wrote a C compiler that had to be as compatible
as possible with PL1. We were confounded by the fact that we used 1
for a null pointer in PL1 (so that most accesses through it would fault
on the 68K). We eventually came to the conclusion that there was no
chance of using 1 for a null pointer value and still being able to port
typical C code. And we try very, very hard to keep all the compilers on
our system (our C, PL1, Cobol, Fortran, Pascal, and now GCC) compatible
with each other. Among other things, we intermix C and PL1 in the
kernel. There is way too much casting of pointers to ints in typical C
code. There are even a lot of standardized APIs and DKIs that liberally
cast pointers to ints (the SVR4 DKI, for example). Of course, it's fine
to use a different null value for specialized environments; just
expect porting issues if you are bringing in arbitrary C code.

Another thing to bear in mind: there is also standardized code that
assumes there are at least 4 distinct pointer values that can't point
to a valid memory address. Look up SIG_DFL, SIG_ERR, SIG_HOLD and
SIG_IGN in the POSIX standard. We actually leave the entire page zero
unmapped to allow for things like this. That is overkill, of course,
but it's easy to drop an entire page, and it's also useful for catching
most null pointer mishaps.