[LLVMdev] LLVM 2.4 problem? (resend)

Mike Stump mrs at apple.com
Wed Oct 15 22:45:53 PDT 2008


On Oct 15, 2008, at 6:16 PM, David Vandevoorde wrote:
>> A distinction without a difference.  They are trying to word smith
>> this to get it right, however, the underlying semantics are trivial
>> and not in dispute.
>
> On the contrary: Active CWG members disagree on this topic.

My claim would be that the people that wrote the standard knew exactly  
what it meant in this area. We spent quite a bit of time discussing it.

>> The part you seem to be missing is kinda
>> fundamental, unless there is linkage, there is no linkage.  What
>> linkage means is exactly the notion of if two entities refer to the
>> same object or not, or to put it another way, if two entities have the
>> same address.
>
> You're mixing your terms: Linkage is a property of names and types;

Yeah, I should have said:

>> The part you seem to be missing is kinda
>> fundamental, unless there is linkage, there is no linkage.  What
>> linkage means is exactly the notion of if two names denote the
>> same object or not, or to put it another way, if two names have the
>> same address.

but notice, this doesn't change anything unless you want to just argue
about my choice of words and not the meat of the topic.  Anyway, no,  
linkage doesn't have to do with types, just names, for example, names  
of types.  Anyway, yes, one cannot hope to understand what a name  
denotes, without understanding things like linkage.  And no, it isn't  
a mixup:

5 Every  name  that  denotes  an  entity is introduced by a declaration.

and:

8 An identifier used in more than one translation unit  can  potentially
   refer  to  the same entity in these translation units depending on the
   linkage (_basic.link_) of the identifier specified in each translation
   unit.

> not entities (and not objects).

By answering the question, do these two names have linkage, we answer  
the question, do these names refer to the same entity more generally,  
or in the case at hand, do these names refer to the same object.
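A minimal sketch of the "two names, one entity" point, kept to a single translation unit for brevity. The helper names (address_via_block_extern, check_linkage) are hypothetical and exist only for illustration; the block-scope extern declaration redeclares the namespace-scope variable, so both names denote the same object.

```cpp
#include <cassert>

int counter = 0;                 // namespace scope, external linkage

// Hypothetical helper: reaches the object through a second declaration.
int* address_via_block_extern() {
    extern int counter;          // redeclaration; linkage ties it to ::counter
    return &counter;
}

// Both names denote one entity, so the two addresses compare equal.
void check_linkage() {
    assert(address_via_block_extern() == &counter);
}
```

Calling check_linkage() from any function exercises the comparison.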

> Furthermore, linkage deals with cross-scope (mostly, cross-TU)  
> correspondences.  The question here can arise
> within a scope.

That answer in our case is no, they cannot.  You have been shown the  
wording of the standard that states this.  You have not cited anything  
that contradicts this.  You can't win on the:

> We do think that a strict reading of the standard allows the  
> optimization

statement without actually having a reason for that view that is  
backed by the standard.  I've not seen that backing.
I'm trying to change your mind.  I plan on doing this by
identifying the part where we diverge in our reading of the standard.
I can't find that point if you are unwilling to cite it.  Given that
point, I can then help you understand why your reading is wrong, or,
if it isn't, I can adopt your viewpoint.

>> Pointing off the end of an array doesn't necessarily
>> yield a pointer to an object.  It represents a value that can be
>> manipulated to refer to an object, say, with p-1.
>>
>> I can quote n2461 if you prefer:
>>
>> Two pointers of the same type compare equal if and only if they are
>> both null, both point to the same function, or both represent the same
>> address (3.9.2).
>>
>> A valid value of an object pointer type represents either the address
>> of a byte in memory (1.7) or a null pointer (4.10). If an object of
>> type T is located at an address A, a pointer of type cv T* whose value
>> is the address A is said to point to that object, regardless of how
>> the value was obtained.
>>
>> Static and automatic storage durations are associated with objects
>> introduced by declarations (3.1) and implicitly created by the
>> implementation (12.2).
>>
>> A declaration (clause 7) introduces names into a translation unit or
>> redeclares names introduced by previous declarations.
>>
>> An object is a region of storage.
>>
>> An object is created by  a  definition  (_basic.def_)
>
> basic.def doesn't contain the word "create" nor words to that effect,

You must not have searched the standard for those quotes, as they are  
easily findable.  See 1.8p1 in n2461 for this one.

> AFAICT.  Furthermore, basic.life 3.8/7 makes it clear that an object
> may be created on top of another object, and the name of the first
> object will correctly evaluate to the new object.

Yes, this, just in case you don't know why it is there, is because of  
things like placement new, and things like declaring:

char buf[10000], and then allocating objects out of it and using  
them.  This isn't possible in the confines of the standard, without  
all the various special case wording that defines exactly what it means.
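The buffer case mentioned above can be sketched as follows. This is an illustration only: the alignas specifier and the helper names (make_int_in_buf, check_placement) are assumptions added here, not part of the original message.

```cpp
#include <cassert>
#include <new>      // placement operator new

alignas(int) char buf[10000];    // raw storage, suitably aligned for int

// Hypothetical helper: creates an int object on top of buf's storage.
int* make_int_in_buf(int value) {
    return new (buf) int(value);
}

void check_placement() {
    int* p = make_int_in_buf(42);
    assert(*p == 42);
    // The new object lives at the start of the buffer's storage.
    assert(static_cast<void*>(p) == static_cast<void*>(buf));
}
```

The point of the sketch is only that the new object reuses storage the program itself set aside, which is a different situation from two separately declared variables.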

It is irrelevant to the question at hand.  The storage for the objects
at hand is governed by:

All objects which neither have dynamic storage duration nor are local  
have static storage duration. The storage for these
objects shall last for the duration of the program (3.6.2, 3.6.3).

By last, we mean, is allocated and not reused by the compiler or
runtime system.  Only after the program ends does the standard allow
for the storage to be reclaimed or reused.

> So that reasoning doesn't hold.

Try that again.  To speed the process, identify where we depart in our  
reading of the standard.  If you just got lost, because you didn't  
realize that objects are created by definitions, well, that's easy to  
correct.

>> Each declaration _creates_ an object.  The word create means that each
>> has a region of bytes, distinct from all others.
>
> No.

Great, a one word reply.  Sorry, that doesn't cut it in my book.  To  
win, you have to state what your interpretation is.  If your  
contention is that:

int i;
int j;

i = 1;
j = 2;

printf("%d", i);

can print 2 within the confines of the standard, then I can quickly  
dismiss you as a quack and we can end the thread.  The problem is, I  
can't tell what your position is.  I'm reasonably certain that your  
position can't be that the above could print 2, and yet, without the  
understanding that with the declaration of i and j above that this  
creates an object for each, and that this means that each has a  
different address, I'm left with you having no possible sane view of  
the standard.  Once you admit that the above _must_ print 1, then all  
the reasoning you used to figure that out also applies to:

static int i = 1;
static int j = 2;

and likewise to

static const int i = 1;
static const int j = 1;

The only exception, would be some sentence that talked specifically  
about this case and granted the implementation the latitude to merge  
them.  You've not cited that.
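The first two cases can be written out as a small self-checking sketch. The function names are hypothetical; the code only restates the argument that a store to one variable must not be visible through the other, for automatic and for static variables alike.

```cpp
#include <cassert>

// Automatic variables: two declarations, two objects.
int read_local_i() {
    int i;
    int j;
    i = 1;
    j = 2;          // must not disturb i
    (void)j;        // silence "set but not used" warnings
    return i;
}

static int si = 1;
static int sj = 2;

// Static variables: the same reasoning applies.
int read_static_i() {
    sj = 3;         // a store to sj must not be visible through si
    return si;
}

void check_distinct_values() {
    assert(read_local_i() == 1);
    assert(read_static_i() == 1);
}
```

For the const pair with equal values, reading the values cannot tell the objects apart; only their addresses can.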

>>> The changes to address Core issue 73 invalidates your reasoning
>>
>> No, they don't.  I'm describing a fundamental feature of C and C++
>> that cannot be disputed.
>
> It sure can.  In fact it _is_ being disputed.

No.  It cannot be.  You will discover that the standard isn't useful
at all if there is a dispute, and that means that the person
trying to suggest the standard isn't useful at all is wrong.  We
ignore these types of people as lunatics.  I'd suspect they are merely
talking about changing the standard to permit merging.

>> The most that can be done is to add a rule
>> that says objects declared with a const type can be merged, meaning
>> that objects with the same value may (implementation defined or
>> unspecified) refer to the address of one of them.  A rule like this
>> could be added to C or C++, but has not been to the best of my
>> knowledge.
>
>
> The current consensus among CoreWG experts is that the words in the
> current standard (and those in the current WP) do not require distinct
> variables

They are wrong.  I'm sorry to hear it.  It would be a misreading of  
the standard to think it is not required.  I can concede that the  
wording could be improved.

When things don't have to be distinct we explicitly spelled it out:

   It is unspecified whether such a variable has an address distinct
from that of any other object in the program.

to countermand the mandate that would otherwise exist.  If that  
weren't the case, we would have removed the wording as redundant.  You  
can also see evidence of our thinking in wording like:

   --in each definition of D, corresponding names, looked up according to
     _basic.lookup_, shall refer to an entity defined within the definition
     of D, or shall refer to the same entity, after overload resolution
     (_over.match_) and after matching of partial template specialization
     (_temp.over_), except that a name can refer to a const object with
     internal or no linkage if the object has the same integral or
     enumeration type in all definitions of D, and the object is
     initialized with a constant expression (_expr.const_), and the value
     (but not the address) of the object is used, and the object has the
     same value in all definitions of D; and

as well.  Here, because the values are the same, we allow them in an  
ODR sense.  We knew that the address had to be distinct, and so, we  
don't permit the use of the address.
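A rough sketch of the value-versus-address distinction in that wording, assuming the declarations below sit in a header included by several translation units. The names (limit, use_value, use_address) are made up for illustration. Each translation unit would get its own internal-linkage limit, so only the function that reads the value fits the quoted exception.

```cpp
#include <cassert>

static const int limit = 4;      // const object with internal linkage

// Uses only the value: the same in every translation unit.
inline int use_value() { return limit; }

// Uses the address: it would name a different object in each
// translation unit, which is what the quoted exception excludes.
inline const int* use_address() { return &limit; }

void check_odr_sketch() {
    assert(use_value() == 4);
    assert(*use_address() == 4);  // fine within this one translation unit
}
```

In a single translation unit both functions behave; the difference only shows once the header is included more than once.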

> and temporaries to have distinct addresses per se.  (In most
> cases, distinct addresses are the only possible implementation, but
> even without const qualification there are cases that could be
> implemented with overlapping storage within the standard's words.
> E.g., if variables are address-exposed but never value-exposed.)

Curiously enough, it is the address-exposed cases that give rise to  
the requirement they be distinct.  If the address isn't exposed, and  
one can't otherwise tell, then they can be merged.  Consuming the  
address is but one way a person is able to notice that they were  
merged, and if they can notice it, then the implementation is not able  
to use the as-if rule.
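As an illustration of how a program can notice merging, here is a minimal sketch with hypothetical helper names. Once the addresses of the two objects are compared, the result is observable behavior, so under the reading argued here the implementation cannot fold the two objects into one.

```cpp
#include <cassert>

static const int i = 1;
static const int j = 1;          // same type, same value, separate definition

// Hypothetical observer: any code that compares the two addresses.
bool same_object(const int* p, const int* q) { return p == q; }

// True when the program can tell the two objects apart.
bool program_can_tell() { return !same_object(&i, &j); }

void check_address_exposed() {
    assert(program_can_tell());
}
```

If neither address were ever taken, nothing in the program could distinguish the merged and unmerged layouts, which is the case the as-if rule covers.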

> There is currently _no_ CoreWG consensus about whether that  
> situation is intended and/or desirable.

I could go set them straight if I thought they were in any
danger of getting it wrong.  They won't.  If they want to make things
mergeable, that would be a change, but a small one.  If they did that,
you'd notice that they would change the wording to permit merging in
the narrow case and to spell out the fact that in all other cases, the
objects are distinct, don't overlap and cannot be merged.


