[LLVMdev] LLVM 2.4 problem? (resend)

Mike Stump mrs at apple.com
Wed Oct 15 12:53:18 PDT 2008

On Oct 15, 2008, at 8:28 AM, Chris Lattner wrote:
> On Oct 15, 2008, at 6:58 AM, Tatu Vaajalahti wrote:
>>> Yes, but why do you think they should get a different address?  I can
>>> understand that it is surprising that they do, but determining whether
>>> this is legal or not requires reading the language standard.  Hopefully
>>> a language lawyer can chime in and say whether this transform is valid
>>> or not.
>> I agree the whole construction is a little bit strange (stupid even).
>> It is however common way to specify context identity in one
>> Objective-C pattern (although I don't think anyone actually uses
>> initialized const variables, I was just playing with them to see how
>> compilers put stuff in segments).
>> I do think however that it's bit dangerous to combine static constants
>> across compilation units.
> GCC does the same things with strings in some cases.  You shouldn't
> depend on this behavior if you want portable code.  If you avoid
> marking the global variable const, you should have better luck.

You all are wrong.  Amazingly so.

First, string literals and objects are different.  String literals are
defined like this:

2 Whether all string literals are  distinct  (that  is,  are  stored  in
   nonoverlapping  objects)  is  implementation-defined.

That applies _only_ to string literals, absolutely nothing else.   
Objects are defined like so:

Two pointers of
   the same type compare equal if and only if they are  both  null,  both
   point  to  the same object or function, or both point one past the end
   of the same array.

This means they _must_ compare !=, if they are different objects.
Whether they are the same object or not is answered by the notion of
linkage:

8 An identifier used in more than one translation unit  can  potentially
   refer  to  the same entity in these translation units depending on the
   linkage (_basic.link_) of the identifier specified in each translation
   unit.

2 A  name  is said to have linkage when it might denote the same object,
   reference, function, type, template, namespace  or  value  as  a  name
   introduced by a declaration in another scope:

To be pedantically clear, entity includes objects:

3 An  entity  is a value, object, subobject, base class subobject, array
   element, variable, function, instance of a function, enumerator, type,
   class member, template, or namespace.

Now, you ask, how can we be sure these have no linkage across
translation units?  Because:

3 A name having namespace scope (_basic.scope.namespace_)  has  internal
   linkage if it is the name of

   --an   object,  reference,  function  or  function  template  that  is
     explicitly declared static or,

We know that they do not denote the same object because the rules that  
guide us when they do are not met:

9 Two names that are the same (clause _basic_) and that are declared  in
   different  scopes  shall  denote the same object, reference, function,
   type, enumerator, template or namespace if

   --both names have external linkage or else both  names  have  internal
     linkage and are declared in the same translation unit; and

   --both names refer to members of the same namespace or to members, not
     by inheritance, of the same class; and

   --when  both  names denote functions, the function types are identical
     for purposes of overloading; and

   --when  both  names  denote   function   templates,   the   signatures
     (_temp.over.link_) are the same.

We know that they cannot have linkage across translation units because:

   --When  a  name  has  external  linkage,  the entity it denotes can be
     referred to by names from scopes of other translation units or  from
     other scopes of the same translation unit.

   --When  a  name  has  internal  linkage,  the entity it denotes can be
     referred to by names from other scopes in the same translation unit.
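
To make the linkage rules quoted above concrete, here is a minimal
single-file sketch.  The names shared_value, private_value,
address_via_block_scope() and address_of_private() are made up for
illustration; they are not from the original test case.  It only shows
what can be shown inside one translation unit: a name with external
linkage redeclared in another scope denotes the same object, while a
static const object with the same type and value is a different object.
The cross-unit case needs two files and is not covered here.

```cpp
#include <cassert>

// Hypothetical names, chosen only for this illustration.
extern const int shared_value;       // declared extern: external linkage
const int shared_value = 7;          // definition; keeps external linkage
static const int private_value = 7;  // explicitly static: internal linkage

// A block-scope extern declaration lives in a different scope, yet it
// names the same entity as the namespace-scope declaration above.
const int* address_via_block_scope() {
    extern const int shared_value;
    return &shared_value;
}

// Same type and same value as shared_value, but a different object.
const int* address_of_private() {
    return &private_value;
}
```

Comparing address_via_block_scope() with &shared_value should yield
equal pointers, and comparing address_of_private() with &shared_value
should yield unequal ones.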

Welcome to C and C++ 101.  I'm amazed that this isn't as plain as day
to anyone that works on a compiler.  Kinda basic stuff.  Ignorance of  
the rules doesn't mean you can't just read the words of the standard.   
You don't have to guess.

The standard is meant to be fairly accessible:

Every byte  has  a  unique address.

1 The fundamental storage unit in the C++ memory model is the  byte.

5 Unless it is a bit-field (_class.bit_), a most  derived  object  shall
   have  a  non-zero  size and shall occupy one or more bytes of storage.
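
As a small sketch of what those three quotes imply together, take a type
with no data members at all.  The type Empty, the objects first and
second, and the helper have_distinct_addresses() are hypothetical names
used only here.  Each object still has non-zero size, so each occupies
at least one byte, and since every byte has a unique address the two
objects cannot share one.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};  // hypothetical type with no data members

// Two complete (most derived) objects of the same empty type.
static Empty first;
static Empty second;

// A most derived object has non-zero size, even when it holds no data.
static_assert(sizeof(Empty) >= 1, "a most derived object has non-zero size");

// Each object occupies its own byte(s), so the addresses must differ.
bool have_distinct_addresses() {
    return &first != &second;
}
```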

So, let me state it this way: the addresses _must_ be different.  If the
program can't tell that they are not, you are free to have them be the same.
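
Here is a rough sketch of the context-identity pattern from earlier in
the thread, to show where a program can tell.  The names kContextA,
kContextB and describe() are invented for this example and are not the
original test case; the two keys are deliberately given the same type
and the same value.  Because describe() compares addresses, the two
objects have to stay distinct.

```cpp
#include <cassert>

// Hypothetical context keys: same type, same value, both const.
static const int kContextA = 0;
static const int kContextB = 0;

// Identifies a context purely by its address, never by its value.
const char* describe(const void* context) {
    if (context == &kContextA) return "A";
    if (context == &kContextB) return "B";
    return "unknown";
}
```

If nothing in the program ever observed the two addresses, folding the
objects together would be undetectable and therefore allowed.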
