[LLVMdev] MachineOperand: GlobalAddress vs. ExternalSymbol

Sat Jun 19 17:51:01 PDT 2004

On Sat, 19 Jun 2004, Vladimir Prus wrote:

> > > There's another issue I don't understand. The module consists of
> > > functions and constants. I'd expect that external function declarations
> > > are also constants, with appropriate type. However, it seems they are not
> > > included in [Module::gbegin(), Module::gend()]; instead, they are Function
> > > objects with isExternal set to true.
> >
> > Module::gbegin/gend iterate over the global variables, and ::begin/end
> > iterate over the functions, some of which may be prototypes.  Function
> > prototypes aren't really any more "constant" than other functions are.
>
> I disagree. Say there's a declaration of external function "printf". Then
> it's just a constant global address. In assembler it will be
>
>    extern printf: label;
>
> which is not that different from assembler for other constants. For
> example, for external data reference I have to produce the same
> assembler.

Of course, global variable addresses are link-time constants.  This is the
motivation for the ConstantPointerRef class, and is why GlobalValue will
eventually derive from Constant (see earlier discussion).

My point was that function prototypes are no different than functions with
bodies in this respect.  Also, you have to be careful to distinguish
between the fact that the *address* of a global is always a constant,
regardless of whether it is a global variable, function, internal, or
external... but the *contents* of a global variable are only sometimes
constant (indicated by GlobalVariable::isConstant()).
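
As a rough analogy in plain C++ (not LLVM API; every name below is made up
for illustration), the address of any global object or function is a
constant expression, whether or not the function has a body yet, while the
contents are constant only when declared so:

```cpp
#include <cassert>

int counter = 0;        // contents are mutable
const int limit = 10;   // contents are constant
int twice(int x);       // prototype only; the body appears further down

// The addresses are compile/link-time constants in all three cases,
// regardless of whether the contents can change or a body exists yet.
constexpr int *counterAddr = &counter;
constexpr const int *limitAddr = &limit;
constexpr int (*twiceAddr)(int) = &twice;

int twice(int x) { return 2 * x; }

// Usage sketch: the contents of 'counter' change, its address does not.
void addressDemo() {
    counter = 5;
    assert(counterAddr == &counter);
    assert(*counterAddr == 5);
    assert(*limitAddr == 10);
    assert(twiceAddr(limit) == 20);
}
```

The pointer to twice() is formed before the body is seen, which is the
sense in which a prototype is no different from a defined function here.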

> BTW, there's an inconsistency in how the X86 backend handles constants and
> functions.  Consider:
>
> %.str_1 = constant [11 x sbyte] c"'%c' '%c'\0A\00"
>
> implementation   ; Functions:
>
> declare int %printf(sbyte*, ...)
>
> int %main() {
> entry:
>         %tmp.0.i = call int (sbyte*, ...)*
>         %printf( sbyte* getelementptr ([11 x sbyte]*  %.str_1, long 0, l
>         ret int 0
> }
>
>
> The assembler produced by the X86 backend is:
>
>         call printf
> ........
>         .globl _2E_str_1
>         .data
>         .align 1
>         .type _2E_str_1,@object
>         .size _2E_str_1,11
> _2E_str_1:
>
> That is, the name of ".str_1" is mangled, but the name of the function is not. I
> don't see the reasons for different handling of those two kinds of names.

We don't mangle names unless we have to.  The NameMangler interface
encapsulates this behavior.  In particular we only mangle a name if it's
an internal symbol, if there are two globals named the same thing, or if
there is an invalid character (like '.') in the name.
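
A minimal sketch of that decision, not the actual NameMangler code: the
function names and the set of "valid" characters (letters, digits, '_') are
assumptions for illustration. The _XX_ hex escape is inferred only from the
_2E_str_1 output quoted above. How internal or duplicate names get
disambiguated is not shown in this thread, so the sketch leaves that out.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Assumption: only letters, digits and '_' are safe in an assembler symbol.
static bool isPlainSymbolChar(unsigned char c) {
    return std::isalnum(c) != 0 || c == '_';
}

// Hypothetical predicate mirroring the three conditions listed above.
bool needsMangling(const std::string &name, bool isInternal, bool hasNameClash) {
    if (isInternal || hasNameClash)
        return true;
    for (unsigned char c : name)
        if (!isPlainSymbolChar(c))
            return true;
    return false;
}

// Replace each unsafe character with _XX_, where XX is its uppercase hex code.
std::string escapeSymbolName(const std::string &name) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : name) {
        if (isPlainSymbolChar(c)) {
            out += static_cast<char>(c);
        } else {
            out += '_';
            out += hex[c >> 4];
            out += hex[c & 15];
            out += '_';
        }
    }
    return out;
}

// Usage sketch: "printf" passes through, the '.' in ".str_1" forces escaping.
void mangleDemo() {
    assert(!needsMangling("printf", false, false));
    assert(needsMangling(".str_1", false, false));
    assert(escapeSymbolName(".str_1") == "_2E_str_1");
}
```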

> > > To me this seems a bit confusing -- it would be clearer if there were
> > > plain functions with bodies and everything else were GlobalValue.
> >
> > The reason that we don't want to do this is that it makes it more
> > difficult to create a function and then fill in its body.  Currently when
> > you create a function, you get a prototype.  When you fill in its body,
> > you now have a defined function.  In your scheme, the function prototype
> > and defined function objects would be different: to go from one to the
> > other, you would have to delete the object and reallocate it.
>
> Can't you store all functions in the list of global values? That would

Sure, we could have one unified list I guess.  It is very common to want
to iterate over just globals or just functions though.  *shrug* I don't
think it makes that much of a difference, do you?

> be quite clear: all top-level module elements are global values, and are
> present in the global list.

All top-level module elements DO derive from GlobalValue, and are all
present in the gbegin/gend and begin/end lists owned by Module.
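
To make the current arrangement concrete, here is a toy model. These are
not the real LLVM classes: the Toy* names and the string stand-in for basic
blocks are assumptions for illustration only. Both kinds of element share
one base class, the module keeps two lists, and a function is a prototype
until it gets a body, without the object being reallocated:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

struct ToyGlobalValue {
    std::string name;
    explicit ToyGlobalValue(std::string n) : name(std::move(n)) {}
    virtual ~ToyGlobalValue() = default;
};

struct ToyGlobalVariable : ToyGlobalValue {
    bool isConstant;  // describes the contents, not the address
    ToyGlobalVariable(std::string n, bool c)
        : ToyGlobalValue(std::move(n)), isConstant(c) {}
};

struct ToyFunction : ToyGlobalValue {
    std::vector<std::string> blocks;  // stand-in for basic blocks
    using ToyGlobalValue::ToyGlobalValue;
    bool isExternal() const { return blocks.empty(); }  // no body: prototype
};

struct ToyModule {
    std::list<ToyGlobalVariable> globals;  // analogue of gbegin()/gend()
    std::list<ToyFunction> functions;      // analogue of begin()/end()
};

// Walk only the function list and count the prototypes.
int countPrototypes(const ToyModule &m) {
    int n = 0;
    for (const ToyFunction &f : m.functions)
        if (f.isExternal())
            ++n;
    return n;
}

// Usage sketch: filling in a body turns a prototype into a definition in place.
void moduleDemo() {
    ToyModule m;
    m.globals.emplace_back(".str_1", true);
    m.functions.emplace_back("printf");
    m.functions.emplace_back("main");
    ToyFunction *mainFn = &m.functions.back();
    assert(countPrototypes(m) == 2);
    mainFn->blocks.push_back("entry");
    assert(countPrototypes(m) == 1);
    assert(mainFn == &m.functions.back());  // same object as before
}
```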

> The functions list can contain either functions both with and without
> bodies, or only those with bodies. In the latter case, when you create a
> function, it's added only to the global values list. When you add the first
> basic block, it's also added to the list of functions.

I really don't see the advantage of this.  You're talking about adding a
third list that contains a union of the two?  I don't see what you are
buying here.

> Thanks for explanation. I don't have a use of SymbolTable yet, I was just
> wondering if I have to use it for something ;-)

You probably won't.  If you find yourself needing to, ask before you do as
there is probably an easier way to do whatever it is that you want to do.
:)

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/