[LLVMdev] FW: question about malloc call vs. instruction

Wed Sep 11 17:48:00 PDT 2002

On Wed, 11 Sep 2002, lee white baugh wrote:

> hi chris, thanks for answering that question.  now i've another!  for the

Bring them on!  :)

> svr task, i'll need to know when a struct is being mallocd or allocad.
> but last night when getting started on the task, i got far enough to see
> that while in the code i was allocing a struct, in the bytecode it was
> allocing a ubyte or something -- the information that it was a struct, and
> hence the handle i had in the llvm framework for finding out about the
> struct, went away.  do you know why?

This is an important aspect of how the LLVM framework is designed, and a
side effect of how C works.  In C, the following code says nothing about
the type generated by the malloc:

X = malloc(sizeof(some type));

Malloc always returns a void*, and thus malloc just knows the size of the
allocated object, not its type.
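
To make that concrete, here is a minimal sketch. The struct names and
helper functions are made up for illustration and are not from the MP.
Both helpers pass malloc nothing but a byte count, so two unrelated structs
of the same size produce requests that look identical at the call site:

```c
#include <stdlib.h>

/* Two unrelated, hypothetical struct types that happen to have the same size. */
struct point { int x, y; };
struct pair  { int first, second; };

/* Each helper passes malloc only a byte count; the struct type is not
   part of the call, and the result comes back as a plain void*. */
void *alloc_point(void) { return malloc(sizeof(struct point)); }
void *alloc_pair(void)  { return malloc(sizeof(struct pair)); }

/* Returns 1 when the two requests cannot be told apart by size alone. */
int same_request_size(void) {
  return sizeof(struct point) == sizeof(struct pair);
}
```

Any type information has to come from how the returned pointer is used
afterwards, not from the malloc call itself.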

Obviously, this poses a problem for LLVM, a typed representation.  In
LLVM, the level raising pass attempts to reverse engineer the type
information depending on how the values are used.  For example, in this
code, it should correctly deduce the type information (by using the
argument type of the "noop" function):

void noop(struct foo *Arg) {}

void myfunc() {
  struct foo *X = (struct foo*)malloc(sizeof(struct foo));
  noop(X);
  free(X);
}

... but it would not without the call.  Without the call, it would
probably assume it's an array of characters, sizeof(struct foo) big.  This is
semantically correct (the program will execute correctly), but not what
you want for this MP.  :)
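
For comparison, here is a rough sketch of the two variants side by side.
The fields of struct foo and the function names alloc_without_hint and
alloc_with_hint are assumptions made up for this example.  Only the second
function gives the level raising pass a typed use of the pointer to work
from:

```c
#include <stdlib.h>

/* Hypothetical definition; the real struct foo could hold anything. */
struct foo { int a; float b; };

/* Empty helper whose argument type carries the struct type. */
void noop(struct foo *Arg) { (void)Arg; }

/* No typed use of X: nothing here says the memory is a struct foo,
   so it would probably be treated as an array of characters. */
int alloc_without_hint(void) {
  void *X = malloc(sizeof(struct foo));
  int ok = (X != NULL);
  free(X);
  return ok;
}

/* The call to noop is a typed use of X that the type can be
   recovered from. */
int alloc_with_hint(void) {
  struct foo *X = (struct foo*)malloc(sizeof(struct foo));
  int ok = (X != NULL);
  noop(X);
  free(X);
  return ok;
}
```

Both functions behave the same at run time; the difference only shows up
when you inspect the LLVM code that comes out.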

Because of this, your testing process should look like this:

1. Write your C code, compile to LLVM bytecode.
2. Visually inspect the LLVM code.  If the type information is "lost",
   insert a call to a "noop" type function like the one above.
3. Recompile, reinspect.  You should have type information.  Iterate until
   you're happy.
4. Now you have a bytecode file with extra function calls you don't want.
   Run the -inline pass to eliminate them, and probably the -simplifycfg
   pass to clean up the result:
     opt < test.bc -inline -simplifycfg > test2.bc
5. Now run your optimization on test2.bc:
     opt < test2.bc -scalarreplacement | dis

... and hopefully it works.  :)

-Chris

http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/