[LLVMdev] copy instructions

Wayne Cochran wcochran at vancouver.wsu.edu
Fri Apr 22 10:40:59 PDT 2011

This is a simple SSA code generation 101 question.

If I follow the IR code generation techniques in the Dragon book, the statement
  x = y + z
would translate into something like this in SSA/LLVM:
  %0 = add %y, %z
  %x = %0
Obviously "copy instructions" like %foo = %bar are senseless in SSA  
since %foo and %bar are immutably fixed to the same value and there
is no need for two aliases for the same thing (this is my own observation,
please tell me if my thinking is off).

What are the general code generation techniques to avoid "copy instructions"?

For example, the simple code generation methods that yield the translation
above might look like the following:

Value *AddExpression::codeGen() {
  Value *l = left->codeGen();
  Value *r = right->codeGen();
  Value *result = new TempValue;  // get unique temporary
  emit(result->str() + " = add " + l->str() + ", " + r->str());
  return result;
}

Value *assignExpression::codeGen() {
  Value *rval = rvalue->codeGen();
  Value *lval = new NameValue(ident);
  emit(lval->str() + " = " + rval->str());    // emit (silly) copy instruction
  return lval;
}

What I have suggested to my students is to omit the (non-existent) copy instruction
and use the "rval" above as a replacement for all future occurrences of "ident",
i.e., something like the following:

Value *assignExpression::codeGen() {
  Value *rval = rvalue->codeGen();
  // update symbol table so that all future references to "ident" are replaced with rval
  return rval;
}

Using this scheme, the following
  x = y + z
  u = x * y + foo(x)
would be translated into
  %0 = add %y, %z
  %1 = mul %0, %y
  %2 = call foo(%0)
  %3 = add %1, %2
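
To make the scheme concrete, here is a minimal self-contained C++ sketch of it.
It is only an illustration under stated assumptions: the names CodeGen, Var,
BinOp, Call, Assign and translateExample are hypothetical (they are not LLVM
classes), values are represented as plain strings, a name that has never been
assigned is assumed to map to "%name", and only straight-line code is handled.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mini code generator; none of these names come from LLVM.
struct CodeGen {
  std::vector<std::string> code;               // emitted instructions, in order
  std::map<std::string, std::string> symbols;  // source name -> current SSA value
  int nextTemp = 0;

  // Emit "<temp> = <rhs>" and return the name of the fresh temporary.
  std::string emit(const std::string &rhs) {
    std::string t = "%" + std::to_string(nextTemp++);
    code.push_back(t + " = " + rhs);
    return t;
  }

  // Assumption: a name not assigned so far (e.g. a parameter) is "%name".
  std::string lookup(const std::string &ident) const {
    auto it = symbols.find(ident);
    return it != symbols.end() ? it->second : "%" + ident;
  }
};

struct Expr {
  virtual ~Expr() = default;
  virtual std::string codeGen(CodeGen &cg) const = 0;
};
using ExprPtr = std::unique_ptr<Expr>;

struct Var : Expr {
  std::string ident;
  explicit Var(std::string id) : ident(std::move(id)) {}
  // A use of a variable emits nothing; it resolves through the symbol table.
  std::string codeGen(CodeGen &cg) const override { return cg.lookup(ident); }
};

struct BinOp : Expr {
  std::string op;
  ExprPtr left, right;
  BinOp(std::string o, ExprPtr l, ExprPtr r)
      : op(std::move(o)), left(std::move(l)), right(std::move(r)) {
    assert(left && right);  // both operands must be present
  }
  std::string codeGen(CodeGen &cg) const override {
    std::string l = left->codeGen(cg);
    std::string r = right->codeGen(cg);
    return cg.emit(op + " " + l + ", " + r);
  }
};

struct Call : Expr {
  std::string callee;
  ExprPtr arg;
  Call(std::string c, ExprPtr a) : callee(std::move(c)), arg(std::move(a)) {}
  std::string codeGen(CodeGen &cg) const override {
    std::string a = arg->codeGen(cg);
    return cg.emit("call " + callee + "(" + a + ")");
  }
};

struct Assign : Expr {
  std::string ident;
  ExprPtr rvalue;
  Assign(std::string id, ExprPtr rv)
      : ident(std::move(id)), rvalue(std::move(rv)) {}
  std::string codeGen(CodeGen &cg) const override {
    std::string rval = rvalue->codeGen(cg);
    cg.symbols[ident] = rval;  // rebind the name; no copy instruction is emitted
    return rval;
  }
};

// Translates:  x = y + z;  u = x * y + foo(x)
std::vector<std::string> translateExample() {
  CodeGen cg;
  Assign s1("x", std::make_unique<BinOp>("add", std::make_unique<Var>("y"),
                                         std::make_unique<Var>("z")));
  Assign s2("u", std::make_unique<BinOp>(
                     "add",
                     std::make_unique<BinOp>("mul", std::make_unique<Var>("x"),
                                             std::make_unique<Var>("y")),
                     std::make_unique<Call>("foo", std::make_unique<Var>("x"))));
  s1.codeGen(cg);
  s2.codeGen(cg);
  return cg.code;
}
```

Calling translateExample() and printing each returned string, one per line,
should reproduce the four instructions listed above. The key design choice is
that Assign only rebinds the name in the symbol table, so every later Var
lookup yields the value that was assigned.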

Is there a more obvious approach to avoiding "copy instructions"?


Wayne O. Cochran
Clinical Assistant Professor, Computer Science
Washington State University Vancouver
wcochran at vancouver.wsu.edu
