[LLVMdev] How to translate library functions into LLVM IR bitcode?

Johannes Doerfert doerfert at cs.uni-saarland.de
Tue Sep 16 06:00:41 PDT 2014


Hey Liwei,

I attached a script I used some time back to compile multiple source
files (of a benchmark) into one bitcode file (instead of an executable).
The script is very simple but worked surprisingly well. It checks the
command line options for indicators of what kind of output is expected
and produces bitcode files with the appropriate names. To use it, put or
link the script on your path and use it as your standard compiler
(e.g., export CC=<script_name>) instead of clang. Both clang and
llvm-link need to be on the path. If the name of the executed script is
<script_name>++ (note the ++ at the end), clang++ will be used to
compile the source files, otherwise clang. Also note that the script
removes some command line options that are not supported or desired when
creating bitcode instead of object code files.
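
To make the "checks the command line options" part concrete, here is a
minimal sketch of that decision, not the attached script itself. The
function name classify_command_line is hypothetical, and like the script
it only knows that '-o' and '-I' take a separate value, so other
two-part options would be misread as input files.

```python
def classify_command_line(arguments):
    """Return (output_kind, input_names, output_name) for a compiler call."""
    input_names = []
    output_name = None
    skip_next = False
    for index, argument in enumerate(arguments):
        if skip_next:
            # This argument is the value of the preceding '-o' or '-I'.
            skip_next = False
        elif argument == '-o':
            if index + 1 < len(arguments):
                output_name = arguments[index + 1]
            skip_next = True
        elif argument == '-I':
            skip_next = True
        elif not argument.startswith('-'):
            # Anything that is not an option is treated as an input file.
            input_names.append(argument)
    # '-c' or '-emit-llvm' asks for one intermediate file per source,
    # which the wrapper turns into bitcode; everything else is a link step.
    if '-c' in arguments or '-emit-llvm' in arguments:
        output_kind = 'ir'
    else:
        output_kind = 'ex'
    return output_kind, input_names, output_name


print(classify_command_line(['-O2', '-c', 'foo.c', '-o', 'foo.o']))
print(classify_command_line(['foo.o', 'bar.o', '-o', 'prog']))
```

The first call is a compile step (bitcode is written under the name the
build system asked for), the second is a link step that would go to
llvm-link.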

It should also survive a typical autoconf configure run: it detects the
configure checks and simply forwards the complete command line to
clang(++). I never used it to link libraries, though.
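
The configure detection and the clang/clang++ choice boil down to
something like the following sketch. The helper names (compiler_for,
is_configure_probe, passthrough_command) are made up for illustration;
the two markers, 'conftest.c' and '--version', are the ones the attached
script looks for.

```python
import os


def compiler_for(script_path):
    # A wrapper installed as <script_name>++ stands in for clang++.
    name = os.path.basename(script_path)
    return 'clang++' if name.endswith('++') else 'clang'


def is_configure_probe(arguments):
    # autoconf compiles a file called conftest.c and asks for --version.
    return 'conftest.c' in arguments or '--version' in arguments


def passthrough_command(argv):
    """Return the unmodified compiler command for a probe, else None."""
    arguments = argv[1:]
    if is_configure_probe(arguments):
        return [compiler_for(argv[0])] + arguments
    return None


print(passthrough_command(['/usr/local/bin/wrap++', '--version']))
print(passthrough_command(['wrap', '-c', 'foo.c']))
```

Forwarding the probes untouched means configure sees a real executable
or a real error, instead of a bitcode file it cannot run.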

I'm not sure whether you can use this script as is or as a starting
point for your own, but maybe you can.


Best regards,
  Johannes



On 09/15, Liwei Yang wrote:
> Good tips. Although I have used llvm-link to merge .bc files together, I
> guess -flto could optimize the resultant .bc file further.
> 
> As for the assembly, yes it is an issue. Anyway, I'll try to address those
> sources which are available for being translated into .bc first.
> 
> Thanks for your advice, Tim.
> 
> On Mon, Sep 15, 2014 at 2:55 PM, Tim Northover <t.p.northover at gmail.com>
> wrote:
> 
> > > If there's any automated way to infer about all the subroutines that one
> > > function needs, clang them into .bc file and link them into a stand-alone
> > > .bc library, that will be more than appreciated:-)
> >
> > If manually compiling the files is working for you, you could try
> > building the entire library with "-flto" for Link-time optimisation.
> > The output of that will be LLVM IR (if you can convince the build
> > system to do it for you).
> >
> > The issue is that parts of the standard library are
> > performance-critical and often written in assembly. If the entire file
> > is assembly you won't be able to produce IR very easily at all.
> >
> > Cheers.
> >
> > Tim.
> >
> 
> 
> 
> -- 
> Best Regards
> Liwei

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


-- 

Johannes Doerfert
Researcher / PhD Student

Compiler Design Lab (Prof. Hack)
Saarland University, Computer Science
Building E1.3, Room 4.26

Tel. +49 (0)681 302-57521 : doerfert at cs.uni-saarland.de
Fax. +49 (0)681 302-3065  : http://www.cdl.uni-saarland.de/people/doerfert
-------------- next part --------------
#!/usr/bin/python
import sys, os, subprocess

###############################################################################
############################# Helper functions ################################
###############################################################################
def center(string, columns = 80):
  length = len(string)
  return "%s%s%s" % (" " * int((columns + 1 - length) / 2),
                     string,
                     " " * int((columns + 0 - length) / 2))

def exists(full_path):
  return full_path and os.access(full_path, os.X_OK)

def no_ext(filename):
  return os.path.splitext(filename)[0]

def which(program):
  path = os.getenv('PATH')
  if not path:
    return None
  # os.pathsep is ':' on POSIX, so behaviour there is unchanged
  for sub_path in path.split(os.pathsep):
    full_path = os.path.join(sub_path, program)
    if exists(full_path):
      return full_path
  return None

def call(program, arguments):
  try:
    proc = subprocess.Popen([program] + arguments, stdout=sys.stdout, stderr=sys.stderr)
    retcode = proc.wait()
    if retcode:
      sys.stderr.write("\nERROR:\n------\n")
      sys.stderr.write("Got return code %s while executing:\n  %s %s\n\n" % (retcode, program, " ".join(arguments)))
      sys.exit(1)
    return retcode
  except Exception as e:
    sys.stderr.write("\nERROR:\n------\n")
    sys.stderr.write("Got an exception while executing:\n  %s %s\n%s\n\n" % (program, " ".join(arguments), repr(e)))
    sys.exit(1)

###############################################################################
############################# Global constants ################################
###############################################################################

# Enable DEBUG mode
DEBUG = True

# The name of this 'compiler'
NAME = "LLVM-Link-All"


# The environment variables with functions to derive their default values
env_variables = {
    'CLANG': ((lambda : which('clang')), True),
    'CLANG++': ((lambda : which('clang++')), True),
    'LLVM_LINK': ((lambda : which('llvm-link')), True),
  }

# Flag replacements
flag_replacements = {
    #'-O2' : '-O0',
    #'-O3' : '-O0',
    #'-g' : '',
    }


###############################################################################


if DEBUG:
  print ("\n%s" % (center(NAME)))

if DEBUG:
  print ("\n Check environment variables:")

variables = {}
for variable, default_pair in env_variables.items():
  val_inv, def_inv = "", ""
  default_fn, is_path = default_pair
  default = default_fn()
  value = os.getenv(variable, default)

  if is_path:
    if not exists(default):
      def_inv = " INVALID!"
    if not exists(value):
      val_inv = " INVALID!"

  variables[variable] = value
  if DEBUG:
    print ("   %-25s := %s%s \t\t(default: %s%s)" % (variable, value, val_inv, default, def_inv))

COMPILER = 'CLANG++' if sys.argv[0].endswith('++') else 'CLANG'

for variable, value in variables.items():
  if env_variables[variable][1] and not exists(value):
    sys.stderr.write("\nERROR:\n------\n")
    sys.stderr.write("The executable '%s' was not found! " % variable.lower())
    sys.stderr.write("The determined value was '%s'\n" % value)
    sys.stderr.write("Either put it on the 'PATH' or set the environment ")
    sys.stderr.write("variable '%s' pointing to the executable.\n\n" % variable)
    sys.exit(1)

arguments = sys.argv[1:]
if DEBUG:
  print ("\n Start parsing the command line:")
  print ("   '%s'" % (" ".join(arguments)))


if 'conftest.c' in arguments or '--version' in arguments:
  sys.stderr.write("\nCONFIGURE IS RUNNING:\n------\n")
  sys.stderr.write("Call %s (%s) %s\n" % (COMPILER, variables[COMPILER], ' '.join(arguments)))
  retcode = call(variables[COMPILER], arguments)
  if DEBUG:
    sys.stderr.write("     Retcode: %i\n" % retcode)
  sys.exit(retcode)

output_name = None
input_names = []
output_name_add_ending = False
output_kind = None


if DEBUG:
  print ("\n   Test for input files:")

skip_next   = False
for argument in arguments:
  if skip_next:
    skip_next = False
  elif '-o' == argument:
    skip_next = True
  elif '-I' == argument:
    skip_next = True
  elif not argument.startswith('-'):
    input_names.append(argument)

if DEBUG:
  print ("     Input files are '%s'" % (' '.join(input_names)))

if not input_names:
  sys.stderr.write("\nERROR:\n------\n")
  sys.stderr.write("No input files found\n\n")
  sys.stderr.write("Call %s (%s)\n" % (COMPILER, variables[COMPILER]))
  retcode = call(variables[COMPILER], arguments)
  if DEBUG:
    print ("     Retcode: %i" % retcode)
  sys.exit(retcode)


if DEBUG:
  print ("\n   Test for output file:")

assert arguments.count('-o') < 2, "Multiple occurrences of '-o'!"
if '-o' in arguments:
  index = arguments.index('-o')
  assert len(arguments) > index + 1, "-o was not followed by any value!"
  output_name = arguments[index + 1]
else:
  if len(input_names) > 1:
    output_name = 'a.out'
  else:
    output_name = input_names[0]
    output_name_add_ending = True

if DEBUG:
  print ("     Output file is '%s'" % (output_name))
  if output_name_add_ending:
    print ("     -- but the ending might need adjustment!")

if not output_name:
  sys.stderr.write("\nERROR:\n------\n")
  sys.stderr.write("No output file found\n\n")
  sys.stderr.write("Call %s (%s)\n" % (COMPILER, variables[COMPILER]))
  retcode = call(variables[COMPILER], arguments)
  if DEBUG:
    print ("     Retcode: %i" % retcode)
  sys.exit(retcode)


if DEBUG:
  print ("\n   Test for output kind:")

if '-c' in arguments:
  if DEBUG:
    print ("     An intermediate should be emitted!")
  if '-emit-llvm' in arguments:
    if DEBUG:
      print ("     It is already LLVM-IR ('-emit-llvm' is used)!")
  else:
    arguments.append('-emit-llvm')
    if DEBUG:
      print ("     Add '-emit-llvm' to emit LLVM-IR!")
  output_kind = 'ir'
else:
  if '-emit-llvm' in arguments:
    if DEBUG:
      print ("     It is already LLVM-IR ('-emit-llvm' is used)!")
    output_kind = 'ir'
  else:
    if DEBUG:
      print ("     An executable is emitted!")
    output_kind = 'ex'
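    if output_name_add_ending:
      # No '-o' was given, so output_name still holds the input file name.
      # Append an explicit '-o a.out' instead of overwriting the input.
      new_output_name = 'a.out'
      arguments += ['-o', new_output_name]
      output_name = new_output_name
      if DEBUG:
        print ("       Change output name to '%s'!" % (new_output_name))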


if DEBUG:
  print ("\n   Replace common flags:")

no_replacements = 0
for index in range(len(arguments)):
  argument = arguments[index]
  if argument in flag_replacements:
    new_argument = flag_replacements[argument]
    arguments[index] = new_argument
    no_replacements += 1
    if DEBUG:
      print ("     Replace '%s' by '%s'!" % (argument, new_argument))

if DEBUG and no_replacements == 0:
  print ("     Nothing found to replace!")

if output_kind == 'ir':

  clang_arguments = arguments
  if DEBUG:
    print ("\n   Initiate %s (%s):" % (COMPILER, variables[COMPILER]))
    print ("     Options: '%s'" % ' '.join(clang_arguments))
  # Use the compiler selected from the script name (clang or clang++)
  retcode = call(variables[COMPILER], clang_arguments)
  if DEBUG:
    print ("     Retcode: %i" % retcode)

elif output_kind == 'ex' and len(input_names) == 1:

  clang_output_name = no_ext(output_name) + '.bc'
  clang_arguments = arguments + ['-emit-llvm', '-c']
  clang_arguments[clang_arguments.index(output_name)] = clang_output_name
  if DEBUG:
    print ("\n   Initiate %s (%s):" % (COMPILER, variables[COMPILER]))
    print ("     Options: '%s'" % ' '.join(clang_arguments))
  # Use the compiler selected from the script name (clang or clang++)
  retcode = call(variables[COMPILER], clang_arguments)
  if DEBUG:
    print ("     Retcode: %i" % retcode)

elif output_kind == 'ex' and len(input_names) > 1:

  linked_output_name = no_ext(output_name) + '-linked.bc'
  link_arguments = input_names + ['-o', linked_output_name]
  if DEBUG:
    print ("\n   Initiate LLVM_LINK (%s):" % variables['LLVM_LINK'])
    print ("     Options: '%s'" % ' '.join(link_arguments))
  retcode = call(variables['LLVM_LINK'], link_arguments)
  if DEBUG:
    print ("     Retcode: %i" % retcode)

else:
  assert False, "Unknown output kind: %s" % output_kind


# vim: set ft=python: