[llvm-commits] Patch for llvm-c to expose targetmachine & allow emit obj/asm files

Tue Apr 10 12:41:36 PDT 2012

(Including the list this time, sorry for the mess.)

On Fri, Apr 6, 2012 at 14:34, Carlo Kok <ck at remobjects.com> wrote:
> Attached is a fairly simple patch for llvm-c that exposes:
> * Target class
> * TargetMachine class
>
> To allow for emitting binary and assembly.
>
> This is the first time I try to contribute to llvm, if there's something
> wrong with my patch, let me know.

Hi Carlo,

Thanks for this! I've been annoyed in a project of my own that I
couldn't use the C bindings to emit objects or assembly.

I'm not very familiar with the classes you are wrapping, so someone
else should take a look too, but I have read through your patch and
made some comments below.

Thanks,
Hans

> +/*===-- llvm-c/TargetMachine.h - Target Machine Library C Interface - C++ -*-=*\

Shouldn't it be C rather than C++ at the top?

> +|* This header declares the C interface to libLLVMBitReader.a, which          *|
> +|* implements input of the LLVM bitcode format.                               *|
> +|*                                                                            *|
> +|* Many exotic languages can interoperate with C code but have a harder time  *|
> +|* with C++ due to name mangling. So in addition to C, this interface enables *|
> +|* tools written in such languages.                                           *|

I don't think this second paragraph is necessary?

> +typedef enum {
> +    LLVMRelocDefault 		= 0,
> +	LLVMRelocStatic 		= 1,
> +	LLVMRelocPIC			= 2,
> +	LLVMRelocDynamicNoPic   = 3

Looks like tabs have been used here instead of spaces.

Is there a reason that you set values for the enumerators here but not
in other enumerations such as LLVMCodeGenOptLevel?

> +// Target

Rather than this type of comments, I think proper Doxygen comments
would be awesome.

> +const char * LLVMGetTargetName(LLVMTargetRef T);
> +const char * LLVMGetTargetDescription(LLVMTargetRef T);

I think it's more common not to have a space between the * and the
function name.

> +// Target Machine
> +LLVMTargetMachineRef LLVMCreateTargetMachine(LLVMTargetRef T, char* Triple, char* CPU, char* Features, LLVMCodeGenOptLevel Level, LLVMRelocMode Reloc, LLVMCodeModel CodeModel);

The * should go on the variable, not the type. This applies throughout
the patch (below, and in the .cpp file as well).

> +#ifdef __cplusplus
Maybe put some newlines before this, to separate it from the API.

> +namespace llvm {
> +  class TargetMachine;
> +  class Target;
> +
Trailing space on this line?

> +  inline TargetMachine *unwrap(LLVMTargetMachineRef P) {
> +    return reinterpret_cast<TargetMachine *>(P);
I think it's more common to leave out the space between the type and
the * here. This applies in the other (un)wrap methods below as well.

> +//===-- TargetMachine.cpp ----------------------------------------------------------===//

This line looks too long.

> +LLVMTargetRef LLVMGetFirstTarget()
> +{

The opening curly brace usually goes on the same line as the function
name in LLVM. This applies throughout this file.

> +   const Target* target = &*TargetRegistry::begin();
> +   return wrap(target);
> +}

I think there should be a blank line between each functions. This
applies below as well.

> +// Target

I don't think this comment adds much value.

> +const char * LLVMGetTargetName(LLVMTargetRef T)

There's a trailing whitespace at the end of this line, and also for
the function declarations below.

> +LLVMTargetMachineRef LLVMCreateTargetMachine(LLVMTargetRef T, char* Triple, char* CPU, char* Features, LLVMCodeGenOptLevel Level, LLVMRelocMode Reloc, LLVMCodeModel CodeModel)
> +{
> +  Reloc::Model RM = (Reloc::Model)Reloc;
> +  CodeModel::Model CM = (CodeModel::Model)CodeModel;
> +  CodeGenOpt::Level OL = (CodeGenOpt::Level)Level;

I think it's more common to use static_cast here rather than c-style casts.

> +LLVMBool LLVMTargetMachineEmitToFile(LLVMTargetMachineRef T, LLVMModuleRef M, char* Filename, LLVMCodeGenFileType codegen, char** ErrorMessage)
> +{ // Inspired by the D binding
> +  TargetMachine* TM = unwrap(T);
> +  Module* Mod = unwrap(M);
> +
> +  FunctionPassManager pass(Mod);

Wouldn't it be better to use a PassManager here, that you can run over the
Module once, rather than iterating over each function?

> +  if (!td) {
> +    error = "No TargetData in TargetMachine";
> +	*ErrorMessage = strdup(error.c_str());
> +	return true;

Looks like tabs on the two previous lines.

> +  pass.add(new TargetData(*td));
> +

Trailing whitespace.

> +  TargetMachine::CodeGenFileType ft = (TargetMachine::CodeGenFileType)codegen;

static_cast instead of c-style cast

> +  raw_fd_ostream dest(Filename, error, raw_fd_ostream::F_Binary);
> +  formatted_raw_ostream destf(dest);
> +  if (!error.empty()) {
> +    *ErrorMessage = strdup(error.c_str());
> +	return true;

Tab

> +  }
> +

Trailing whitespace

> +  if (TM->addPassesToEmitFile(pass, destf, ft)) {
> +    error = "No TargetData in TargetMachine";
> +	*ErrorMessage = strdup(error.c_str());
> +	return true;

Tabs

> +  }
> +
> +  pass.doInitialization();
> +
> +  for (llvm::Module::iterator it = Mod->begin(), end = Mod->end(); it != end; ++it){

If you had a PassManager instead of a FunctionPassManager, you could just do
PM.run(*Mod);

> +    errs() << "Did 1 iteration";
> +    if (!it->isDeclaration()) {
> +	  errs() << "Did 1 proper iterations";
> +	  pass.run(*it);
> +	}

Tabs.

> +  }
> +  errs() << "Done";

You probably want to remove the printouts to errs :)