[llvm-commits] [llvm] r47621 - in /llvm/trunk: include/llvm-c/lto.h tools/lto2/ tools/lto2/LTOCodeGenerator.cpp tools/lto2/LTOCodeGenerator.h tools/lto2/LTOModule.cpp tools/lto2/LTOModule.h tools/lto2/Makefile tools/lto2/lto.cpp tools/lto2/lto.exports
Chris Lattner
clattner at apple.com
Tue Feb 26 15:12:16 PST 2008
On Feb 26, 2008, at 12:26 PM, Nick Kledzik wrote:
> URL: http://llvm.org/viewvc/llvm-project?rev=47621&view=rev
> Log:
> first commit of new LTO system. It is not hooked up in the llvm/
> tools/Makefile, so no one will build it be default yet
Nice! Some details:
> +/**
> + * Returns the last error string or NULL if last operation was
> sucessful.
typo "sucessful"
>
> +
> +/**
> + * Returns the number of symbols in the object module.
> + */
> +extern unsigned int
> +lto_module_get_num_symbols(lto_module_t mod);
Minor details which you should feel to ignore: the "int" in "unsigned
int" is redundant, and the "extern" is also redundant. You could
choose to drop those if you care. We tend to not use them in the rest
of llvm, but it doesn't really matter either way.
> =
> =
> =
> =
> =
> =
> =
> =
> ======================================================================
> --- llvm/trunk/tools/lto2/LTOCodeGenerator.cpp (added)
> +++ llvm/trunk/tools/lto2/LTOCodeGenerator.cpp Tue Feb 26 14:26:43
> 2008
>
> +
> +#include "llvm/Module.h"
> +#include "llvm/PassManager.h"
> +#include "llvm/Linker.h"
> +#include "llvm/Constants.h"
> +#include "llvm/DerivedTypes.h"
> +#include "llvm/ModuleProvider.h"
> +#include "llvm/Bitcode/ReaderWriter.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/FileUtilities.h"
> +#include "llvm/Support/SystemUtils.h"
> +#include "llvm/Support/Mangler.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/System/Program.h"
> +#include "llvm/System/Signals.h"
> +#include "llvm/Analysis/Passes.h"
> +#include "llvm/Analysis/LoopPass.h"
> +#include "llvm/Analysis/Verifier.h"
> +#include "llvm/CodeGen/FileWriters.h"
> +#include "llvm/Target/SubtargetFeature.h"
> +#include "llvm/Target/TargetOptions.h"
> +#include "llvm/Target/TargetData.h"
> +#include "llvm/Target/TargetMachine.h"
> +#include "llvm/Target/TargetMachineRegistry.h"
> +#include "llvm/Target/TargetAsmInfo.h"
> +#include "llvm/Transforms/IPO.h"
> +#include "llvm/Transforms/Scalar.h"
> +#include "llvm/Analysis/LoadValueNumbering.h"
> +#include "llvm/Support/MathExtras.h"
> +#include "llvm/Config/config.h"
> +
> +#include "LTOModule.h"
> +#include "LTOCodeGenerator.h"
You should #include LTOCodeGenerator.h first in the list to ensure
that it is self contained. Also, please remove any #includes that
aren't needed. There is some justification and more info for this here:
http://llvm.org/docs/CodingStandards.html#scf_includes
>
> +
> +#include <fstream>
> +#include <unistd.h>
> +#include <stdlib.h>
> +#include <fcntl.h>
For standard c++ headers like stdlib, please use <cstdlib>. Is it
possible to avoid unistd.h/fcntl.h with the llvm/System/Path.h or
MemoryBuffer.h stuff? MemoryBuffer.h is an efficient way to read an
entire file in off the disk.
> +void* LTOCodeGenerator::compile(size_t* length, std::string& errMsg)
> +{
> + // make unqiue temp .s file to put generated assembly code
typo "unqiue", also comments should end with proper punctuation and
use capital letters :)
>
> +
> + if ( !asmResult ) {
> + // read .o file into memory buffer
> + const sys::FileStatus* objStatus;
> + objStatus = uniqueObjPath.getFileStatus(false, &errMsg);
> + if ( objStatus != NULL ) {
> + *length = objStatus->getSize();
> + // use malloc() because caller will own this buffer and
> free() it
> + buffer = ::malloc(*length);
> + if ( buffer != NULL ) {
> + int fd = ::open(uniqueObjPath.c_str(), O_RDONLY, 0);
> + if ( fd != -1 ) {
> + // read object file contents into buffer
> + if ( ::read(fd, buffer, *length) !=
> (ssize_t)*length ) {
> + errMsg = "error reading object file";
> + free(buffer);
> + buffer = NULL;
I don't think this is correct: read can return partial results,
however it's best to use MemoryBuffer to avoid having to do this
explicitly.
>
> + }
> + close(fd);
> + }
> + else {
> + errMsg = "error opening object file";
> + free(buffer);
> + buffer = NULL;
> + }
> + }
> + else {
> + errMsg = "error mallocing space for object file";
> + }
> + }
> + else {
> + errMsg = "error stat'ing object file";
> + }
I think the code will end up getting simpler with MemoryBuffer, but if
that we're going to happen, I'd suggest using early exits to avoid
having the code be so nested. This would give you:
if (fd == -1) {
errMsg = "error opening object file";
return NULL;
}
for example, but this also requires using RAII to delete the files
etc. If you use MemoryBuffer, the code will likely be simple enough
that it's not worth doing this.
> =
> =
> =
> =
> =
> =
> =
> =
> ======================================================================
> --- llvm/trunk/tools/lto2/LTOCodeGenerator.h (added)
> +++ llvm/trunk/tools/lto2/LTOCodeGenerator.h Tue Feb 26 14:26:43 2008
> + bool writeMergedModules(const char* path,
> std::string& errMsg);
This goes over 80 columns.
> =
> =
> =
> =
> =
> =
> =
> =
> ======================================================================
> --- llvm/trunk/tools/lto2/LTOModule.cpp (added)
> +++ llvm/trunk/tools/lto2/LTOModule.cpp Tue Feb 26 14:26:43 2008
> @@ -0,0 +1,329 @@
> +//===-LTOModule.cpp - LLVM Link Time Optimizer
> ----------------------------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open
> Source
> +// License. See LICENSE.TXT for details.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +//
> +// This file implements the Link Time Optimization library. This
> library is
> +// intended to be used by linker to optimize code at link time.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +
> +#include "llvm/Module.h"
> +#include "llvm/PassManager.h"
> +#include "llvm/Linker.h"
> +#include "llvm/Constants.h"
> +#include "llvm/DerivedTypes.h"
> +#include "llvm/ModuleProvider.h"
> +#include "llvm/Bitcode/ReaderWriter.h"
> +#include "llvm/Support/CommandLine.h"
> +#include "llvm/Support/FileUtilities.h"
> +#include "llvm/Support/SystemUtils.h"
> +#include "llvm/Support/Mangler.h"
> +#include "llvm/Support/MemoryBuffer.h"
> +#include "llvm/System/Program.h"
> +#include "llvm/System/Path.h"
> +#include "llvm/System/Signals.h"
> +#include "llvm/Target/SubtargetFeature.h"
> +#include "llvm/Target/TargetOptions.h"
> +#include "llvm/Target/TargetData.h"
> +#include "llvm/Target/TargetMachine.h"
> +#include "llvm/Target/TargetMachineRegistry.h"
> +#include "llvm/Target/TargetAsmInfo.h"
> +#include "llvm/Transforms/IPO.h"
> +#include "llvm/Transforms/Scalar.h"
> +#include "llvm/Analysis/LoadValueNumbering.h"
> +#include "llvm/Support/MathExtras.h"
> +
> +#include "LTOModule.h"
As before, pruning this down would be good if possible. Please move
LTOModule.h to the top of the list.
> +bool LTOModule::isBitcodeFileForTarget(const void* mem,
> + size_t length, const char*
> triplePrefix)
> +{
> + bool result = false;
> + MemoryBuffer* buffer;
> + buffer = MemoryBuffer::getMemBuffer((char*)mem, (char*)mem
> +length);
> + if ( buffer != NULL ) {
> + ModuleProvider* mp = getBitcodeModuleProvider(buffer);
> + if ( mp != NULL ) {
> + std::string actualTarget = mp->getModule()-
> >getTargetTriple();
> + if ( strncmp(actualTarget.c_str(), triplePrefix,
> + strlen(triplePrefix)) == 0) {
> + result = true;
> + }
> + // mp destructor will delete buffer
> + delete mp;
> + }
> + else {
> + // if getBitcodeModuleProvider failed, we need to
> delete buffer
> + delete buffer;
> + }
> + }
> + return result;
> +}
Please use RAII and early exits to simplify the code. A useful thing
for this is llvm/ADT/OwningPtr.h. You should be able to do something
like this:
OwningPtr<MemoryBuffer> buffer = MemoryBuffer::getMemBuffer...
if (buffer == 0) return 0;
OwningPtr<ModuleProvider> mp = getBitcodeModuleProvider(buffer);
if (mp == 0) return 0;
...
This makes all the error cases be early exits and makes the success
case easier to understand. Also, no nesting! There are other methods
that can use similar treatment.
> +LTOModule::~LTOModule()
> +{
> + delete _module;
> + if ( _target != NULL )
> + delete _target;
> +}
delete accepts a null pointer, so you can just make the second delete
unconditional. It might also make sense to make the "_module" and
"_target" members of LTOModule be OwningPointers, which would mean you
don't have to write the dtor manually.
> +LTOModule* LTOModule::makeLTOModule(const char* path, std::string&
> errMsg)
> +{
> +LTOModule* LTOModule::makeLTOModule(const void* mem, size_t length,
> +
> std::string& errMsg)
> +{
These two methods share all the target matching code. Please pull it
out into a shared helper method (that takes a MemoryBuffer) or
something.
> +void LTOModule::addDefinedSymbol(GlobalValue* def, Mangler &mangler,
> + bool isFunction)
> +{
> + const char* symbolName
> = ::strdup(mangler.getValueName(def).c_str());
> +
> + // set alignment part log2() can have rounding errors
> + uint32_t align = def->getAlignment();
> + uint32_t attr = align ? __builtin_ctz(def->getAlignment()) : 0;
Please don't use __builtin_ctz(), use a helper from llvm/Support/
MathExtras.h.
> + // set definition part
> + if ( def->hasWeakLinkage() || def->hasLinkOnceLinkage() ) {
> + // lvm bitcode does not differenciate between weak def data
> + // and tentative definitions!
> + // HACK HACK HACK
> + // C++ does not use tentative definitions, but does use
> weak symbols
> + // so guess that anything that looks like a C++ symbol is
> weak and others
> + // are tentative definitions
Wow, that is a hack. :) We should really fix this at the LLVM IR
level, can you please file a radar that demonstrates how not
distinguishing between these two cases cause a problem?
> +uint32_t LTOModule::getSymbolCount()
> +{
> + if ( !_symbolsParsed ) {
> + _symbolsParsed = true;
> +
> + // Use mangler to add GlobalPrefix to names to match linker
> names.
> + Mangler mangler(*_module, _target->getTargetAsmInfo()-
> >getGlobalPrefix());
There is one other aspect of configuration for the mangler: On darwin
at least you have to call manger.setUseQuotes(true);
This tells it that it is ok to make some symbols use double quotes
(e.g. those with a space in them like objc methods) instead of
mangling the space. I don't know if we can currently get this from
TargetAsmInfo.h, but I think we should be able to. Dale or Evan can
help with this part.
>
> +
> + // add functions
> + for (Module::iterator f = _module->begin(); f != _module-
> >end(); ++f) {
> + if ( f->isDeclaration() ) {
> + addUndefinedSymbol(mangler.getValueName(f).c_str());
> + }
> + else {
To reduce nesting, I'd suggest writing this as:
> + for (Module::iterator f = _module->begin(); f != _module-
> >end(); ++f) {
> + if ( f->isDeclaration() ) {
> + addUndefinedSymbol(mangler.getValueName(f).c_str());
continue;
>
> + }
which unnests the 'else' part.
>
> + addDefinedSymbol(f, mangler, true);
> + // add external symbols referenced by this function.
> + for (Function::iterator b = f->begin(); b != f-
> >end(); ++b) {
> + for (BasicBlock::iterator i = b->begin();
> + i != b-
> >end(); ++i) {
> + for (unsigned count = 0, total = i-
> >getNumOperands();
> + count != total;
> ++count) {
> + findExternalRefs(i->getOperand(count),
> mangler);
> + }
> + }
> + }
I'd suggest splitting this triply nested loop out into its own helper
function.
>
> + }
> + }
> +
> + // add data
> + for (Module::global_iterator v = _module->global_begin(),
> + e = _module->global_end(); v !
> = e; ++v) {
> + if ( v->isDeclaration() ) {
> + addUndefinedSymbol(mangler.getValueName(v).c_str());
> + }
This can also use continue to reduce nesting.
>
> + // make symbols for all undefines
> + for (StringSet::iterator it=_undefines.begin();
> + it !=
> _undefines.end(); ++it) {
> + // if this symbol also has a definition, then don't
> make an undefine
> + // because it is a tentative definition
> + if ( _defines.find(it->getKeyData(), it->getKeyData()
> +it->getKeyLength()) == _defines.end() ) {
likewise: if (_defines.find(..) != _defines.end()) continue;
>
> +++ llvm/trunk/tools/lto2/LTOModule.h Tue Feb 26 14:26:43 2008
> @@ -0,0 +1,83 @@
> +//===-LTOModule.h - LLVM Link Time Optimizer
> ------------------------------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open
> Source
> +// License. See LICENSE.TXT for details.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +//
> +// This file declares the LTOModule class.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +
> +#ifndef LTO_MODULE_H
> +#define LTO_MODULE_H
> +
> +#include "llvm/Module.h"
> +#include "llvm/GlobalValue.h"
> +#include "llvm/Constants.h"
> +#include "llvm/Support/Mangler.h"
> +#include "llvm/Target/TargetMachine.h"
> +#include "llvm/ADT/StringMap.h"
I think that several of these #includes can be removed. GlobalValue.h
is redundant, and I think you can forward declare most of the classes
instead of #including their header.
> + bool _symbolsParsed;
> + std::vector<NameAndAttributes> _symbols;
> + StringSet _defines; // only needed to
> disambiguate tentative definitions
> + StringSet _undefines; // only needed to
> disambiguate tentative definitions
80 cols.
> +//===-lto.cpp - LLVM Link Time Optimizer
> ----------------------------------===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open
> Source
> +// License. See LICENSE.TXT for details.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +//
> +// This file implements the Link Time Optimization library. This
> library is
> +// intended to be used by linker to optimize code at link time.
> +//
> +//
> =
> =
> =
> ----------------------------------------------------------------------=
> ==//
> +
> +#include "llvm-c/lto.h"
> +
> +#include "LTOModule.h"
> +#include "LTOCodeGenerator.h"
> +
> +
> +// holds most recent error string
> +// *** not thread safe ***
> +static std::string sLastErrorString;
This also has a static ctor, which we try to avoid when possible (yes,
we don't avoid very well, but we still try :). Is there another way
to do this?
Thanks Nick!
-Chris
More information about the llvm-commits
mailing list