[cfe-commits] [PATCH] Implements support to run standalone tools

David Blaikie dblaikie at gmail.com
Tue Jan 31 09:35:32 PST 2012


On Tue, Jan 31, 2012 at 7:34 AM, nobled <nobled at dreamwidth.org> wrote:
> On Mon, Jan 30, 2012 at 2:49 PM, Manuel Klimek <klimek at google.com> wrote:
>> Ping + a re-based version of the patch.
>>
>
>
> This looks really cool. Aside from the create()/New() half-rename confusion,
> just some minor nits:
>
>> +/// \brief Returns a new FrontendActionFactory for any type that provides an
>> +/// implementation of NewFrontendAction().
>> +///
>> +/// FactoryT must implement: FrontendAction *NewFrontendAction().
>> +///
>> +/// Example:
>> +/// struct ProvidesFrontendActions {
>> +///   FrontendActionFactory *NewFrontendAction();
> You meant "FrontendAction *" here, right?
>
>> +/// } Factory;
>> +/// FrontendActionFactory *FactoryAdapter =
>> +///   newFrontendActionFactory(&Factory);
>> +template <typename FactoryT>
>> +FrontendActionFactory *newFrontendActionFactory(FactoryT *ActionFactory);
>> +
>> +/// \brief Runs (and deletes) the tool on 'Code' with the -fsynatx-only flag.
> syntax*
>
>> +///
>> +/// \param ToolAction The action to run over the code.
>> +/// \param Code C++ code.
>> +///
>> +/// \return - True if 'ToolAction' was successfully executed.
>> +bool runSyntaxOnlyToolOnCode(
>> +    clang::FrontendAction *ToolAction, llvm::StringRef Code);
>> +
>> +/// \brief Converts a vector<string> into a vector<char*> suitable to pass
>> +/// to main-style functions taking (int Argc, char *Argv[]).
>> +std::vector<char*> commandLineToArgv(const std::vector<std::string> *Command);
> This looks like it can just take an ArrayRef<const std::string>...

(Well, ArrayRef<std::string> actually - ArrayRefs are always over
const data (see a recent change in llvm by Chris to add
MutableArrayRef for other clients (& my discussion about using
specialization, etc instead)))

>> +/// \see JsonCompileCommandLineDatabase
>> +CompileCommand findCompileArgsInJsonDatabase(
>> +    llvm::StringRef FileName, llvm::StringRef JsonDatabase,
>> +    std::string &ErrorMessage);
> Just a note on the whole patch: you don't need the llvm:: prefix on StringRef
> or ArrayRef, or a few other common types in clang code anymore. You just have
> to include "clang/Basic/LLVM.h" to use them inside the clang namespace.
>
>
>> +  /// \brief Returns the file manager used in the tool.
>> +  ///
>> +  /// The file manager is shared between all translation units.
>> +  FileManager &getFiles() { return Files; }
>> +
>> + private:
>> +  /// \brief Add translation units to run the tool over.
>> +  ///
>> +  /// Translation units not found in JsonDatabaseDirectory (see constructore)
> constructor*
>
>> +  /// will be skipped.
>> +  void addTranslationUnits(
>> +      llvm::StringRef JsonDatabaseDirectory,
>> +      llvm::ArrayRef<std::string> TranslationUnits);
>> +
>> +  // We store command lines as pair (file name, command line).
>> +  typedef std::pair< std::string, std::vector<std::string> > CommandLine;
>> +  std::vector<CommandLine> CommandLines;
>> +
>> +  FileManager Files;
>> +  // Maps <file name> -> <file content>.
>> +  std::map<std::string, std::string> MappedFileContents;
>> +};
>> +
>> +template <typename T>
>> +FrontendActionFactory *newFrontendActionFactory() {
>> +  class SimpleFrontendActionFactory : public FrontendActionFactory {
>> +  public:
>> +    virtual clang::FrontendAction *New() { return new T; }
> You mean "create()"?
>
>> +  };
>> +
>> +  return new SimpleFrontendActionFactory;
>> +}
>> +
>> +template <typename FactoryT>
>> +FrontendActionFactory *newFrontendActionFactory(FactoryT *ActionFactory) {
>> +  class FrontendActionFactoryAdapter : public FrontendActionFactory {
>> +  public:
>> +    explicit FrontendActionFactoryAdapter(FactoryT *ActionFactory)
>> +      : ActionFactory(ActionFactory) {}
>> +
>> +    virtual clang::FrontendAction *New() {
> "create()" here, too... It looks like these templates aren't covered by the
> unittest/getting instantiated at all.
>
> diff --git a/lib/Tooling/CMakeLists.txt b/lib/Tooling/CMakeLists.txt
> new file mode 100644
> index 0000000..b0e1235
> --- /dev/null
>> +++ b/lib/Tooling/CMakeLists.txt
> @@ -0,0 +1,6 @@
>> +set(LLVM_LINK_COMPONENTS support)
>> +SET(LLVM_USED_LIBS clangBasic clangFrontend clangAST clangRewrite)
> Here and in unittest/Tooling/Makefile, you set a dependency on the Rewrite lib,
> but I don't see any Rewrite/ includes. Is there a reason for that?
>
>
>> +bool ToolInvocation::run() {
>> +  const std::vector<char*> Argv = commandLineToArgv(&CommandLine);
>> +  const char *const BinaryName = Argv[0];
>> +  DiagnosticOptions DefaultDiagnosticOptions;
>> +  TextDiagnosticPrinter DiagnosticPrinter(
>> +      llvm::errs(), DefaultDiagnosticOptions);
>> +  DiagnosticsEngine Diagnostics(llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs>(
>> +      new DiagnosticIDs()), &DiagnosticPrinter, false);
>> +
>> +  const llvm::OwningPtr<clang::driver::Driver> Driver(
>> +      newDriver(&Diagnostics, BinaryName));
>> +  // Since the input might only be virtual, don't check whether it exists.
>> +  Driver->setCheckInputsExist(false);
>> +  const llvm::OwningPtr<clang::driver::Compilation> Compilation(
>> +      Driver->BuildCompilation(llvm::ArrayRef<const char*>(
>> +          &Argv[0], Argv.size() - 1)));
> (The fact that the Driver entrypoint need a song and dance to produce
> a char * array
> is annoying, but that's obviously a pre-existing problem with the
> API... I'm hoping
> to fix that in the future.)
>
>> +ClangTool::ClangTool(int argc, char **argv)
>> +    : Files((FileSystemOptions())) {
>> +  if (argc < 3) {
>> +    llvm::outs() << "Usage: " << argv[0] << " <cmake-output-dir> "
>> +                 << "<file1> <file2> ...\n";
>> +    exit(1);
> This looks like a good place to use llvm::report_fatal_error() instead. Same
> for the two other exit(1) calls below.
>
>> +  }
>> +  addTranslationUnits(argv[1], std::vector<std::string>(argv + 2, argv + argc));
>> +}
>> +
>> +void ClangTool::addTranslationUnits(
>> +    llvm::StringRef JsonDatabaseDirectory,
>> +    llvm::ArrayRef<std::string> TranslationUnits) {
> TranslationUnits can just be an ArrayRef<StringRef> here.

& change the std::vector<std::string>(...) in the lines above to
std::vector<StringRef>

>> +  llvm::SmallString<1024> JsonDatabasePath(JsonDatabaseDirectory);
>> +  llvm::sys::path::append(JsonDatabasePath, "compile_commands.json");
>> +  llvm::OwningPtr<llvm::MemoryBuffer> JsonDatabase;
>> +  llvm::error_code Result =
>> +      llvm::MemoryBuffer::getFile(JsonDatabasePath, JsonDatabase);
>> +  if (Result != 0) {
>> +    llvm::outs() << "Error while opening JSON database: " << Result.message()
>> +                 << "\n";
>> +    exit(1);
>> +  }
>> +  llvm::StringRef BaseDirectory(::getenv("PWD"));
>> +  for (unsigned I = 0; I < TranslationUnits.size(); ++I) {
>> +    llvm::SmallString<1024> File(getAbsolutePath(
>> +        TranslationUnits[I], BaseDirectory));
>> +    std::string ErrorMessage;
>> +    clang::tooling::CompileCommand LookupResult =
>> +        clang::tooling::findCompileArgsInJsonDatabase(
>> +            File.str(), JsonDatabase->getBuffer(), ErrorMessage);
>> +    if (!ErrorMessage.empty()) {
>> +      llvm::outs() << "Error while parsing JSON database: " << ErrorMessage
>> +                   << "\n";
>> +      exit(1);
>> +    }
>> +    if (!LookupResult.CommandLine.empty()) {
>> +      if (!LookupResult.Directory.empty()) {
>> +        // FIXME: What should happen if CommandLine includes -working-directory
>> +        // as well?
>> +        LookupResult.CommandLine.push_back(
>> +            "-working-directory=" + LookupResult.Directory);
>> +      }
>> +      CommandLines.push_back(make_pair(File.str(), LookupResult.CommandLine));
>> +    } else {
>> +      // FIXME: There are two use cases here: doing a fuzzy
>> +      // "find . -name '*.cc' |xargs tool" match, where as a user I don't care
>> +      // about the .cc files that were not found, and the use case where I
>> +      // specify all files I want to run over explicitly, where this should
>> +      // be an error. We'll want to add an option for this.
>> +      llvm::outs() << "Skipping " << File << ". Command line not found.\n";
>> +    }
>> +  }
>> +}
>> +
>> +void ClangTool::mapVirtualFiles(
>> +    const std::map<std::string, std::string> &FileContents) {
> At first glance, MappedFileContents in both ToolInvocation and ClangTool
> looks like it can just be a std::map<StringRef,StringRef>. Does it really need
> to have ownership of its own malloc'd copy of the file contents?
>
>> +  MappedFileContents.insert(FileContents.begin(), FileContents.end());
>> +}
>> +
>> +int ClangTool::run(FrontendActionFactory *ActionFactory) {
> It also looks like the ClangTool class isn't covered by the unittest, either.
>
> Looking forward to this landing!
>
>
>> On Wed, Jan 25, 2012 at 3:59 PM, Manuel Klimek <klimek at google.com> wrote:
>>> On Tue, Jan 24, 2012 at 4:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>> On Tue, 2012-01-24 at 11:03 +0100, Manuel Klimek wrote:
>>>>> The attached patch adds support to run clang tools (FrontendActions)
>>>>> as standalone tools, or repeatedly in-memory in a process.
>>>>> This is useful for unit testing, map-reduce-style applications, source
>>>>> transformation daemons, and command line tools.
>>>>
>>>> I am also interested in having this kind of functionality. A few quick
>>>> comments:
>>>>
>>>> 1. The coding standards say that function names should begin with a
>>>> lower-case letter.
>>>
>>> Done. I jumped on the opportunity to dogfood refactoring support in
>>> our current tooling branch and wrote a script that changed all
>>> incorrectly named functions automatically (and created a sed-script to
>>> post-produce comment changes, which made me notice a bug in a
>>> comment).
>>>
>>>> 2. The comments contain several references to CMake; what, if anything,
>>>> in this patch is tied to CMake, or designed to be compatible with CMake?
>>>>
>>>> 2b.
>>>>
>>>>
>>>>> +/// \param JsonDatabase A JSON formatted list of compile commands.
>>>>> This lookup
>>>>> +/// command supports only a subset of the JSON standard as written by
>>>>> CMake.
>>>>>
>>>>
>>>> Please be more verbose here. What is not supported?
>>>>
>>>> Generally, I think that it would be helpful for you to provide a
>>>> paragraph or two explaining how this extension is to be used, what kind
>>>> of things can be specified in JSON inputs, how this ties into CMake (or
>>>> not), etc. with a few small examples. Some of this can be gleaned from
>>>> the test case, but some nicely-formatted text (without all of the
>>>> escaping) would, IMHO, be earlier to read.
>>>
>>> Hopefully better expressed now. Please let me know if you want more /
>>> different details.
>>>
>>> Thanks a lot for the review!
>>> /Manuel
>>>
>>>>
>>>>  -Hal
>>>>
>>>>>
>>>>> Cheers,
>>>>> /Manuel
>>>>>
>>>>> Rietveld link:
>>>>> http://codereview.appspot.com/5570054/
>>>>> _______________________________________________
>>>>> cfe-commits mailing list
>>>>> cfe-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>
>>>> --
>>>> Hal Finkel
>>>> Postdoctoral Appointee
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>>
>>
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits




More information about the cfe-commits mailing list