[cfe-commits] [PATCH] Implements support to run standalone tools
nobled
nobled at dreamwidth.org
Tue Jan 31 13:20:25 PST 2012
On Tue, Jan 31, 2012 at 3:07 PM, Manuel Klimek <klimek at google.com> wrote:
> On Tue, Jan 31, 2012 at 8:31 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> On Tue, Jan 31, 2012 at 11:16 AM, Manuel Klimek <klimek at google.com> wrote:
>>> On Tue, Jan 31, 2012 at 4:34 PM, nobled <nobled at dreamwidth.org> wrote:
>>>> On Mon, Jan 30, 2012 at 2:49 PM, Manuel Klimek <klimek at google.com> wrote:
>>>>> Ping + a re-based version of the patch.
>>>>>
>>>>
>>>>
>>>> This looks really cool. Aside from the create()/New() half-rename confusion,
>>>> just some minor nits:
>>>>
>>>>> +/// \brief Returns a new FrontendActionFactory for any type that provides an
>>>>> +/// implementation of NewFrontendAction().
>>>>> +///
>>>>> +/// FactoryT must implement: FrontendAction *NewFrontendAction().
>>>>> +///
>>>>> +/// Example:
>>>>> +/// struct ProvidesFrontendActions {
>>>>> +/// FrontendActionFactory *NewFrontendAction();
>>>> You meant "FrontendAction *" here, right?
>>>
>>> Yep, done. Thx for the catch.
>>>
>>>>> +/// } Factory;
>>>>> +/// FrontendActionFactory *FactoryAdapter =
>>>>> +/// newFrontendActionFactory(&Factory);
>>>>> +template <typename FactoryT>
>>>>> +FrontendActionFactory *newFrontendActionFactory(FactoryT *ActionFactory);
>>>>> +
>>>>> +/// \brief Runs (and deletes) the tool on 'Code' with the -fsynatx-only flag.
>>>> syntax*
>>>
>>> Done.
>>>
>>>>> +///
>>>>> +/// \param ToolAction The action to run over the code.
>>>>> +/// \param Code C++ code.
>>>>> +///
>>>>> +/// \return - True if 'ToolAction' was successfully executed.
>>>>> +bool runSyntaxOnlyToolOnCode(
>>>>> + clang::FrontendAction *ToolAction, llvm::StringRef Code);
>>>>> +
>>>>> +/// \brief Converts a vector<string> into a vector<char*> suitable to pass
>>>>> +/// to main-style functions taking (int Argc, char *Argv[]).
>>>>> +std::vector<char*> commandLineToArgv(const std::vector<std::string> *Command);
>>>> This looks like it can just take an ArrayRef<const std::string>...
>>>
>>> Done.
Oops, sorry for confusing you with the const thing. My bad.
>>>
>>>>> +/// \see JsonCompileCommandLineDatabase
>>>>> +CompileCommand findCompileArgsInJsonDatabase(
>>>>> + llvm::StringRef FileName, llvm::StringRef JsonDatabase,
>>>>> + std::string &ErrorMessage);
>>>> Just a note on the whole patch: you don't need the llvm:: prefix on StringRef
>>>> or ArrayRef, or a few other common types in clang code anymore. You just have
>>>> to include "clang/Basic/LLVM.h" to use them inside the clang namespace.
>>>
>>> Cool, done.
>>>
>>>>> + /// \brief Returns the file manager used in the tool.
>>>>> + ///
>>>>> + /// The file manager is shared between all translation units.
>>>>> + FileManager &getFiles() { return Files; }
>>>>> +
>>>>> + private:
>>>>> + /// \brief Add translation units to run the tool over.
>>>>> + ///
>>>>> + /// Translation units not found in JsonDatabaseDirectory (see constructore)
>>>> constructor*
>>>
>>> Done.
>>>
>>>>> + /// will be skipped.
>>>>> + void addTranslationUnits(
>>>>> + llvm::StringRef JsonDatabaseDirectory,
>>>>> + llvm::ArrayRef<std::string> TranslationUnits);
>>>>> +
>>>>> + // We store command lines as pair (file name, command line).
>>>>> + typedef std::pair< std::string, std::vector<std::string> > CommandLine;
>>>>> + std::vector<CommandLine> CommandLines;
>>>>> +
>>>>> + FileManager Files;
>>>>> + // Maps <file name> -> <file content>.
>>>>> + std::map<std::string, std::string> MappedFileContents;
>>>>> +};
>>>>> +
>>>>> +template <typename T>
>>>>> +FrontendActionFactory *newFrontendActionFactory() {
>>>>> + class SimpleFrontendActionFactory : public FrontendActionFactory {
>>>>> + public:
>>>>> + virtual clang::FrontendAction *New() { return new T; }
>>>> You mean "create()"?
>>>
>>> Yep, missing tests... Added.
>>>
>>>>> + };
>>>>> +
>>>>> + return new SimpleFrontendActionFactory;
>>>>> +}
>>>>> +
>>>>> +template <typename FactoryT>
>>>>> +FrontendActionFactory *newFrontendActionFactory(FactoryT *ActionFactory) {
>>>>> + class FrontendActionFactoryAdapter : public FrontendActionFactory {
>>>>> + public:
>>>>> + explicit FrontendActionFactoryAdapter(FactoryT *ActionFactory)
>>>>> + : ActionFactory(ActionFactory) {}
>>>>> +
>>>>> + virtual clang::FrontendAction *New() {
>>>> "create()" here, too... It looks like these templates aren't covered by the
>>>> unittest/getting instantiated at all.
>>>
>>> Yep, correct. This is tested by other tests in our branch, but it
>>> makes sense to test them independently as part of this layer, too.
>>> Added a unit test.
>>>
>>>> diff --git a/lib/Tooling/CMakeLists.txt b/lib/Tooling/CMakeLists.txt
>>>> new file mode 100644
>>>> index 0000000..b0e1235
>>>> --- /dev/null
>>>>> +++ b/lib/Tooling/CMakeLists.txt
>>>> @@ -0,0 +1,6 @@
>>>>> +set(LLVM_LINK_COMPONENTS support)
>>>>> +SET(LLVM_USED_LIBS clangBasic clangFrontend clangAST clangRewrite)
>>>> Here and in unittest/Tooling/Makefile, you set a dependency on the Rewrite lib,
>>>> but I don't see any Rewrite/ includes. Is there a reason for that?
>>>
>>> I forgot to rip them out when I separated the dependencies from the
>>> refactoring library in our branch (upcoming super awesome feature ;).
>>> Removed.
The other one is still there:
diff --git a/unittests/Tooling/Makefile b/unittests/Tooling/Makefile
new file mode 100644
index 0000000..4c5ba76
--- /dev/null
+++ b/unittests/Tooling/Makefile
@@ -0,0 +1,17 @@
+##===- unittests/Tooling/Makefile --------------------------*-
Makefile -*-===##
+#
+# The LLVM Compiler Infrastructure
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+#
+##===----------------------------------------------------------------------===##
+
+CLANG_LEVEL = ../..
+TESTNAME = Tooling
+LINK_COMPONENTS := support mc
+USEDLIBS = clangTooling.a clangFrontend.a clangSerialization.a clangDriver.a \
+ clangRewrite.a clangParse.a clangSema.a clangAnalysis.a \
+ clangAST.a clangLex.a clangBasic.a
+
+include $(CLANG_LEVEL)/unittests/Makefile
>>>
>>>>> +bool ToolInvocation::run() {
>>>>> + const std::vector<char*> Argv = commandLineToArgv(&CommandLine);
>>>>> + const char *const BinaryName = Argv[0];
>>>>> + DiagnosticOptions DefaultDiagnosticOptions;
>>>>> + TextDiagnosticPrinter DiagnosticPrinter(
>>>>> + llvm::errs(), DefaultDiagnosticOptions);
>>>>> + DiagnosticsEngine Diagnostics(llvm::IntrusiveRefCntPtr<clang::DiagnosticIDs>(
>>>>> + new DiagnosticIDs()), &DiagnosticPrinter, false);
>>>>> +
>>>>> + const llvm::OwningPtr<clang::driver::Driver> Driver(
>>>>> + newDriver(&Diagnostics, BinaryName));
>>>>> + // Since the input might only be virtual, don't check whether it exists.
>>>>> + Driver->setCheckInputsExist(false);
>>>>> + const llvm::OwningPtr<clang::driver::Compilation> Compilation(
>>>>> + Driver->BuildCompilation(llvm::ArrayRef<const char*>(
>>>>> + &Argv[0], Argv.size() - 1)));
>>>> (The fact that the Driver entrypoint need a song and dance to produce
>>>> a char * array
>>>> is annoying, but that's obviously a pre-existing problem with the
>>>> API... I'm hoping
>>>> to fix that in the future.)
>>>>
>>>>> +ClangTool::ClangTool(int argc, char **argv)
>>>>> + : Files((FileSystemOptions())) {
>>>>> + if (argc < 3) {
>>>>> + llvm::outs() << "Usage: " << argv[0] << " <cmake-output-dir> "
>>>>> + << "<file1> <file2> ...\n";
>>>>> + exit(1);
>>>> This looks like a good place to use llvm::report_fatal_error() instead. Same
>>>> for the two other exit(1) calls below.
>>>
>>> Done.
>>>
>>>>
>>>>> + }
>>>>> + addTranslationUnits(argv[1], std::vector<std::string>(argv + 2, argv + argc));
>>>>> +}
>>>>> +
>>>>> +void ClangTool::addTranslationUnits(
>>>>> + llvm::StringRef JsonDatabaseDirectory,
>>>>> + llvm::ArrayRef<std::string> TranslationUnits) {
>>>> TranslationUnits can just be an ArrayRef<StringRef> here.
>>>
>>> We're handing in a vector<string> which we created from argv, argc in
>>> the ClangTool constructor. Is there a magic way to create a
>>> ArrayRef<StringRef> from a vector<string> that can't figure out?
>>
>> I think the implication is that you'd create a vector<StringRef>
>> instead, to match the ArrayRef<StringRef> parameter. Does that
>> work/make sense?
>
> Heh, yea, that obviously makes sense... I should go home now, it's
> getting late :)
>
> Please find the updated patch attached.
>
> Cheers,
> /Manuel
>
>>
>>>
>>>>> + llvm::SmallString<1024> JsonDatabasePath(JsonDatabaseDirectory);
>>>>> + llvm::sys::path::append(JsonDatabasePath, "compile_commands.json");
>>>>> + llvm::OwningPtr<llvm::MemoryBuffer> JsonDatabase;
>>>>> + llvm::error_code Result =
>>>>> + llvm::MemoryBuffer::getFile(JsonDatabasePath, JsonDatabase);
>>>>> + if (Result != 0) {
>>>>> + llvm::outs() << "Error while opening JSON database: " << Result.message()
>>>>> + << "\n";
>>>>> + exit(1);
>>>>> + }
>>>>> + llvm::StringRef BaseDirectory(::getenv("PWD"));
>>>>> + for (unsigned I = 0; I < TranslationUnits.size(); ++I) {
>>>>> + llvm::SmallString<1024> File(getAbsolutePath(
>>>>> + TranslationUnits[I], BaseDirectory));
>>>>> + std::string ErrorMessage;
>>>>> + clang::tooling::CompileCommand LookupResult =
>>>>> + clang::tooling::findCompileArgsInJsonDatabase(
>>>>> + File.str(), JsonDatabase->getBuffer(), ErrorMessage);
>>>>> + if (!ErrorMessage.empty()) {
>>>>> + llvm::outs() << "Error while parsing JSON database: " << ErrorMessage
>>>>> + << "\n";
>>>>> + exit(1);
>>>>> + }
>>>>> + if (!LookupResult.CommandLine.empty()) {
>>>>> + if (!LookupResult.Directory.empty()) {
>>>>> + // FIXME: What should happen if CommandLine includes -working-directory
>>>>> + // as well?
>>>>> + LookupResult.CommandLine.push_back(
>>>>> + "-working-directory=" + LookupResult.Directory);
>>>>> + }
>>>>> + CommandLines.push_back(make_pair(File.str(), LookupResult.CommandLine));
>>>>> + } else {
>>>>> + // FIXME: There are two use cases here: doing a fuzzy
>>>>> + // "find . -name '*.cc' |xargs tool" match, where as a user I don't care
>>>>> + // about the .cc files that were not found, and the use case where I
>>>>> + // specify all files I want to run over explicitly, where this should
>>>>> + // be an error. We'll want to add an option for this.
>>>>> + llvm::outs() << "Skipping " << File << ". Command line not found.\n";
>>>>> + }
>>>>> + }
>>>>> +}
>>>>> +
>>>>> +void ClangTool::mapVirtualFiles(
>>>>> + const std::map<std::string, std::string> &FileContents) {
>>>> At first glance, MappedFileContents in both ToolInvocation and ClangTool
>>>> looks like it can just be a std::map<StringRef,StringRef>. Does it really need
>>>> to have ownership of its own malloc'd copy of the file contents?
>>>
>>> An excellent point. Changed.
>>>
>>>>> + MappedFileContents.insert(FileContents.begin(), FileContents.end());
>>>>> +}
>>>>> +
>>>>> +int ClangTool::run(FrontendActionFactory *ActionFactory) {
>>>> It also looks like the ClangTool class isn't covered by the unittest, either.
>>>
>>> I added ClangCheck.cpp and an integration test. I first wanted to keep
>>> this change smaller, but leaving out clang-check and an integration
>>> test was probably a little over the top...
>>>
>>>> Looking forward to this landing!
>>>
>>> Me too :) Thanks a lot for reviewing!
>>>
>>> Cheers,
>>> /Manuel
>>>
>>>>
>>>>
>>>>> On Wed, Jan 25, 2012 at 3:59 PM, Manuel Klimek <klimek at google.com> wrote:
>>>>>> On Tue, Jan 24, 2012 at 4:43 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>>>> On Tue, 2012-01-24 at 11:03 +0100, Manuel Klimek wrote:
>>>>>>>> The attached patch adds support to run clang tools (FrontendActions)
>>>>>>>> as standalone tools, or repeatedly in-memory in a process.
>>>>>>>> This is useful for unit testing, map-reduce-style applications, source
>>>>>>>> transformation daemons, and command line tools.
>>>>>>>
>>>>>>> I am also interested in having this kind of functionality. A few quick
>>>>>>> comments:
>>>>>>>
>>>>>>> 1. The coding standards say that function names should begin with a
>>>>>>> lower-case letter.
>>>>>>
>>>>>> Done. I jumped on the opportunity to dogfood refactoring support in
>>>>>> our current tooling branch and wrote a script that changed all
>>>>>> incorrectly named functions automatically (and created a sed-script to
>>>>>> post-produce comment changes, which made me notice a bug in a
>>>>>> comment).
>>>>>>
>>>>>>> 2. The comments contain several references to CMake; what, if anything,
>>>>>>> in this patch is tied to CMake, or designed to be compatible with CMake?
>>>>>>>
>>>>>>> 2b.
>>>>>>>
>>>>>>>
>>>>>>>> +/// \param JsonDatabase A JSON formatted list of compile commands.
>>>>>>>> This lookup
>>>>>>>> +/// command supports only a subset of the JSON standard as written by
>>>>>>>> CMake.
>>>>>>>>
>>>>>>>
>>>>>>> Please be more verbose here. What is not supported?
>>>>>>>
>>>>>>> Generally, I think that it would be helpful for you to provide a
>>>>>>> paragraph or two explaining how this extension is to be used, what kind
>>>>>>> of things can be specified in JSON inputs, how this ties into CMake (or
>>>>>>> not), etc. with a few small examples. Some of this can be gleaned from
>>>>>>> the test case, but some nicely-formatted text (without all of the
>>>>>>> escaping) would, IMHO, be earlier to read.
>>>>>>
>>>>>> Hopefully better expressed now. Please let me know if you want more /
>>>>>> different details.
>>>>>>
>>>>>> Thanks a lot for the review!
>>>>>> /Manuel
>>>>>>
>>>>>>>
>>>>>>> -Hal
>>>>>>>
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> /Manuel
>>>>>>>>
>>>>>>>> Rietveld link:
>>>>>>>> http://codereview.appspot.com/5570054/
>>>>>>>> _______________________________________________
>>>>>>>> cfe-commits mailing list
>>>>>>>> cfe-commits at cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>>>
>>>>>>> --
>>>>>>> Hal Finkel
>>>>>>> Postdoctoral Appointee
>>>>>>> Leadership Computing Facility
>>>>>>> Argonne National Laboratory
>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> cfe-commits mailing list
>>>>> cfe-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>>>
>>>
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>>>
More information about the cfe-commits
mailing list