[cfe-dev] (LibTooling) (scan-build-py) Link commands in the CompilationDatabase JSON

Whisperity via cfe-dev cfe-dev at lists.llvm.org
Tue Mar 14 04:58:04 PDT 2017


Generally, I want the "(>=)three rows" version. Currently our tool
represents these the same way as compile commands are represented,
except that the "file" is not a source file but the object file.

So recording this:

  $ clang++ -c main.cpp a.cpp
  $ clang++ main.o a.o -o main.out

Should result in something like this:

  [
   {"directory": ".", "file": "main.cpp", "command": "clang++ -c main.cpp"},
   {"directory": ".", "file": "a.cpp", "command": "clang++ -c a.cpp"},
   {"directory": ".", "file": "main.o", "command": "clang++ main.o a.o -o main.out"},
   {"directory": ".", "file": "a.o", "command": "clang++ main.o a.o -o main.out"}
  ]

This, to my understanding, complies with the file's specification: it
is clearly visible that the source file for the command is the object
file, and by parsing the command we can be sure that it was a link
command. As long as we can do this concisely, it should be fine. But
I'm not sure which would be the -best available- way to do this.
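
A minimal sketch of the "parse the command to tell it was a link
command" idea, in Python. The name is_link_entry, the extension set and
the flag set are assumptions made for illustration, not part of any
existing tool, and a real implementation would need far more complete
rules:

```python
import shlex
from pathlib import PurePath

# Assumed, non-exhaustive set of suffixes that mark an object-like input.
OBJECT_EXTS = {".o", ".a", ".so"}
# Driver flags that stop before the link step.
NO_LINK_FLAGS = {"-c", "-S", "-E", "-fsyntax-only"}


def is_link_entry(entry):
    """Return True if a database entry looks like a link record.

    An entry counts as a link record when its command does not stop
    before linking and its "file" is an object rather than a source.
    """
    args = shlex.split(entry["command"])
    if NO_LINK_FLAGS & set(args):
        return False
    return PurePath(entry["file"]).suffix in OBJECT_EXTS
```

A tool that only understands compile commands could use such a
predicate to skip the entries it cannot handle.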

But simply running two one-line commands which both compile and link:

 // I purposefully omitted the "-o" argument.
 $ clang++ main.cpp 1.cpp
 $ clang++ main.cpp 2.cpp

Should also show that these files were compiled together. (Right now,
as far as I can see, there is some deduplication happening. And without
this deduplication, clang-check and clang-tidy seem to fail.)

  [
   {"directory": ".", "file": "main.cpp", "command": "clang++ main.cpp 1.cpp"},
   {"directory": ".", "file": "main.cpp", "command": "clang++ main.cpp 2.cpp"},
   {"directory": ".", "file": "1.cpp", "command": "clang++ main.cpp 1.cpp"},
   {"directory": ".", "file": "2.cpp", "command": "clang++ main.cpp 2.cpp"}
  ]
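
As a rough illustration of how a logger could produce entries like
these without deduplicating, here is a Python sketch. The function name
entries_for and the source extension set are hypothetical, and the
parsing is deliberately naive (for example, it does not treat the value
following "-o" specially):

```python
import shlex
from pathlib import PurePath

# Assumed, non-exhaustive set of source file suffixes.
SOURCE_EXTS = {".c", ".cc", ".cpp", ".cxx"}


def entries_for(command, directory="."):
    """Emit one entry per source file named in `command`.

    Every entry keeps the full, unmodified command, so entries that
    came from the same invocation can be matched up later.
    """
    args = shlex.split(command)
    return [
        {"directory": directory, "file": arg, "command": command}
        for arg in args[1:]  # skip the compiler executable itself
        if PurePath(arg).suffix in SOURCE_EXTS
    ]


# Two invocations sharing main.cpp yield four entries, two of them
# for main.cpp, one per invocation.
db = entries_for("clang++ main.cpp 1.cpp") + entries_for("clang++ main.cpp 2.cpp")
```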

This is why I originally wanted and I still want this to NOT be the
default behaviour, but something that is triggered by a switch. If
your tool can support the understanding of linker commands in the
build.json, you flick this switch, and then you (can) expect an output
that contains link commands. Other tools simply do not flick the switch,
and retain the "purely compilation commands only" view.

So what I want to see is the linkage graph represented in the file. By
parsing the file and examining the data in it, I want to be able to
model which translation units and objects were linked together when
the project was built.
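
A sketch of what "modelling which files were linked together" could
look like for a consumer of such a file, in Python. The name
link_groups and the flag set are assumptions for illustration; entries
are grouped by (directory, command), on the assumption that all inputs
of one link step share the same command string:

```python
import shlex
from collections import defaultdict

# Driver flags that stop before the link step.
NO_LINK_FLAGS = {"-c", "-S", "-E", "-fsyntax-only"}


def link_groups(entries):
    """Group the "file" of every linking entry by the command that linked it."""
    groups = defaultdict(set)
    for entry in entries:
        args = shlex.split(entry["command"])
        if NO_LINK_FLAGS & set(args):
            continue  # compile-only step, nothing is linked here
        groups[(entry["directory"], entry["command"])].add(entry["file"])
    return dict(groups)
```

For the first database above this yields a single group containing
main.o and a.o; for the two one-line commands it yields two groups that
both contain main.cpp.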

2017-03-14 12:12 GMT+01:00 Laszlo Nagy <rizsotto.mailinglist at gmail.com>:
> Hi Whisperity,
>
> I thought I understood your question on github, but this email is confusing
> me... Can I ask simple questions to clarify a few things?
>
> Your example is not a valid compilation. Did you try them?
>
>   $ clang -c a.c b.c -o ab.o
>   clang-3.8: error: cannot specify -o when generating multiple output files
>
>   $ clang a.c b.c -o ab.o
>
> The second one compiles iff `a.c` or `b.c` contains `main` implementation.
> Then `ab.o` becomes not an object file, but an executable. So, that's
> already a linking!
>
> Having duplicated entries in a compilation database is not a problem. So, if
> you have the same module multiple times, that's just fine.
>
>   $ clang -c a.c -o a.o
>   $ clang -c a.c -Dkey=value -o a.o
>
> will result in two entries where the "file" attribute is the same.
>
>   $ clang a.c b.c -o ab
>
> As previously explained, this is two compilations and one link step. Current
> tools will generate a compilation database with two entries only.
>
>   [
>    {"directory": ".", "file": "a.c", "command": "cc -c a.c"},
>    {"directory": ".", "file": "b.c", "command": "cc -c b.c"}
>   ]
>
> My earlier understanding was that you want this to be a three-element list.
> Is that correct? Or do you want a single-element list?
>
> Or even simpler, shall we make an entry for this too?
>
>   $ clang a.o b.o -o ab
>
> But then, shall we record linker commands like this?
>
>   $ ld a.o b.o -o ab
>
> Or even this?
>
>   $ ar crf lib.a a.o b.o
>
> How would you represent these commands in the JSON compilation database?
>
> Because my main point was, you need to define this first (via this list,
> with consent and implementation support) so that the tools (cmake,
> intercept-build, etc...) can generate the desired output.
>
> Regards,
> Laszlo
>
> On Tue, Mar 14, 2017 at 8:14 AM, Whisperity via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>>
>> Dear Members,
>> Dear Manuel Klimek and László Nagy (rizsotto),
>>
>> I'm resurrecting an older discussion
>>
>> (http://clang-developers.42468.n3.nabble.com/compilation-db-question-td4054364.html)
>> and following up on my request to include link commands in
>> intercept-build's build.json
>> (https://github.com/rizsotto/scan-build/issues/80). Here at Ericsson
>> we develop the tools CodeChecker and CodeCompass. We use the
>> compilation commands JSON to get the information we need, but neither
>> CMake's generated database (related discussion:
>>
>> http://clang-developers.42468.n3.nabble.com/Extending-CMAKE-EXPORT-COMPILE-COMMANDS-td4024793.html)
>> nor those produced by intercept-build contain linkage commands, which,
>> in certain cases, our tools need.
>> For this reason, we have been supplying our own interceptor, LD-LOGGER
>> (https://github.com/Ericsson/codechecker/tree/master/vendor/build-logger),
>> but it's messy and unmaintained as of now.
>>
>> This is why the request to rizsotto's project has been posted, and he
>> pointed me in the direction of Clang, but I'd like to get some
>> pointers before I delve into changing the code.
>> I've tried some dummy build.jsons and scenarios with the current
>> (today's morning UTC) LibTooling projects such as clang-check and
>> clang-tidy.
>>
>> Let's consider an example simple project, which is compiled (clang++
>> -c a.cpp) and then linked (clang++ a.o -o main.out). This will write
>> TWO entries into the build.json, one with the compile and one with the
>> linker command, and libtooling programs work with it perfectly, the
>> link command (valid as per the Compilation Command Database
>> specification) is not causing any mayhem.
>>
>> Now consider the following build commands in the project.
>>
>>     clang++ a.cpp b1.cpp -o ab1.o
>>     clang++ a.cpp b2.cpp -o ab2.o
>>     clang++ ab1.o c.cpp -o one.out
>>     clang++ ab2.o d.cpp -o two.out
>>
>> If this is logged, either by our tool (with the linkage commands) or
>> via intercept-build, or -even- if I create a valid build.json for this
>> project in an editor, the tools clang-tidy and clang-check fail with
>> the error
>>      error: unable to handle compilation, expected exactly one compiler job
>>
>> Which is understandable, because as of now, a.cpp exists twice in the
>> compile commands. Actually there are four lines, two with a.cpp as
>> file, and one each for b1.cpp and b2.cpp, but only two distinct
>> commands, each appearing twice. Which is the expected result, seeing
>> how the project is built in our example. (This, to my understanding,
>> fits the specification of a CCDb.)
>>
>> My questions are:
>>
>> 1. Is the "only one compiler job" an expectation only standing in
>> tools like clang-tidy and clang-check which want to "query" the proper
>> compilation command line from the build.json and fail on ambiguity if
>> there are more, or is this a more general expectation?
>>
>> 2. Rizsotto said, and I quote
>>
>> "But very little (or none) support for it in the current Clang tooling
>> library. (I would call the compilation database parser in Clang very
>> picky/strict.)
>> Currently I'm busy to merge this code into Clang repository. Would not
>> implement this feature now. [...] I can put more effort into it, when
>> there is a more generic driver from Clang side too. As far as I can
>> see Manuel (one of the guy behind Clang tooling) is supporting it, but
>> lack resource to implement it. (Be the change you want! ;))"
>>
>> Assuming that I implement logging the build commands into
>> intercept-build (or Bear), which are the crucial Clang parts which I
>> should expect to be broken by the fact that linker commands are in the
>> database? Should there be a filter somewhere, in some project of
>> Clang, which filters the link commands on some criteria? (In our
>> tools, we implemented rules based on which we decided whether or not
>> an entry in ld-logger's output is a compile or a link command.)
>>
>> As seen above, to my current understanding, having link commands does
>> not make LibTooling's head spin around --- but having the same file
>> referenced multiple times does, at least for some tools.
>>
>> 3. (This is more directed at Manuel)
>> Did the thought train move forward since November? What is the current
>> consensus on this approach? We would like to increase our tools'
>> support for what is generally used and more maintained in the
>> community.
>>
>>
>> Best regards,
>> Whisperity.
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>


