[cfe-dev] Cross Translation Unit Support in Clang

Thu Jun 29 22:31:59 PDT 2017

On Thu, Jun 29, 2017, 9:23 PM Noah L via cfe-dev <cfe-dev at lists.llvm.org>
wrote:

> (Oops, I meant to start a new thread :) Thanks for your response.
>
> On Thu, Jun 29, 2017 at 12:01 AM, Gábor Horváth <xazax.hun at gmail.com>
> wrote:
> ...
>
>
>
>> Well, as far as I understand, the idiomatic way to handle this right now
>> is not to apply your modifications to the source files right away. First
>> export all the changes for each translation unit into a separate file, and
>> apply all the changes at once after the tool has run. If all your changes
>> are consistent, that should work well. If you would transform a header in a
>> different way in two different translation units, that is a problem. I think
>> the best way to solve that right now is to make sure all of your edits are
>> consistent.
>>
>
> So then I couldn't use clang::Rewriter to apply the modifications. I would
> have to somehow obtain the line and column numbers of the
> clang::SourceRange to be replaced and do it manually? Is that how other
> people do it?
>

Yes. LibTooling replacements are decoupled from the source manager for that
reason and do exactly that job.
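
To make the idea concrete, here is a minimal sketch in plain Python (not the
libTooling API) of what "decoupled from the source manager" buys you. Each
edit is just a file path, a byte offset, a length, and the new text, which is
the same information a clang::tooling::Replacement carries. The Edit class,
the helper names, and the "ArrayIter<int>" replacement type are all
hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One replacement, independent of any AST or source manager (hypothetical type)."""
    path: str
    offset: int
    length: int
    text: str


def merge_edits(per_tu_edits):
    """Union the edits exported by every translation unit.

    Identical edits to a shared header collapse into one because Edit is hashable.
    """
    merged = set()
    for edits in per_tu_edits:
        merged.update(edits)
    return sorted(merged, key=lambda e: (e.path, e.offset, e.length, e.text))


def apply_edits(source, edits):
    """Apply the edits for a single file, rejecting overlapping ranges."""
    ordered = sorted(edits, key=lambda e: e.offset)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.offset + prev.length > cur.offset:
            raise ValueError(
                f"conflicting edits at offsets {prev.offset} and {cur.offset}")
    # Work from the end of the file so earlier offsets stay valid.
    for e in reversed(ordered):
        source = source[:e.offset] + e.text + source[e.offset + e.length:]
    return source


header = "int *p;\nint *q;\n"
# Two translation units include the same header and export their edits.
tu_a = [Edit("x.h", 0, 5, "ArrayIter<int> ")]
tu_b = [Edit("x.h", 0, 5, "ArrayIter<int> "), Edit("x.h", 8, 5, "ArrayIter<int> ")]

merged = merge_edits([tu_a, tu_b])
result = apply_edits(header, merged)
```

The duplicate edit to the first declaration is deduplicated, and two edits
that disagree about the same range are reported as a conflict instead of
being silently applied, which is the "make sure your edits are consistent"
requirement from above.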

> So, in my case the modifications are never contradictory, but each one may
> be insufficient in extent due to information not available in the
> translation unit. For example, one thing the tool does is modify the type
> declaration of a pointer depending on if a) it is being used as an array
> pointer/iterator, and b) if it is being used as a pointer to a dynamically
> sized array (i.e. basically if the array is, or could be, realloc()ed). You
> could imagine a situation where in one source file we see that a pointer
> target is realloc()ed, but we cannot determine whether or not the target is
> being used as an array (in which case no modification is made), and in
> another source file we can see that the target (of the same pointer) is
> (being used as) an array, but there is no indication of how the array was
> allocated (or realloc()ed). Without combining the observations from both
> source files, we cannot determine that the pointer target is a dynamically
> sized array (and hence won't be able to apply the (fully) correct
> modification).
>
> And analyzing the modifications resulting from each source file processed
> individually is not enough either (since the first case will not result in
> a modification). So I would have to export more information than just the
> modifications determined from processing each file individually.
>

> But in its most general form, that extra information is essentially the
> AST, no? (Or some subset of the AST that's relevant to me.) So basically
> you're suggesting that I export (some simplified version of) the ASTs to
> disk and combine them manually. And then analyze the combined (simplified)
> AST to come up with the correct modifications. Which I'd then have to apply
> manually.
>

The idea is to export just the information you need, keyed on the thing you
want to change. That way, you can fully parallelize both parsing and
postprocessing in a large code base.
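
As a rough sketch of what that could look like for the realloc()/array
example above (plain Python, not tied to any Clang API): each translation
unit exports a small map from a declaration key to the facts it observed, the
maps are merged by set union, and the decision is made only on the merged
result. The key strings, the fact names, and the decision labels are all
made up for illustration.

```python
from collections import defaultdict
from functools import reduce


def merge_facts(a, b):
    """Union two per-declaration fact maps.

    The union is associative and commutative, so partial results can be
    combined in any order, including in parallel.
    """
    out = defaultdict(set)
    for part in (a, b):
        for key, facts in part.items():
            out[key] |= facts
    return dict(out)


def decide(facts):
    """Pick a rewrite for one pointer from the facts seen across all TUs."""
    if "used_as_array" in facts and "realloced" in facts:
        return "dynamic_array"
    if "used_as_array" in facts:
        return "array_iterator"
    return None  # not enough evidence, leave the declaration alone


# Hypothetical per-translation-unit exports, keyed on the declaration to change.
tu_a = {"pool.h:buf": {"realloced"}}
tu_b = {"pool.h:buf": {"used_as_array"}, "pool.h:cursor": {"used_as_array"}}

merged = reduce(merge_facts, [tu_a, tu_b], {})
decisions = {key: decide(facts) for key, facts in merged.items()}
```

Looking at either translation unit alone, "pool.h:buf" would get no change or
only the weaker array_iterator change; only the merged facts identify it as a
dynamically sized array.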

> I'm not saying it's not feasible. I'm not even saying it's not reasonable.
> But you can see why it'd be nicer for me if libTooling could just present
> me with a combined AST. :)
>

Unfortunately, that only scales to small projects in the general case.
The static analyzer gets away with it because it only loads things close to
the function it analyzes, instead of needing a global view.

>
>
>>
>>
>>>
>>> So this means to get the final converted version of the header file you
>>> have to merge the modifications made by each conversion operation. And
>>> sometimes the modifications are made on the same line so the merge tool
>>> can't do the merge automatically. (At least meld doesn't.) This is really
>>> annoying.
>>>
>>> Now, if libTooling were able to operate on the AST of the entire project
>>> at once this problem would go away. Or if you think the AST of the whole
>>> project would often be too big, then at least multiple specified
>>> translation units at a time would help.
>>>
>>
>> This is not the use case this functionality was designed for. I think you
>> could definitely use it for something like that, but I think the solution I
>> mentioned above is superior right now. The main use case we wanted to cover
>> is the ability to gather some information from other translation units that
>> is required for your tool.
>>
>
> So, are you saying that the proposed changes would allow me, in a
> straightforward way, to obtain a (complete) AST spanning multiple
> translation units? (Where an element in a header file will have only one
> AST node even when the header file is included by multiple translation
> units?)
>
> Noah
>