[cfe-dev] RFC: Easier AST Matching by Default

Sam McCall via cfe-dev cfe-dev at lists.llvm.org
Wed May 27 03:01:19 PDT 2020


On Wed, May 27, 2020 at 11:34 AM Stephen Kelly <steveire at gmail.com> wrote:

>
>
> On Wed 27 May 2020, 09:25 Sam McCall, <sammccall at google.com> wrote:
>
>> On Mon, May 25, 2020 at 11:56 PM Stephen Kelly via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> On 20/12/2019 21:01, Stephen Kelly via cfe-dev wrote:
>>> >
>>> > Hi,
>>> >
>>> > (Apologies if you receive this twice. GMail classified the first one
>>> > as spam)
>>> >
>>> > Aaron Ballman and I met by chance in Belfast and we discussed a way
>>> > forward towards making AST Matchers easier to use, particularly for
>>> > C++ developers who are not familiar with the details of the Clang AST.
>>> >
>>> > For those unaware, I expanded on this in the EuroLLVM conference this
>>> > year, and then expanded on it at ACCU:
>>> >
>>> >   https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
>>> >
>>> > One step in the process of getting there is changing the default
>>> > behavior of AST Matchers to ignore invisible nodes while matching
>>> > using the C++ API, and while matching and dumping AST nodes in
>>> > clang-query.
>>> >
>>> > I think this is the most important change in the entire proposal as it
>>> > sets out the intention of making the AST Matchers easier to use for
>>> > C++ developers who are not already familiar with Clang APIs.
>>> >
>>> > To that end, I've written an AST to motivate the change:
>>> >
>>> >
>>> >
>>> https://docs.google.com/document/d/17Z6gAwwc3HoRXvsy0OdwU0X5MFQEuiGeSu3i6ICOB90
>>> >
>>> >
>>> > We're looking for feedback before pressing forward with the change. I
>>> > already have some patches written to port clang-tidy and unit tests to
>>> > account for the change of default.
>>>
>>>
>>> This change is now in master.
>>>
>>
>> Thanks for simplifying the matchers! Unfortunately I missed the
>> discussion on this.
>>
>
> That's a pity. This thread was bumped a few times since December.
>
Yeah, this is a personal problem of mine: cfe-dev has enough traffic and
covers enough topics that I find it too distracting to read continuously.
I try to check in occasionally, but missed this thread.


> Unfortunately this is causing widespread problems in downstream tools that
>> are difficult to track:
>>
>
> Sorry this is causing problems in downstream tools.
>
>  - this is a prominent API that is used in many places
>>  - the semantics have changed subtly but pervasively, so all usage sites
>> need to be audited, but because there was no API syntax change (e.g. build
>> break), it's not trivial to find all the affected locations to audit
>>
>
> Is it trivial to set the TraversalKind presumptively on the
> ParentMapContext early in the runtime of the downstream tool? The
> ParentMapContext is available via your ASTContext.
>
> I ask because reading your email it doesn't look like you considered this
> option? If you did consider it and rejected it, please say more about why
> it is not an option.
>
> That way you don't have to try to audit any matchers yet. You would just
> set the TraversalKind to AsIs in the entire tool.
>
Right, I should have addressed this.
The first problem is that we don't have a really solid way of identifying
all the downstream tools that are affected. There are certainly hundreds of
these.
Another problem is tools that mix matchers defined upstream with those
defined downstream, such as clang-tidy, where there are large numbers of
checks defined with various owners.

For this reason I'd been assuming the right workaround was to wrap the
matchers in traverse() everywhere. Maybe we should consider clang-tidy and
other tools as separate problems - if it's *just* clang-tidy then wrapping
all of the matchers might be manageable (with the bug fixed). And for other
tools I think we can probably flip the mode, if we can find all the tools
and find an appropriate place in the code to do so.
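
To make the two workarounds concrete, here is a toy model in C++. It is a
sketch, not Clang's implementation: Node, Context, Matcher, has, traverse and
matches below are hypothetical stand-ins that only borrow names from the
matcher API, and Mode stands in for TraversalKind. It shows why a matcher
written against the old default (one that names an implicit node) silently
stops matching under the new default, and how either pinning the mode per
matcher or flipping it once per tool restores the old result.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model only: hypothetical stand-ins, not Clang's real classes.
enum class Mode { AsIs, IgnoreUnlessSpelledInSource };

struct Node {
  std::string kind;
  bool implicit;               // true for nodes not spelled in source
  std::vector<Node> children;
};

struct Context {
  // Models the new tool-wide default discussed in this thread.
  Mode mode = Mode::IgnoreUnlessSpelledInSource;
};

struct Matcher {
  std::string childKind;
  std::optional<Mode> pinned;  // set only when wrapped in traverse()
};

Matcher has(std::string kind) { return {std::move(kind), std::nullopt}; }

// Per-matcher workaround: pin the traversal mode at the usage site.
Matcher traverse(Mode mode, Matcher inner) {
  inner.pinned = mode;
  return inner;
}

// In IgnoreUnlessSpelledInSource mode, implicit children are transparent:
// they never match themselves, and their children are treated as direct.
bool hasChild(const Node &parent, const std::string &kind, Mode mode) {
  for (const Node &child : parent.children) {
    if (mode == Mode::IgnoreUnlessSpelledInSource && child.implicit) {
      if (hasChild(child, kind, mode))
        return true;
      continue;
    }
    if (child.kind == kind)
      return true;
  }
  return false;
}

bool matches(const Matcher &m, const Node &n, const Context &ctx) {
  return hasChild(n, m.childKind, m.pinned.value_or(ctx.mode));
}

// Roughly the shape of `double d = 1;`: the literal sits under an implicit cast.
Node sampleTree() {
  Node literal{"IntegerLiteral", false, {}};
  Node cast{"ImplicitCastExpr", true, {literal}};
  return Node{"VarDecl", false, {cast}};
}

void checkWorkarounds() {
  Node tree = sampleTree();
  Context ctx;                                  // new default
  Matcher oldStyle = has("ImplicitCastExpr");   // written for the old default

  assert(!matches(oldStyle, tree, ctx));                      // silently broken
  assert(matches(traverse(Mode::AsIs, oldStyle), tree, ctx)); // per-matcher fix

  ctx.mode = Mode::AsIs;                                      // tool-wide fix
  assert(matches(oldStyle, tree, ctx));
}
```

In this model the per-matcher pin is local but has to be repeated at every
usage site, while flipping the context is a single line that only helps once
every affected tool has been found.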


>  - the differences between old and new only occur when running against
>> certain code patterns that often aren't covered by tests, therefore each
>> audit is a lot of work
>>
>
> If that audit can be deferred, does the situation change?
>
Yes - for a start it's now safe to upgrade LLVM again :-)
We usually integrate LLVM head daily, and when this process gets blocked it
affects a large number of developers in unrelated areas (say, MLIR).

 - because the default was flipped globally, all usage sites must be dealt
>> with at once
>>
>
> Does this become manageable if you flip the setting locally in your tool
> in your ParentMapContext?
>
>  - the "revert-to-old-behaviour" is invasive at all usage sites (hurts
>> readability a lot) and apparently has bugs (D80606)
>>
>
> If changing the TraversalKind in the ParentMapContext in your tool is a
> potential solution, then I can look into the above issue this morning.
>
> Please let me know.
>
+yitzhakm
Yitzhak was also looking into a fix here and may have ideas already.


> In summary: it requires changes in consuming tools that are numerous, hard
>> to find, hard to analyze, must all be done at once, and the changes aren't
>> mechanically reliable.
>>
>
> Hmm, the location to call setTraversalKind on the ParentMapContext in each
> external tool should be easy to find? Should be easy to analyze too. There
> are presumably few places in your tool (or just one place in the entire
> tool) where ASTContexts are created, so "all" is either "1" or close to it?
>
> I expect people building tools against release versions to
>> hit the same problems in a few months.
>>
>
> Perhaps the solution of changing the setting in the ParentMapContext needs
> to be spelled out better in the release notes. It would also be good to
> know whether doing so makes the impact on your tools manageable. Can you
> look into that?
>
I think relying on the release notes will still miss people who aren't
paying attention, in the same way I didn't spot this early enough on cfe-dev
(well, Dmitri did).
It certainly helps to have good documentation but subtly changing the
meaning of existing code is still dangerous.
I don't know the matcher API details well enough yet to have a concrete
suggestion, but I'm trying to get a better grasp on this.


> We should consider reverting this and finding a less-silent way to roll
>> out the change (like adding it with different syntax, or having no default
>> for a transition period covering a stable release).
>>
>
> Before doing that I'd like to find out more about whether the above
> solution works.
>
> Thanks,
>
> Stephen.
>
>