[cfe-dev] RFC: Easier AST Matching by Default

Wed May 27 07:24:05 PDT 2020

Hi Stephen,

First off: I think we've mitigated this enough on our end that we're not in
immediate need of changes beyond fixing the problems found in Transformers.
Thanks for being so engaged on this, it helped a lot clarifying our options
and I think we can help make this smoother for adopters when this gets
released.
I've also had a chance to sync internally with Dmitri, Haojian, Yitzhak and
Manuel and get their perspectives too.

Some brief notes on our local fixes:
Haojian wrote a (matcher/transformer-based!) tool to identify the "roots"
where matchers were passed into consuming clang APIs and mechanically
wrapped them with traverse() at that point.
This was applied across the codebase, whether needed to make tests pass or
not. We have no plans to systematically remove AsIs from these - tool
owners can do this (or not) as part of maintenance.
We chose this approach because:
 - there was preference for expressing this subtle difference next to the
matcher logic rather than "burying" it in tool framework stuff. Otherwise
there's little prospect of ever using the new mode, and copy/pasting
matchers would give mysteriously different results.
 - it was amenable to automated refactoring and thus scaling (more-so than
modifying tool entry points) because there were a few identifiable points
to instrument.
 - applying it to everything meant not worrying about whether tests were
adequate
 - updating all tools to use the new traversal mode probably isn't a good
use of time due to the long tail.

There were ~40 direct uses of ASTContext::getParents(), whose behaviour was
also changed, and these were significantly more work to adapt. (Apparently.
I wasn't the one doing the work :-))
I suppose the trivial/safe fix is to set and restore the mode before each
call, but this seems pretty gross - this doesn't seem like something that
should require mutating global state.

How do you feel about making the old behaviour and the
traversal-mode-respecting version of this function both reasonably
available, with matchers calling the new one?
I don't have a strong opinion on naming/structure, but would prefer
existing getParents() calls to either get the existing behaviour or a build
break forcing a choice.

We had some discussion of the direction of the change overall (I know it's
late for this). Making matchers easier is great. Some concern that you
really have to know the AST to use matchers effectively, and this might
give the illusion you only need to know syntax. But some pushback against
this idea: ignoreImplicit.... is a hassle even for experienced people.

Rest of this message is FYI only, written before syncing with others.

On Wed, May 27, 2020 at 12:42 PM Stephen Kelly <steveire at gmail.com> wrote:

>
>
> On Wed 27 May 2020, 11:01 Sam McCall, <sammccall at google.com> wrote:
>
>>
>> Unfortunately this is causing widespread problems in downstream tools
>>>> that are difficult to track:
>>>>
>>>
>>> Sorry this is causing problems in downstream tools.
>>>
>>>  - this is a prominent API that is used in many places
>>>>  - the semantics have changed subtly but pervasively, so all usage
>>>> sites need to be audited, but because there was no API syntax change (e.g.
>>>> build break), it's not trivial to find all the affected locations to audit
>>>>
>>>
>>> Is it trivial to set the TraversalKind presumptively on the
>>> ParentMapContext early in the runtime of the downstream tool? The
>>> ParentMapContext is available via your ASTContext.
>>>
>>> I ask because reading your email it doesn't look like you considered
>>> this option? If you did consider it and rejected it, please say more about
>>> why it is not an option.
>>>
>>> That way you don't have to try to audit any matchers yet. You would just
>>> set the TraversalKind to AsIs in the entire tool.
>>>
>> Right, I should have addressed this.
>> The first problem is that we don't have a really solid way of identifying
>> all the downstream tools that are affected.
>>
>
> Can you just change the setting for all of your downstream tools?
> (Excepting that clang-tidy might be a different beast in which you have to
> set it for each downstream check)
>
We don't have a great inventory of these: it's hundreds of tools written by
dozens of people. I'm just a guy trying to update the version of LLVM we
all depend on :-)

There are certainly hundreds of these.
>>
>
> Are all of these hundreds of tools independent? Or do they share some
> abstraction or common setup code or initialization code which lives in a
> library?
>
Most seem to use MatchFinder (237 construction sites). Many of these (~180)
use newFrontendActionFactory(MatchFinder), unfortunately these are both
upstream and so can't be modified.
Ultimately these get run by a ToolExecutor (also upstream) though I think
it's the FrontendActionFactory that needs the logic added.
The MatchFinders that aren't passed to newFrontendActionFactory tend to be
doing something complicated - e.g. nested in an outer action that may be a
MatchFinder (no further action needed) or something else.

I'm trying to find out whether the number of places in your code which need
> to be adjusted can be reduced from hundreds to a handful. Any early-enough
> place in the code which you own and which affects all or most of your tools
> would be a good start.
>
Yeah, unfortunately I haven't found one yet - the other constraint is it
needs to be late/internal enough to have access to the AST context.
Everything between argv and matcher callbacks tends to be upstream
framework code.

> Another problem is tools that mix matchers defined upstream with those
>> defined downstream, such as clang-tidy, where there are large numbers of
>> checks defined with various owners.
>>
>> For this reason I'd been assuming the right workaround was to wrap the
>> matchers in traverse() everywhere. Maybe we should consider clang-tidy and
>> other tools as separate problems -
>>
>
> I also suspect that would be helpful.
>
> if it's *just* clang-tidy then wrapping all of the matchers might be
>> manageable (with the bug fixed).
>>
>
> I have a proof of concept patch which adds a virtual method to
> ClangTidyCheck which returns an overriding optional<TraversalKind> to use
> for the check. If you reimplement that method in your downstream checks to
> return AsIs then you don't have to change individual matchers yet.
>
> What do you think?
>
> I have the patch on a computer I will have access to in about 2 hours.
>
I liked this idea, but in our sync nobody else did - too much
action-at-a-distance and too many ways to say the same thing.
So I think we probably wouldn't really use it, rather traverse().
Thanks though (again, I do personally like this approach)

>
> And for other tools I think we can probably flip the mode, if we can find
>> all the tools and find an appropriate place in the code to do so.
>>
>
> Yes.
>
>
>>
>>>  - the differences between old and new only occur when running against
>>>> certain code patterns that often aren't covered by tests, therefore each
>>>> audit is a lot of work
>>>>
>>>
>>> If that audit can be deferred, does the situation change?
>>>
>> Yes - for a start it's now safe to upgrade LLVM again :-)
>> We usually integrate LLVM head daily, and when this process gets blocked
>> it affects a large number of developers in unrelated areas (say, MLIR).
>>
>
> Sorry for the disruption. At least the issue is getting visibility here
> and feedback already indicates that the release notes could be more
> informative.
>
>
>
>>  - because the default was flipped globally, all usage sites must be
>>>> dealt with at once
>>>>
>>>
>>> Does this become manageable if you flip the setting locally in your tool
>>> in your ParentMapContext?
>>>
>>>  - the "revert-to-old-behaviour" is invasive at all usage sites (hurts
>>>> readability a lot) and apparently has bugs (D80606)
>>>>
>>>
>>> If changing the TraversalKind in the  ParentMapContext in your tool is a
>>> potential solution, then I can look into the above issue this morning.
>>>
>>> Please let me know.
>>>
>> +yitzhakm
>> Yitzhak was also looking into a fix here and may have ideas already.
>>
>>
>>> In summary: it requires changes in consuming tools that are numerous,
>>>> hard to find, hard to analyze, must all be done at once, and the changes
>>>> aren't mechanically reliable.
>>>>
>>>
>>> Hmm, the location to call setTraversalKind on the ParentMapContext in
>>> each external tool should be easy to find? Should be easy to analyze too.
>>> There is presumably fewer places in your tool (or just one place in the
>>> entire tool)  where ASTContexts are created, so "all" is either "1" or
>>> close to it?
>>>
>>> I expect this to bite people building tools against release versions to
>>>> hit the same problems in a few months.
>>>>
>>>
>>> Perhaps the solution of changing the setting in the ParentMapContext
>>> needs to be spelled out better in the release notes. It would also be good
>>> to know whether doing so makes the impact on your tools manageable. Can you
>>> look into that?
>>>
>> I think putting the solution in the release notes will hurt people not
>> paying attention in the same way I didn't spot this early enough on cfe-dev
>> (well, Dmitri did).
>>
>
> I'd think that at least if someone upgrades to a new release and their
> code breaks, the they would check the release notes. I realize that doesn't
> scale so we'll for those that "live at HEAD".
>
The problem is *not knowing* if your code breaks: we noticed because some
fairly small fraction of tools started failing tests. I suspect many of
the rest are being broken too, but you don't know until someone runs them
on a particular code pattern and notices.

This is a function of low test coverage: in practice, I don't think it's
feasible for such tools to have exhaustive test coverage when combining
with all C++ language constructs (at least it makes them cost-prohibitive
to develop).
Changes to AST shapes tend to cause similar sorts of breakages, just not
this many at once :-)
Upstream clang-tidy checks are fairly well tested, and IIUC there were
places this change didn't flag.

OTOH, it's important that we somehow get to a state where the best behavior
has the best API, and the worse behavior is hard to get to.

>
> It certainly helps to have good documentation but subtly changing the
>> meaning of existing code is still dangerous.
>> I don't know the matcher API details well enough yet to have a concrete
>> suggestion, but I'm trying to get a better grasp on this.
>>
>
> I agree that compile failures are better than runtime failures. We
> couldn't come up with a better solution.
>
> At least this runtime failure is likely to be "loud" where it occurs.
>
>
> Thanks,
>
> Stephen.
>
>
>>
>>> We should consider reverting this and finding a less-silent way to roll
>>>> out the change (like adding it with different syntax, or having no default
>>>> for a transition period covering a stable release).
>>>>
>>>
>>> Before doing that I'd like to find out more about whether the above
>>> solution works.
>>>
>>> Thanks,
>>>
>>> Stephen.
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20200527/5799c46a/attachment-0001.html>