[Lldb-commits] [clang] [clang-tools-extra] [lldb] [clang] Reland: Instantiate alias templates with sugar (PR #101858)

Matheus Izvekov via lldb-commits lldb-commits at lists.llvm.org
Wed Aug 7 13:49:56 PDT 2024


mizvekov wrote:

> We keep track of propagated and inferred nullability annotations in a [side data structure](https://github.com/google/crubit/blob/main/nullability/type_nullability.h#L185). Our approach allows us to save on memory by not actually re-instantiating everything with new sugar.
> 
> But I do have a question - how is the built-in resugarer going to do this without maintaining a side data structure like we do? Clang currently does not have a concept of non-canonical instantiations. `std::vector<int*>`, `std::vector<absl::Nullable<int*>>`, `std::vector<absl::Nonnull<int*>>` are the same canonical type, so they must share one instantiation. Are you also proposing to change that?

I think there might have been a slight miscommunication here, my fault.

The idea being considered is to just instantiate the template using the first sugaring found, using a sugared non-final substitution. That way, any semantic analysis during template instantiation would be able to produce diagnostics relative to that sugaring. But that would depend on the resugarer being fully implemented, so that any uses of that template specialization would be resugared to the correct sugar relative to the point of use. We would also need to enforce that
a template can't be instantiated using any kind of sugaring which could have a semantic effect, like the gcc alignment  attribute when applied to typedef.

As currently things stand, we would have no reason to produce a second instantiation for the same type for a template which already succeeded in instantiation, and which we will resugar later anyway, because that should not produce a different outcome.

But external analysis should be free to pursue that if needed.

> But wouldn't it be better if out-of-tree users like us were not blocked on the priorities of your work? With the `Subst*` nodes in the AST the information is available for out-of-tree users to use in any way they like. Are you sure you can predict all use cases? Do you really want to be in the business of satisfying them?

You are free to propose any AST extension that would help out-of-tree users.

That proposal has to have a reasonable justification.

But one thing I haven't mentioned before is the tradeoffs and reasoning behind the current approach of instantiating alias templates fully sugared, instead of resugaring them later.

The main reason is that as things stood before the current patch, alias templates couldn't be reliably resugared.
We don't track specializations for alias templates, like we do for classes for example.
When we instantiate a class template, any Subst* nodes produced have an AssociatedDecl which references
the class template specialization.
But for alias templates, we don't have a `TypeAliasTemplateSpecialziationDecl`, so the AssociatedDecl references just the template pattern.

This is a problem because you can run into a type which has Subst nodes from different alias template specializations, and as you can't tell them apart, you could end up producing incorrect resugaring.
This can happen because some operations can lose the template specialization type for the alias templates, which is top level sugar, but the Subst node can remain if its present in the children of a canonical node.

> Actually, resugaring in nullability analysis already works for `std::vector::push_back()`:
> As far as I see, libc++ does not actually use the allocator to define `const_reference`, `value_type` and so on:

You are right, I misremembered, that was true only for libstdc++.

> That is, the difficult part and the focus at the moment is not resugaring in the analysis. I do admit that our implementation of resugaring is probably not as comprehensive as the one that you're working on, but it works for the vast majority of real-world code that people write in our monorepo - we tweaked it based on the cases that we actually observed.

I see.
The difficulty is that we want to have an upstream clang resugarer too.
This resugarer would ideally have a negligible performance impact, so that we can always enable it, and all
users would get improved type-as-written in diagnostics without having to do anything else.

We could support external resugarers too, without considering costs, that is not unreasonable.
But if having to support an external resugarer meant we would not be able to afford to have an in-tree one, that seems unreasonable.

I think this patch is the wrong tree to bark on though.
As I said previously, before this patch, instantiation of alias templates did not provide reliable support for resugaring anyway. It might work in limited cases, provided you implement enough workarounds. But the design for it is not right.

Tracking alias template specializations would be a lot more complex and likely higher performance impact.
So it would probably be better if we could find a way your passes could work with this patch.

> Fundamentally, your proposal to reuse the built-in resugarer depends on upstreaming an incomplete language extension, which cuts against Clang's extension policy, https://clang.llvm.org/get_involved.html:
> 
> > Evidence of a significant user community: This is based on a number of factors, including an existing user community

If you are using this internally at google, that seems like evidence that there is an existing user community.

> Let me raise the level of the discussion here. We do appreciate your work on improving resugaring. However, simultaneously removing the `SubstTemplateTypeParmType` is breaking one out-of-tree language extension that is already deployed (nullability) and is blocking research and prototyping on another out-of-tree extension (lifetimes).
> 

In the course of implementing resugaring, there were several deficiencies in upstream clang which had to be corrected.
For example, we didn't have pack indexes, there was no AssociatedDecl, and some other missing things.

I think if I had the idea that I could implement the resugaring completely out-of-tree, these deficiencies would have remained, and consequently work arounds and incomplete resugaring would have to be accepted.

Since you are also using the AssociatedDecl in the type_nnullability analysis, perhaps you have noticed we couldn't
have reliably used just depth and index, like we do for normal template instantiation, because type sugar loss would present a problem.

The type alias is in the same situation here. It's AssociatedDecl is not very useful if it only points to the template pattern.

So I took the course of fixing it in upstream clang, with the aim of having the best outcome for an in-tree resugarer, which is what makes the most sense.

> Is it possible to decouple landing the improvements in resugaring and removal of `SubstTemplateTypeParmType`? The first you can land immediately, but the second blocks our work. Is it separable?

We would still need a distinction between Subst nodes that need to be resugared and those that don't.
Besides the performance impact of keeping these Subst nodes, as they affect uniquing, and they mean there is
more tree traversal needed.

If we want to support out-of-tree resugarers, I think the least bad option here would be perhaps a flag to disable in-tree resugaring.

But that doesn't solve the problem that template type aliases don't have all the necessary design bits for resugaring.
It would be strange if this flag would control what happens there, and also if the flag's on-state would provide a deficient AST, or if we would implement a `TypeAliasTemplateSpecializationDecl` only for the benefit of external resugarers.

> Could you provide a more detailed rationale for removing `SubstTemplateTypeParmType`?
> 

That's mostly explained above.

Since Subst nodes are currently heavy on sugar, and they can appear deeply nested inside the type, they can have a large impact on uniquing these types.

Since we only need to resugar one Subst node once, we need the distinction between having already handled that type, or not.

For a type alias template, there is currently no declaration that can represent its specialization.

> Generally Clang tries to preserve information in the AST as much as possible. Isn't removing this node going against this principle?

As much as possible, within reason, considering performance impact.

There are some things left out because of performance concerns, like:
* Tracking source locations for qualifiers
* Semantic adjustment markers for dropping references and qualifiers.
* Some template argument deduction sugar losses.


https://github.com/llvm/llvm-project/pull/101858


More information about the lldb-commits mailing list