[clang] [Bounds Safety][NFC] Add `SemaBoundsSafety` class and move existing Sema checks there (PR #98954)

Vlad Serebrennikov via cfe-commits cfe-commits at lists.llvm.org
Tue Jul 16 07:09:25 PDT 2024


Endilll wrote:

> > Unfortunately, I think this should be held off until we have a bigger picture for the future of `Sema`. I'll elaborate below.
> > If you take a close look at the existing set of `Sema` parts, it's almost entirely comprised of other languages or language extensions (from C and C++ standpoint) and backends. They were considered natural enough to be split off `Sema`, which, among other things, means that it's easy to explain their place in the big picture, like I did in the previous sentence.
> 
> As noted above `-fbounds-safety` **is a C language extension** which makes it seem like it would fit nicely into the existing division of Sema into multiple objects and relevant source files.

No, it don't fit nicely into the division, which is the reason we're having this discussion.

> > I don't think this patch makes it easier to explain people where things are in this codebase, because (I think) `counted_by` is a declaration attribute, and we already have place for them. Having only one function also makes it a very small part of `Sema`, which doesn't sound compelling.
> 
> It doesn't right now because most of `-fbounds-safety` implementation isn't upstream yet. I simply moved what is **currently upstream**. As we (Apple) upstream more and more of the implementation it makes a lot sense to try to keep as much of it in its own source file to:
> 
>     * Create a clean separation between `-fbounds-safety` Sema checks and everything else. This will ultimately make exploring the code easier once a significant amount of our implementation has been upstreamed.
> 
>     * Avoid growing all the existing `Sema*.cpp` files unnecessarily. They are already huge and we don't want to make the problem worse when there's an easy option available to avoid that.

You can have `SemaBoundsSafety.cpp` and your own "section" inside `Sema.h`, like C++ features do (like templates or lambdas). There's no reason to make separation physical by introducing `SemaBoundsSafety` class from the get-go.

> > A bigger picture this fits in would be a compelling argument (like it was for small backend parts), but to my knowledge there's none.
> 
> The `counted_by` attribute is just one of several attributes that are part of the `-fbounds-safety` language extension which is documented here https://clang.llvm.org/docs/BoundsSafety.html. That's "bigger picture" for the feature, or did you mean something else by "bigger picture this fits in"?

A bigger picture is "what are we going to do with the rest of the attributes in `SemaDeclAttr.cpp`?". Even bigger one is "what do we do about SemaDecl, SemaExpr, SemaDeclCXX, and SemaExprCXX?"

> > Coming up with one and getting maintainers on board is certainly out of scope of this PR, so don't feel obligated to do that. Hopefully I'll get back to this whole topic later.
> 
> My end goal is to have "somewhere" to put all of the Sema code that supports `-fbounds-safety` that we will upstream. I think it's very much preferable to have this code go into its own source file and header file for the reasons I gave above.
> 
> We don't have to use the sub-class `SemaBase` mechanism that I've used here if this is really a problem. However, the mechanism does seem like a good fit for `-fbounds-safety` and would be a shame not to use it.

> > "Extension" is definitely quite broad. What I meant are (basically) languages that are based off C or C++. Would you argue that `-fbounds-safety` fits into a set of OpenMP, OpenACC, CUDA, HLSL, Objective-C, and Swift?
> 
> In my opinion it fits in the set because it is a (pretty large) C language extension.
> 
> > > It's currently in the process of being upstreamed which is why it's small, but it'd be easier to split now than wait until it's reached a certain size, if we do want to split it eventually.
> > 
> > 
> > I did that a lot with other parts of `Sema`, and it's not that hard, unless it grows comparable to `SemaExprCXX.cpp` (which I consider unlikely to happen). I'd rather see this being upstreamed into the existing `Sema` structure as it has been planned all along, and use it as an input for the design of further `Sema` splitting, rather than committing today that `SemaBoundsSafety` is going to be a thing.
> 
> I don't agree with this approach. I outlined above why I think it makes a lot of sense to keep the `-fbounds-safety` Sema code in its own source file and blocking doing that on a _potential future refactor_ that might never happen doesn't seem like the right approach to me.

I have to say that `-fbounds-safety` upstreaming is in the same boat of something that might never happen.

> If at some point we come up with some new design for Sema we can easily move the `-fbounds-safety` code out of its own source file (and class) and into the relevant locations required by the redesign. If the `-fbounds-safety` code is littered all over the other Sema files the _potential future refactor_ could be much harder because finding all `-fbound-safety` code now is much more time consuming.

That's why I propose to follow long-established practice of doing `SemaBoundsSafety.cpp`, and move that around later. What I'd like to evaluate before deciding on `SemaBoundsChecking` is how big its interface is (what would be exposed via `SemaBoundsChecking` class,)

https://github.com/llvm/llvm-project/pull/98954


More information about the cfe-commits mailing list