[PATCH] D140696: [clang][dataflow] Treat unions as structs.

Yitzhak Mandelbaum via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 3 04:55:14 PST 2023


ymandel added a comment.

In D140696#4020209 <https://reviews.llvm.org/D140696#4020209>, @merrymeerkat wrote:

> @ymandel - thank you for pointing all of this out! And no need to apologise at all :)
>
> I agree that the changes made here are not ideal. But the alternative is also not ideal.
>
> The question becomes: is it better to model unions in a way that is lacking, or to not model them at all?
>
> To model unions properly, we would need to be able to set the storage location of all the members of a union to be the same. But storage location is tied to type, so we can't create the same storage location for members of different types. From what I can tell, correctly modelling unions would require us to do an overhaul of how storage locations are implemented.
>
> Like Dmitri said, I think most of the issues with this simple approach to modelling unions would only be revealed with UB code. So, unless we want to compare the storage locations of pointers to different union members (which I don't think will be a very common use), this modelling should work all right for defined behaviour. I added a documentation comment regarding the lacking implementation of storage location.
>
> Coming back to the question above. Yitzhak, you seem to prefer not modelling union at all rather than having this model that falls short. Dmitri seems to prefer a model that works on most cases than no model. I am a bit torn, but I think I'm leaning a bit on the side of having a model that falls short. I don't think we plan on doing the storage location overhaul anytime soon, and until then it'd be good to at least have some functionality.
>
> Happy new year by the way!!

Happy new year!  I should clarify: Dmitri convinced me that this form of modeling is ok for unions. My remaining concern is testing: that we've seen many things that seem ok on paper and then results in difficult-to-debug crashes when run over a large corpus. Some of those could come from the implementation itself (bugs or we reasoned incorrectly about how it works or there is UB in the code that breaks our assumptions, etc.); some of it can come about simply because we've now enabled analysis of a larger range of code paths. So, the prudent approach is to test a change like this thoroughly before committing it. In contrast, the change you need just to avoid the crash is quite small and limited in its scope.

That said, we don't know of any large users depending on this analysis, and we don't (yet) have any automated setup for running over a large corpus, so I'm ok going ahead to unblock you.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140696/new/

https://reviews.llvm.org/D140696



More information about the cfe-commits mailing list