[PATCH] D131926: [clang-tidy] Fix for bugprone-sizeof-expression PR57167

Chris Hamilton via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Aug 17 14:58:31 PDT 2022


chrish_ericsson_atx added inline comments.


================
Comment at: clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression.cpp:236-238
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
   sum += sizeof(PMyStruct);
-  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: suspicious usage of 'sizeof(A*)'; pointer to aggregate
----------------
aaron.ballman wrote:
> mizvekov wrote:
> > aaron.ballman wrote:
> > > mizvekov wrote:
> > > > chrish_ericsson_atx wrote:
> > > > > mizvekov wrote:
> > > > > > chrish_ericsson_atx wrote:
> > > > > > > njames93 wrote:
> > > > > > > > mizvekov wrote:
> > > > > > > > > Is this change really desirable, or should we put a FIXME here?
> > > > > > > > Not warning on these cases seems like a pretty big red flag, especially the `MyStruct *`.
> > > > > > > Thank you for your comment!  I'm not sure I fully agree that this is a red flag.  I'm inclined to think that whether or not there should be a warning on `MyStruct *` or `PMyStruct` is a pretty subjective.  These are both very common idioms, and are meaningful.  I do appreciate the perspective that `MyStruct *` is just one character different from `MyStruct`, and as such, it might be a typo to ask for sizeof the pointer, when you really wanted sizeof the struct.  But the counter-argument (accidentally asking for sizeof the struct when you really wanted the pointer size) is just as valid-- and the checker obviously cannot warn in that case.   
> > > > > > > 
> > > > > > > I am perfectly fine with kicking the can down the road a bit by replacing the discarded `// CHECK-MESSAGES` directive with a `// FIXME`, as @mizvekov suggested.  I expect that when someone circles back to really deeply reconsider the semantics of this checker, there will be a number of changes (other existing warnings dropped, warnings for new and missed cases added, much better sync between C and C++, as well as intentional consideration of things like __typeof__ (in it's various forms) and decltype, rather than the haphazard coarse-grain matching that seems to be going on now.
> > > > > > I agree with this patch only in so far that:
> > > > > > 
> > > > > > * This is effectively a partial revert of the changes made in https://reviews.llvm.org/rG15f3cd6bfc670ba6106184a903eb04be059e5977
> > > > > > * The checker was pretty much buggy to begin with.
> > > > > > * That change significantly increased the amount of patterns we would accept, but at the same time the existing tests did not pick that up and this was not carefully considered.
> > > > > > * It seems unreasonable that there is effectively no way to shut this warning up per site, except by a NOLINT directive.
> > > > > > 
> > > > > > If the amount of false positives is so high now that this check is unusable, then this is just a question of reverting to a less broken state temporarily.
> > > > > > 
> > > > > > But otherwise we can't leave it in the reverted state either. Without that change to use the canonical type or something similar, there is no reason to suppose any of these test cases would work at all as clang evolves and we improve the quality of implementation wrt type sugar retention.
> > > > > @mizvekov, I agree with your reasoning for saying "we can't leave it in the reverted state either".  But I'm not sure how to ensure that this gets the needed attention.  Should we just file a separate PR on github to track the needed refactoring?  I do not believe I'll have the bandwidth to look at this in the next few months.
> > > > > 
> > > > > In the meantime, assuming the bots are happy with patchset 2, I'll land this as-is later today.
> > > > > 
> > > > > Thank you very much for your feedback!
> > > > Oh please wait for others to review, I don't think landing today is reasonable!
> > > > 
> > > > I am not really a stakeholder for this checker except for that original change.
> > > > 
> > > > I would advise for you to wait for another more responsible reviewer to accept as well before merging, unless this gets stale for a long time and no one else seems to be interested.
> > > > Thank you for your comment! I'm not sure I fully agree that this is a red flag. I'm inclined to think that whether or not there should be a warning on MyStruct * or PMyStruct is a pretty subjective. These are both very common idioms, and are meaningful. I do appreciate the perspective that MyStruct * is just one character different from MyStruct, and as such, it might be a typo to ask for sizeof the pointer, when you really wanted sizeof the struct. But the counter-argument (accidentally asking for sizeof the struct when you really wanted the pointer size) is just as valid-- and the checker obviously cannot warn in that case.
> > > 
> > > I agree with @njames93 that this is a red flag. That check behavior is explicitly documented: https://releases.llvm.org/13.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/bugprone-sizeof-expression.html#suspicious-usage-of-sizeof-a so this change is introducing a different kind of regression.
> > > 
> > > (However, there's no option for disabling that specific situation and I think there probably should be one, eventually.)
> > This revert is almost entirely knocking out this `PointerToStructType` matcher, which matches `sizeof` of any record which is not written as an address-of operator expression.
> > 
> > I think with this revert and after the elaboratedType patch changes, the only case that should still trigger it should be a type substitution for a template argument written as a pointer to Record, because that should canonicalize the pointee without adding any other sugar on it.
> > 
> > It won't match a `pointer-to-sugar-to-record` anymore at all, as it happened before the ElaborateType patch, but before that patch, there was an additional case that would match: A pointer to a struct written without any elaboration, because we would not put any sugar on top of the RecordType.
> > 
> > The documentation does mention the `Suspicious usage of ‘sizeof(A*)’` case, but it only gives as an example the `sizeof-address-of-expression` case, even though that is handled separately in the implementation here and not affected.
> > 
> > So I guess that might explain the contention here about the seriousness of this regression.
> > The documentation does mention the Suspicious usage of ‘sizeof(A*)’ case, but it only gives as an example the sizeof-address-of-expression case, even though that is handled separately in the implementation here and not affected.
> >
> > So I guess that might explain the contention here about the seriousness of this regression.
> 
> Yeah, those docs only show the expression case and not the type case. That's an interesting point! It also says `These cases may occur because of explicit cast or implicit conversion.` which sure implies that this is an expression check rather than a type check.
> 
> I dug through the history to find the original review for this feature to see if this was discussed. That review is: https://reviews.llvm.org/D19014 I didn't spot any indication this was meant to diagnose `sizeof(some_type)`.
> 
> So I guess I'm wrong on this being a red flag!
I understand the concerns.  And I agree with @mizvekov's rationale that the documentation doesn't explicitly cover the two pre-existing cases that this change affects.  

To further support that, the documentation describing diags for `sizeof(A*)` were introduced when the checker was introduced in https://reviews.llvm.org/D19014 in 2016.  But the two test cases that are affected by my change were introduced 3 years later in https://reviews.llvm.org/D61260, and documentation was not updated as part of that later change.

That later change by @baloghadamsoftware seems to have been intended to broaden the checker to explicitly catch attempts to take sizeof struct-pointer or values of that type -- and my change largely reverts that behavior, but doesn't affect the original behavior of the checker (as documented).  Looking at the review comments, I don't get a sense that the false positive rate for `sizeof(A*)` was considered carefully, and in particular, there was no discussion of targets with multiple address spaces and different size pointers, respectively, where one would expect to see `sizeof(A*)` done rather commonly, and clearly intentionally.

I reached out separately to Ádám for guidance on this change, and he's included on this review.  I'll wait a bit to give him an opportunity to comment here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131926/new/

https://reviews.llvm.org/D131926



More information about the cfe-commits mailing list