[PATCH] D49114: [clang-tidy] Add a check for "magic numbers"

Sat Jul 28 12:44:26 PDT 2018

aaron.ballman added inline comments.

================
Comment at: docs/clang-tidy/checks/readability-magic-numbers.rst:61-63
+configuration for accepted floating point values, primarily because most
+floating point comparisons are not exact, and some of the exact ones are not
+portable.
----------------
0x8000-0000 wrote:
> lebedev.ri wrote:
> > aaron.ballman wrote:
> > > 0x8000-0000 wrote:
> > > > 0x8000-0000 wrote:
> > > > > lebedev.ri wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > 0x8000-0000 wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > I am curious to know how true this is. You got some data for integer values and reported it, but I'm wondering if you've tried the same experiment with floating-point numbers?
> > > > > > > > The problem with the floating point numbers as text is: they need to be parsed both from the configuration and from the source code _then_ compared. What is an acceptable epsilon? I don't know. Is the same epsilon acceptable on all source code? I don't know.
> > > > > > > Yeah, I'm not too worried about the situations in which the epsilon matters. I'm more worried that we'll see a lot of 1.0, 2.0 floating-point literals where the floating-point value is a nice, round, easy-to-represent number but users have no way to disable this diagnostic short of `const float Two = 2.0f;`
> > > > > > Random thought: the types that are ignored should/could be configurable, i.e. there should be a switch
> > > > > > whether or not to complain about floats.
> > > > > Even though they might be nice and round... they should mean _something_ other than 'Two'.
> > > > > 
> > > > > The thing is... magic integers are used as buffer sizes, or to map things that are discrete in nature - number of legs of a typical mammal for instance. Not sure what magic numbers exist in nature besides pi and e and some fundamental physical constants )Avogadro's number, etc). But even there, it is better to use a symbolic constant.
> > > > Actually that is a _great_ idea, thank you!
> > > > The thing is... magic integers are used as buffer sizes, or to map things that are discrete in nature - number of legs of a typical mammal for instance. Not sure what magic numbers exist in nature besides pi and e and some fundamental physical constants )Avogadro's number, etc). But even there, it is better to use a symbolic constant.
> > > 
> > > That's my point -- I think there's a lot of uses of round floating-point values that are not magical numbers, they're sensible constants. Looking at LLVM's code base shows a *lot* of 1.0 and 2.0 values (hundreds of instances from a quick text-based search). No one should be forced to turn those into named constants. However, I've seen code using `1.02` and `.98` in places -- those seem like sensible things to make named constants because the values have semantically interesting meaning to the surrounding code.
> > > 
> > > > Random thought: the types that are ignored should/could be configurable, i.e. there should be a switch
> > > whether or not to complain about floats.
> > > 
> > > I think this would be a useful option, for sure (I used to work at a place that did a ton of floating-point math that would benefit from the integer side of this check but could never use the floating-point side of it). However, the presence of such an option doesn't give us a pass on coming up with a data-driven list of default values to ignore for the floating-point side. If we don't want to make that list configurable, I think that's something we can discuss (I think I'm fine with not making it a user-facing configuration option). But I think that `0.0` is insufficient.
> > Yep. I didn't mean for that flag to be a replacement for the ignore-list for fp constants.
> This opens up an entire can of worms that requires quite a bit of thought and invites significant amount of bikesheding. Is "2.00" as acceptable as "2.0"? Do we compare textually, or with regular expressions? Is 2.00000001 represented the same way on PowerPC and ARM as on Intel?
> 
> As this check is specified and implemented right now, people have the following options:
> * not use the check at all
> * use it to flag all literals
> * use it to flag only integral literals
> * use it to flag all integral literals not on a white list.
> 
> If somebody can come up with a more logical spec for how to filter out certain floating point values - they can definitely extend this check, without impacting users that are happy with the menu offered above.
> 
> Does this sound reasonable?
> This opens up an entire can of worms that requires quite a bit of thought and invites significant amount of bikesheding. Is "2.00" as acceptable as "2.0"? Do we compare textually, or with regular expressions? Is 2.00000001 represented the same way on PowerPC and ARM as on Intel?

I guess my mental model for this check was a bit easier than that. We don't care about 0x2 vs 2, for instance, so why would we care about 2.0 vs 2.00? It's a value-based check, so it checks the values of the literals, just like with integers. If we want to expose a configuration list, we can push those strings into an `APFloat` and get their value as well. APFloat handles the underlying floating-point model semantics, so I don't think this will be particularly problematic. Once we have some data on the common floating-point literal values we can worry about equality issues; my hunch is that they won't be problematic because the most common literals are likely to be small, whole numbers that are precisely represented.

> If somebody can come up with a more logical spec for how to filter out certain floating point values - they can definitely extend this check, without impacting users that are happy with the menu offered above.
> Does this sound reasonable?

No, because I believe this check, as is, has too high of a false positive rate due to the floating-point literal portion of the check, which is defaulted on. I'm also hesitant to release the check where the options for floating-point values are either: on and everything other than 0.0 must have a named constant, or off.

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D49114