[PATCH] D117522: [clang-tidy] Add modernize-macro-to-enum check

Richard via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 2 08:43:23 PST 2022


LegalizeAdulthood added a comment.

I think it helps to understand this check better if we consider what
we would do manually if we were modernizing code with a lot of
macro-style enums in it.  I've done this myself; in fact all the checks
I've written for clang-tidy have been motivated by modernizing dusty
deck C code that was converted to C++.

The motivating example is on github <https://github.com/LegalizeAdulthood/iterated-dynamics> and you can find my manual
refactorings in the commit history.

A simple behavior preserving first pass is to hoist all integral constant
macros to anonymous unscoped enums.  A second pass is to examine
related anonymous unscoped enums and hoist them into named unscoped
enums.  If they are used only as discriminating values, e.g. not used in
arithmetic expressions, then they can be hoisted to a named scoped
enum and you can "lean on the compiler" to adjust the relevant
declarations of variables and function parameters.

This check implements the first pass.  My intention is to create another
check that assists in the "hoist a named unscoped enum to a named
scoped enum" which a) removes any common enumeration prefix from
all the enumerations (since they are now uniquely identified by scope),
b) adds the necessary scope qualifiers to all uses of the enumerations,
c) propagates the necessary declaration changes to all variables and
parameters.  Obviously such a check involves quite a bit of analysis
of all the usages of an enumeration and may not even be feasible in
the context of clang-tidy since it can only examine a single translation
unit at a time, but this is the goal.

Note that this still leaves humans in the middle deciding which clumps
of anonymous unscoped enums represent semantically related items
and giving them an appropriate name.

In D117522#3290604 <https://reviews.llvm.org/D117522#3290604>, @carlosgalvezp wrote:

> Since this is a `modernize` check, shouldn't it be proposing to use `enum class` instead of `enum`?

One step at a time `:)`.  In short, the answer is no because of the
way preprocessor macros interact with the rest of the code.  While
it is safe to hoist the macro into an unscoped enum without analyzing
all the macro expansion sites, the same can't be said for a scoped
enum.

First, we would have to determine a name for the scoped enum.  This
may not be possible, depending on the macros.  For instance some
macros are really just named constants, not enumerations, but we
can't tell from the macro itself.  (There are heuristics you can apply,
and I have done some of that in this check, but the only way to know
for certain whether a macro is a named constant or intended to be
a set of related names is manual inspection.)

Second, this check can apply to C source files as well as C++ source
files.  C doesn't have scoped enums, but it does have enums and they
can be anonymous.

> This will conflict with `Enum.3` from `cppcoreguidelines`,

You're correct that both this check and `cppcoreguidelines-macro-usage`
will make suggestions for the same macro.  Again, the problem with
a macro is that it's hard to identify which is an enum and which is a
named constant.

If I have a block of macros that represent enumerations, I'm not going
to be happy with `cppcoreguidelines-macro-usage` suggesting that
they all be turned into `constexpr` declarations.

If I have a block of macros that represent named constants, I'm not
going to be happy with `modernize-macro-to-enum` suggesting that
they all be turned into anonymous `enum` declarations.

I don't think there's any clear winner over which should be chosen by
an automated tool.

If it were me, I'd apply modernize-macro-to-enum first and then
cppcoreguidelines-macro-usage.  Why?  Because I'd then have enums
to scoop up all the integral constants, which is a more specific language
construct, and then I'd have `constexpr` constants for the macros that
expand to floating-point and string values.  I'd still be left with function-
like macros that macro-usage warned me about and didn't fix.

> Or at least make the default use `enum class` and leave it as an
> option to use unscoped enums to be able to apply the fix.

An additional problem with scoped enums, besides having to invent
a name for it, is that you have to analyze all the expansion sites of the
defined macros in order to determine if the macro expands into a value
that is used in an expression.

Consider this code:

  #define FOO_ARG1 0x1
  #define FOO_ARG2 0x2
  #define FOO_ARG3 0x3
  
  void f()
  {
    int options = FOO_ARG1 | FOO_ARG3;
    g(options);
  }

If I replace the macros with a scoped enum, now I've created a compilation
error at the line initializing `options` and I've created a compilation error
at the line calling `g()`.  An anonymous unscoped enum doesn't suffer from
these problems.

> Still I find it a bit inconsistent with the rest of the `modernize` module
> (modernize as in "use post-C++11 stuff").

There is a lot of legacy C-isms in C++ code and not all modernization
transformations imply a shift to C++11.  For instance, another modernizing
checker that I wrote -- modernize-redundant-void-arg -- applies to C++98
code.

> Another point is that macros typically have a different naming convention
> than enums or constants, so users will likely have to do manual work anyway,
> or rely on the "readability-identifier-naming" check - how does it play out
> then if 2 checks propose different fixes?

The difficulty of naming is why this check doesn't propose names.  They have
to be chosen by humans.  The readability check imposes a naming convention
on the identifiers but it doesn't impute semantics to those identifiers, it simply
checks that they follow a certain lexicographical convention.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117522/new/

https://reviews.llvm.org/D117522



More information about the cfe-commits mailing list