[PATCH] D86671: [clang-tidy] Add new case type to check variables with Hungarian notation

Douglas Chen via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 30 09:57:07 PDT 2020


dougpuob added a comment.

In D86671#2293128 <https://reviews.llvm.org/D86671#2293128>, @aaron.ballman wrote:

> In D86671#2284078 <https://reviews.llvm.org/D86671#2284078>, @dougpuob wrote:
>
>> Hi @aaron.ballman
>> About changing  `size_t nLength` to `cbLength`. I searched MSVC folders with `size_t`, many names of variable start with `n`, or `i` in MFC related files. So I prefer to keep it starts with `n`. Another side to name starts with `cb`, I found variables like cbXxx are defined with ULONG, DWORD, or UCHAR type.
>
> I think the important point is that `cb` is used for APIs that specify a count of bytes, whereas `i`, `n`, and `l` are used for integers (there is no specific prefix for `size_t` that I'm aware of or that MSDN documents, so the lack of consistency there is not too surprising). That's why I mentioned that using a heuristic approach may allow you to identify the parameters that are intended to be a count of bytes so that you can use the more specific prefix if there's more specific contextual information available.

Hi @aaron.ballman

1. **Making `sz` as a prefix to the `char*` parameters of `strncpy()` and `strncat()` functions.** I read the code of `strncpy` function, seems null-terminated to parameters is essential. I thought those parameters to strXxx() are also able to the parameter of `strlen` function knowing the length of a C string(it can be passed to be a parameter or assigned to be a variable, those point to the identical memory. strXxx functions tell the end by null character in the memory). Moreover, I searched source code of OpenSSL, I found the project uses most of `char*` as C string, and data buffer to retrieve data with `unsigned char*`.  `unsigned char` is a primitive type in C89 (is portable than `uint8_t`). Maybe current mechanism is a good way to hint users that `unsigned char*` is more explicit than `char*` if they want to treat it as a buffer to retrieve data.

2. **Adding heuristics to pick the correct prefix.** Do you mean that use the heuristics approach to give naming suggestion as warning message, or correct it with the `--fix` option? Is there any existing in this project can be a sample for me? If my thought about the heuristics is correct. The information of parameters to functions I can query its **type**, **name**, and **location**. As we have discussed that the declaration type is insufficient to tell `sz` for `char*`, or `cb` for count of bytes instead `i`, `n`, or `l`, so the **name** and **location** provide more information for comparing names with string mapping tables and location relationship between parameters/variables. Take `FILE * fopen ( const char * filename, const char * mode );` for example(C String). It will change `filename` to `szFilename` if the `name` string is in the mapping table, change `mode` to `pcMode` if `mode` string is not in the mapping table. Take `size_t fread ( void * ptr, size_t size, size_t count, FILE * stream );` for example(count of bytes). It will change `size` to `cbSize`, because its previous parameter is a void pointer.

  I can smell that there is always exceptional cases, and not good for maintainability.

3. **Changing `i`, `n`, and `l` to `cb` for APIs that specify a count of bytes.** I have no idea how to distinguish when should add the `cb` instead of integer types for heuristics. If I was the user, I wish clang-tidy don't change integer types from `cbXxx` to `iCbXxx` or `nCbXxx`. Keep `cb` with default or an option in config file, like `IgnoreParameterPrefix`, `IgnoreVariablePrefix`.

4. **Why not use Decl->getType()->getAsString()** I printed the differences as a log for your reference. (https://gist.github.com/dougpuob/e5a76db6e2c581ba003afec3236ab6ac#file-clang-tidy-readability-identifier-naming-hungarian-notation-log-L191) Previous I mentioned "if users haven't specific the include directories, the checking result may look messy". This concern will not happened because the `clang-diagnostic-error`, if users didn't specific header directories, clang-tidy will show the error (L414 <https://gist.github.com/dougpuob/e5a76db6e2c581ba003afec3236ab6ac#file-clang-tidy-readability-identifier-naming-hungarian-notation-log-L416>).




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86671/new/

https://reviews.llvm.org/D86671



More information about the cfe-commits mailing list