[PATCH] D23135: [clang-tidy] misc-argument-comment non-strict mode

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 4 07:47:33 PDT 2016


On Thu, Aug 4, 2016 at 10:45 AM, Alexander Kornienko <alexfh at google.com> wrote:
> alexfh added inline comments.
>
> ================
> Comment at: clang-tidy/misc/ArgumentCommentCheck.cpp:124
> @@ +123,3 @@
> +  InDecl = InDecl.trim('_');
> +  return InComment.compare_lower(InDecl) == 0;
> +}
> ----------------
> aaron.ballman wrote:
>> alexfh wrote:
>> > aaron.ballman wrote:
> >> > > Correct, which means this won't behave properly in some locales with UTF-8 identifiers. Consider Turkish, where İ (U+0130 “Latin Capital Letter I With Dot Above”) is the uppercase form of i, and I is the uppercase form of ı (U+0131 “Latin Small Letter Dotless I”). If the comment contains one version while the identifier contains the other, the comparison will currently fail, while a locale-aware comparison would succeed. You run into similar things with SS vs ß in German as well, where the uppercase form is two characters while the lowercase is only a single character.
>> > Interesting, though it looks like there's now an official capital ẞ https://en.wikipedia.org/wiki/Capital_%E1%BA%9E (which is not frequently needed anyway, I guess).
>> >
> >> > At the end of the day, what we get is that the non-strict mode is currently somewhat stricter for non-ASCII characters. The same will happen with all other parts of LLVM that rely on `StringRef::compare_lower`. I don't think we need a separate test for this _here_, since it's a problem on a completely different level. And I guess the use of non-ASCII identifiers in C++ will cause much more serious problems than a slightly stricter clang-tidy warning ;]
>> We may just have different testing philosophies -- I would advocate for a test because we know of a use case that's broken with this particular use of `compare_lower`. Not all uses of `compare_lower` are problematic, after all. However, I'm not going to fight for that test case too hard because this is hopefully an edge case that is low-impact. A FIXME would also suffice.
> I'm reluctant to add a test case, since the cost of making it work and maintaining it on both Linux and Windows is higher than its value, IMO (that's my takeaway from writing clang-format's limited support for Unicode).

I am totally okay with that line of reasoning. I was mostly looking
for some marker that says "if this acts funky, it's expected, not
accidental." The FIXME scratches that itch for me, so thank you!
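
For anyone reading along later, here is a rough standalone sketch of the
"funky" behavior the FIXME refers to. It does not use LLVM at all:
`trimUnderscores`, `asciiToLower` and `sameNameNonStrict` are hypothetical
stand-ins for `StringRef::trim('_')` and `StringRef::compare_lower`, and it
assumes the comment side is trimmed the same way as the declaration side,
which the quoted hunk does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASCII-only lowercasing: anything outside 'A'..'Z' passes through
// unchanged, including every byte of a multi-byte UTF-8 sequence.
inline char asciiToLower(char C) {
  return (C >= 'A' && C <= 'Z') ? static_cast<char>(C - 'A' + 'a') : C;
}

// Hypothetical stand-in for StringRef::trim('_'): drops leading and
// trailing underscores.
inline std::string trimUnderscores(const std::string &S) {
  const std::size_t First = S.find_first_not_of('_');
  if (First == std::string::npos)
    return std::string();
  const std::size_t Last = S.find_last_not_of('_');
  return S.substr(First, Last - First + 1);
}

// Hypothetical stand-in for the non-strict name comparison: trim both
// sides (an assumption for the comment side), then compare byte by byte,
// ignoring ASCII case only.
inline bool sameNameNonStrict(const std::string &InComment,
                              const std::string &InDecl) {
  const std::string A = trimUnderscores(InComment);
  const std::string B = trimUnderscores(InDecl);
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (asciiToLower(A[I]) != asciiToLower(B[I]))
      return false;
  return true;
}

// A few spot checks; string literals are spelled as UTF-8 bytes.
inline void checkExamples() {
  assert(sameNameNonStrict("Foo_", "foo"));      // ASCII case + underscore
  assert(!sameNameNonStrict("\xC4\xB0", "i"));   // U+0130 vs. 'i'
  assert(!sameNameNonStrict("SS", "\xC3\x9F"));  // "SS" vs. U+00DF
}
```

The last two checks are the cases from the review thread: a byte-wise,
ASCII-only comparison treats those spellings as different names, so the
non-strict mode stays strict for them.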

~Aaron

>
>
> https://reviews.llvm.org/D23135
>
>
>

