[PATCH] D112916: Confusable identifiers detection

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 18 09:59:30 PST 2022


aaron.ballman added a comment.

In D112916#3331592 <https://reviews.llvm.org/D112916#3331592>, @cor3ntin wrote:

> In D112916#3314184 <https://reviews.llvm.org/D112916#3314184>, @cor3ntin wrote:
>
>> @aaron.ballman Thanks for the ping.
>>
>> One one hand, I agree with you, on the other hand, this tries to stick to TR39, and I think we should stick with that. It might be worth checking with the Unicode consortium what they think of i/l as confusable.
>
> Actually @aaron.ballman, Unicode does consider these confusables already
>
> from https://www.unicode.org/Public/security/14.0.0/confusables.txt
>
>   0031 ;	006C ;	MA	# ( 1 → l ) DIGIT ONE → LATIN SMALL LETTER L	# 
>   0049 ;	006C ;	MA	# ( I → l ) LATIN CAPITAL LETTER I → LATIN SMALL LETTER L	# 
>   0030 ;	004F ;	MA	# ( 0 → O ) DIGIT ZERO → LATIN CAPITAL LETTER O	# 
>
> So ASCII is already taken care of. No issue here :)

Oh, so it is, good catch! I looked for it before, but missed it.

FWIW, I agree with sticking with what Unicode defines here. (I am worried about the results in practice and whether enough people will use this check to warrant its inclusion in clang-tidy, but not enough to block this patch.)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112916/new/

https://reviews.llvm.org/D112916



More information about the cfe-commits mailing list