[PATCH] D156518: Fix handling of medial hyphens in Unicode Names.

Corentin Jabot via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jul 28 02:28:48 PDT 2023


cor3ntin created this revision.
Herald added subscribers: mstorsjo, hiraditya, dschuff.
Herald added a project: All.
cor3ntin requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

If a Unicode name was stored in a way that caused
a medial hyphen to be at the end of a chunk, the hyphen would not
be properly ignored by the loose matching algorithm.

For example, if `LEFT-TO-RIGHT OVERRIDE` was stored as
`LEFT-` [...], the `-` would not be ignored.

The generator now ensures nodes are not cut across
medial hyphen boundaries.
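
The chunk-boundary problem can be sketched with a small standalone
model. This is not the code in `UnicodeNameToCodepoint.cpp`; the helper
names (`isMedialHyphen`, `looseKey`, `looseEquals`) are hypothetical, and
splitting the name into `LEFT-` plus the remainder is only an assumed
layout for illustration. The sketch treats a hyphen as medial when it has
an alphanumeric character directly on both sides, and it omits the
special case that UAX44-LM2 makes for HANGUL JUNGSEONG O-E.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: true if Name[I] is a hyphen with an alphanumeric
// character directly on both sides. A hyphen at the very start or end of
// the view cannot be classified as medial, because one neighbour is missing.
static bool isMedialHyphen(std::string_view Name, std::size_t I) {
  if (Name[I] != '-' || I == 0 || I + 1 >= Name.size())
    return false;
  return std::isalnum(static_cast<unsigned char>(Name[I - 1])) &&
         std::isalnum(static_cast<unsigned char>(Name[I + 1]));
}

// Simplified loose-matching key: drops spaces, underscores and medial
// hyphens, and upper-cases everything else.
std::string looseKey(std::string_view Name) {
  std::string Key;
  for (std::size_t I = 0; I < Name.size(); ++I) {
    char C = Name[I];
    if (C == ' ' || C == '_' || isMedialHyphen(Name, I))
      continue;
    Key.push_back(
        static_cast<char>(std::toupper(static_cast<unsigned char>(C))));
  }
  return Key;
}

// Two names match loosely when their keys are identical.
bool looseEquals(std::string_view A, std::string_view B) {
  return looseKey(A) == looseKey(B);
}
```

In this model, `looseEquals("LEFT-TO-RIGHT OVERRIDE", "left to right override")`
holds when the whole name is visible. If the name is processed in pieces
and one piece ends in `LEFT-`, the trailing hyphen has no right-hand
neighbour within that piece, so `looseKey("LEFT-")` keeps the `-` and the
concatenated key no longer matches. Keeping a medial hyphen and its
neighbours in the same node avoids that situation.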

Fixes #64161


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D156518

Files:
  clang/docs/ReleaseNotes.rst
  clang/lib/Lex/LiteralSupport.cpp
  clang/test/Preprocessor/ucn-pp-identifier.c
  llvm/lib/Support/UnicodeNameToCodepoint.cpp
  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
  llvm/unittests/Support/UnicodeTest.cpp
  llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp


