[all-commits] [llvm/llvm-project] 232cc4: [pseudo] Only expand UCNs for raw_identifiers

Sam McCall via All-commits all-commits at lists.llvm.org
Thu May 5 23:54:12 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 232cc446ff7b39a625ddae8a6a8beb3ed2bfb557
  Author: Sam McCall <sam.mccall at gmail.com>
  Date:   2022-05-06 (Fri, 06 May 2022)

  Changed paths:
    M clang-tools-extra/pseudo/include/clang-pseudo/Token.h
    M clang-tools-extra/pseudo/lib/Lex.cpp
    A clang-tools-extra/pseudo/test/crash/backslashes.c
    M clang-tools-extra/pseudo/tool/ClangPseudo.cpp

  Log Message:
  [pseudo] Only expand UCNs for raw_identifiers

It turns out clang::expandUCNs only works on tokens that contain valid UCNs
and no other random escapes, and clang only uses it on raw_identifiers.

Currently we can hit an assertion by creating tokens with stray non-valid-UCN
backslashes in them.

Fortunately, expanding UCNs in raw_identifiers is actually all we need.
Most tokens (keywords, punctuation) can't have them. UCNs in literals can be
treated as escape sequences like \n, even though this isn't the standard's
interpretation. This more or less matches how clang works.
(See https://isocpp.org/files/papers/P2194R0.pdf which points out that the
standard's description of how UCNs work is misaligned with real implementations.)
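
As a rough illustration of the rule described above, the following is a minimal
standalone sketch, not the code from this commit. The names Kind, Tok,
decodeUCNs and cookedText are hypothetical, and decodeUCNs is a simplified
stand-in for clang::expandUCNs that leaves malformed escapes untouched. It is
only meant to show that UCN expansion is applied to raw_identifier tokens and
skipped for every other token kind.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical token kinds; the real code uses clang's token kinds.
enum class Kind { RawIdentifier, StringLiteral, Punctuation };

// Hypothetical token type holding the kind and the spelled text.
struct Tok {
  Kind K;
  std::string Text;
};

// Returns the value of a hex digit, or -1 if Ch is not a hex digit.
int hexValue(char Ch) {
  if (Ch >= '0' && Ch <= '9') return Ch - '0';
  if (Ch >= 'a' && Ch <= 'f') return Ch - 'a' + 10;
  if (Ch >= 'A' && Ch <= 'F') return Ch - 'A' + 10;
  return -1;
}

// Appends the UTF-8 encoding of code point C to Out.
void appendUTF8(std::uint32_t C, std::string &Out) {
  if (C < 0x80) {
    Out += static_cast<char>(C);
  } else if (C < 0x800) {
    Out += static_cast<char>(0xC0 | (C >> 6));
    Out += static_cast<char>(0x80 | (C & 0x3F));
  } else if (C < 0x10000) {
    Out += static_cast<char>(0xE0 | (C >> 12));
    Out += static_cast<char>(0x80 | ((C >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (C & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (C >> 18));
    Out += static_cast<char>(0x80 | ((C >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((C >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (C & 0x3F));
  }
}

// Simplified stand-in for UCN expansion (assumption: not clang's algorithm).
// Decodes \uXXXX and \UXXXXXXXX; any other backslash is copied through as-is.
std::string decodeUCNs(std::string_view In) {
  std::string Out;
  for (std::size_t I = 0; I < In.size();) {
    if (In[I] == '\\' && I + 1 < In.size() &&
        (In[I + 1] == 'u' || In[I + 1] == 'U')) {
      std::size_t Digits = In[I + 1] == 'u' ? 4 : 8;
      if (I + 2 + Digits <= In.size()) {
        std::uint32_t C = 0;
        bool OK = true;
        for (std::size_t J = 0; J < Digits; ++J) {
          int V = hexValue(In[I + 2 + J]);
          if (V < 0) { OK = false; break; }
          C = C * 16 + static_cast<std::uint32_t>(V);
        }
        // Reject values outside Unicode and the surrogate range.
        if (OK && C <= 0x10FFFF && !(C >= 0xD800 && C <= 0xDFFF)) {
          appendUTF8(C, Out);
          I += 2 + Digits;
          continue;
        }
      }
    }
    Out += In[I++];
  }
  return Out;
}

// Only raw identifiers get their UCNs expanded; other tokens keep their
// spelling, so stray backslashes in literals are never inspected.
std::string cookedText(const Tok &T) {
  if (T.K != Kind::RawIdentifier)
    return T.Text;
  return decodeUCNs(T.Text);
}
```

With this sketch, a raw identifier spelled caf\u00e9 comes back as UTF-8 "café",
while a string literal containing \q or \u00e9 is returned byte-for-byte.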

Differential Revision: https://reviews.llvm.org/D125049
