[all-commits] [llvm/llvm-project] 355532: [Clang] Add a warning on invalid UTF-8 in comments.

cor3ntin via All-commits all-commits at lists.llvm.org
Sat Jul 9 02:26:59 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 355532a1499aa9b13a89fb5b5caaba2344d57cd7
      https://github.com/llvm/llvm-project/commit/355532a1499aa9b13a89fb5b5caaba2344d57cd7
  Author: Corentin Jabot <corentinjabot at gmail.com>
  Date:   2022-07-09 (Sat, 09 Jul 2022)

  Changed paths:
    M clang/docs/ReleaseNotes.rst
    M clang/include/clang/Basic/DiagnosticLexKinds.td
    M clang/lib/Lex/Lexer.cpp
    A clang/test/Lexer/comment-invalid-utf8.c
    A clang/test/Lexer/comment-utf8.c
    M clang/test/SemaCXX/static-assert.cpp
    M llvm/include/llvm/Support/ConvertUTF.h
    M llvm/lib/Support/ConvertUTF.cpp

  Log Message:
  -----------
  [Clang] Add a warning on invalid UTF-8 in comments.

Introduce an off-by-default `-Winvalid-utf8` warning
that detects invalid UTF-8 code unit sequences in comments.
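
As a rough illustration of what "invalid UTF-8 code unit sequence" means, the
sketch below checks a byte string against the well-formed UTF-8 byte ranges
(rejecting overlong forms, surrogates, values above U+10FFFF, stray
continuation bytes, and truncated sequences). It is not the code from this
commit: the function names `utf8SequenceLength` and `firstInvalidUtf8` are
hypothetical, and the actual change lives in Lexer.cpp and ConvertUTF.cpp.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not Clang's API): returns the length (1-4) of the
// well-formed UTF-8 sequence starting at s[i], or 0 if the bytes there do
// not form one.
std::size_t utf8SequenceLength(std::string_view s, std::size_t i) {
  // Read byte k as an unsigned value; 0x100 marks "past the end" and
  // fails every range check below.
  auto byte = [&](std::size_t k) -> unsigned {
    return k < s.size() ? static_cast<unsigned char>(s[k]) : 0x100u;
  };

  unsigned b0 = byte(i);
  if (b0 <= 0x7F)
    return 1; // ASCII

  std::size_t len = 0;
  unsigned lo = 0x80, hi = 0xBF; // allowed range for the second byte
  if (b0 >= 0xC2 && b0 <= 0xDF) {
    len = 2; // 0xC0 and 0xC1 would be overlong encodings
  } else if (b0 >= 0xE0 && b0 <= 0xEF) {
    len = 3;
    if (b0 == 0xE0) lo = 0xA0; // exclude overlong 3-byte forms
    if (b0 == 0xED) hi = 0x9F; // exclude UTF-16 surrogates
  } else if (b0 >= 0xF0 && b0 <= 0xF4) {
    len = 4;
    if (b0 == 0xF0) lo = 0x90; // exclude overlong 4-byte forms
    if (b0 == 0xF4) hi = 0x8F; // exclude values above U+10FFFF
  } else {
    return 0; // stray continuation byte or invalid lead byte
  }

  unsigned b1 = byte(i + 1);
  if (b1 < lo || b1 > hi)
    return 0;
  for (std::size_t k = 2; k < len; ++k) {
    unsigned b = byte(i + k);
    if (b < 0x80 || b > 0xBF)
      return 0;
  }
  return len;
}

// Hypothetical helper: offset of the first ill-formed byte in s, or
// std::string_view::npos if the whole string is well-formed UTF-8.
std::size_t firstInvalidUtf8(std::string_view s) {
  for (std::size_t i = 0; i < s.size();) {
    std::size_t n = utf8SequenceLength(s, i);
    if (n == 0)
      return i;
    i += n;
  }
  return std::string_view::npos;
}
```

For example, `firstInvalidUtf8("// \xC3(")` reports offset 3, because the lead
byte 0xC3 is not followed by a continuation byte.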

Invalid UTF-8 elsewhere in a source file is already diagnosed,
because it cannot appear in identifiers or other grammar constructs.

The warning is off by default as it's likely to be somewhat disruptive
otherwise.

This warning allows Clang to conform to the yet-to-be-approved WG21
paper "P2295R5 Support for UTF-8 as a portable source file encoding".

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059



