[all-commits] [llvm/llvm-project] bc40b7: [clang-format] Correctly parse C99 digraphs: "<:", ...

Marek Kurdej via All-commits all-commits at lists.llvm.org
Wed Feb 2 01:25:35 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bc40b76b5b95837e27217de6a446eeeace695f34
      https://github.com/llvm/llvm-project/commit/bc40b76b5b95837e27217de6a446eeeace695f34
  Author: Marek Kurdej <marek.kurdej+llvm.org at gmail.com>
  Date:   2022-02-02 (Wed, 02 Feb 2022)

  Changed paths:
    M clang/lib/Format/Format.cpp
    M clang/unittests/Format/FormatTest.cpp

  Log Message:
  -----------
  [clang-format] Correctly parse C99 digraphs: "<:", ":>", "<%", "%>", "%:", "%:%:".

Fixes https://github.com/llvm/llvm-project/issues/31592.

This commits enables lexing of digraphs in C++11 and onwards.
Enabling them in C++03 is error-prone, as it would unconditionally treat sequences like "<:" as digraphs, even if they are followed by a single colon, e.g. "<::" would be treated as "[:" instead of "<" followed by "::". Lexing in C++11 doesn't have this problem as it looks ahead the following token.
The relevant excerpt from Lexer::LexTokenInternal:
```
        // C++0x [lex.pptoken]p3:
        //  Otherwise, if the next three characters are <:: and the subsequent
        //  character is neither : nor >, the < is treated as a preprocessor
        //  token by itself and not as the first character of the alternative
        //  token <:.
```

Also, note that both clang and gcc turn on digraphs by default (-fdigraphs), so clang-format should match this behaviour.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D118706




More information about the All-commits mailing list