[all-commits] [llvm/llvm-project] d46fa0: [clang-format] SortIncludes should support "@impor...

Konrad Kleine via All-commits all-commits at lists.llvm.org
Wed Apr 20 00:22:09 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: d46fa023caa2db5a9f1e21dd038bcb626261d958
  Author: Konrad Kleine <kkleine at redhat.com>
  Date:   2022-04-20 (Wed, 20 Apr 2022)

  Changed paths:
    M clang/include/clang/Tooling/Inclusions/HeaderIncludes.h
    M clang/lib/Format/Format.cpp
    M clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
    M clang/unittests/Format/SortIncludesTest.cpp

  Log Message:
  [clang-format] SortIncludes should support "@import" lines in Objective-C

Fixes [[ https://github.com/llvm/llvm-project/issues/38995 | #38995 ]]

This is an attempt to modify the regular expression to identify
`@import` and `import` alongside the regular `#include`. The challenging
part was not to support `@` in addition to `#` but how to handle
everything that comes after the `include|import` keywords. Previously
everything that wasn't `"` or `<` was consumed. But as you can see in
this example from issue #38995, there is no `"` or `<` following the
keyword:

@import Foundation;

I experimented with a lot of fancy and useful expressions in [this
online regex tool](https://regex101.com) only to find out that some
things are simply not supported by the regex implementation in LLVM.

 * For example, the beginning `[\t\ ]*` should be replaceable by the
   horizontal whitespace character `\h*` but this will break the
   `SortIncludesTest.LeadingWhitespace` test.

That's why I've chosen to come back to the basic building blocks.

The essential change in this patch is the change from this regular
expression:

^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">])
        ~                              ~~~~~~~~~~~~~~
        ^                              ^
        |                              |
        only support # prefix not @    |
                                       only supports "" and <> as
                                       delimiters; no support for C++
                                       modules and ; ending. Also this
                                       allows for "> or <" or "" or <>,
                                       which all seem either off or wrong.

to this:

^[\t\ ]*[@#][\t\ ]*(import|include)([^"]*("[^"]+")|[^<]*(<[^>]+>)|[\t\
        ~~~~                        ~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~
        ^                                 ^           ^       ^       ^
        |                                 |           |       |       |
        Now supports @ and #.           Clearly supports "" and <> as well
                                        as an include name without
                                        enclosing characters. Allows for
                                        no mixture of "> or <" or empty
                                        include names.
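
To make the difference between the two patterns concrete, here is a small sketch in Python. It is only an illustration: Python's `re` module is not the regex engine used in LLVM, so behavior may differ in corner cases. The old pattern is copied from above. The new pattern is cut off in this message after `[\t\`, so the tail used below (` ]*([^;]+;))`) is an assumption inferred from the annotations, not a quote from the patch. The helper name `matches` is made up for this example.

```python
import re

# Old pattern, as quoted in the commit message.
OLD = re.compile(r'^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">])')

# New pattern. ASSUMPTION: everything after `[\t\` is reconstructed from
# the annotations (a third alternative for names ending in `;`).
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"]*("[^"]+")|[^<]*(<[^>]+>)|[\t\ ]*([^;]+;))'
)


def matches(pattern, line):
    """Return True if the pattern matches at the start of the line."""
    return pattern.match(line) is not None


samples = [
    '#include <vector>',    # plain angle-bracket include
    '#include "a.h"',       # plain quoted include
    '@import Foundation;',  # Objective-C module import
    '#include <a"',         # mixed delimiters
    '#include <>',          # empty include name
]

for line in samples:
    print(f"{line!r:24} old={matches(OLD, line)} new={matches(NEW, line)}")
```

Under these assumptions, the old pattern accepts the mixed-delimiter and empty-name lines and rejects `@import Foundation;`, while the new pattern does the opposite.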


Here is how I've tested this patch:

ninja clang-format
ninja FormatTests

And if that worked, I double-checked that nothing else broke by running
all format checks:


One side effect of this change is that it should partially support
[C++20 Module](https://en.cppreference.com/w/cpp/language/modules)
`import` lines without the optional `export` in front. Adding
this can be a change on its own that shouldn't be too hard. I say
partially because the `@` or `#` are currently *NOT* optional in the
regular expression.

I see an opportunity to optimize the matching to exclude `@include`, for
example. But eventually these should be caught by the compiler, so...
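
The two caveats above (the prefix is mandatory, and `@include` is accepted) can be checked with a short Python sketch. As before, Python's `re` stands in for LLVM's regex engine purely for illustration, the tail of the pattern after `[\t\` is an assumption because it is cut off in this message, and `is_recognized` is a made-up helper name.

```python
import re

# ASSUMPTION: the part after `[\t\` is reconstructed, not quoted.
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"]*("[^"]+")|[^<]*(<[^>]+>)|[\t\ ]*([^;]+;))'
)


def is_recognized(line):
    """Return True if the line would be treated as an include/import."""
    return NEW.match(line) is not None


# A bare C++20 module import has no `@` or `#`, so it is not recognized.
print(is_recognized('import foo;'))
print(is_recognized('export import foo;'))

# The prefix and keyword are matched independently, so `@include` passes.
print(is_recognized('@include <x.h>'))
```

The first two calls return `False` and the last returns `True`, which is the "partial" support and the `@include` looseness described above.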

With my change, the matching group is no longer at a fixed position. I
decided to choose the last match (group) that is not empty.
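
A minimal Python sketch of the "last non-empty group" idea follows. It is not the code from `Format.cpp` or `HeaderIncludes.cpp`; the function name `included_name` is hypothetical, Python's `re` is used only for illustration, and the tail of the pattern after `[\t\` is an assumption because it is cut off in this message.

```python
import re

# ASSUMPTION: the part after `[\t\` is reconstructed, not quoted.
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"]*("[^"]+")|[^<]*(<[^>]+>)|[\t\ ]*([^;]+;))'
)


def included_name(line):
    """Return the last non-empty capture group, or None if no match.

    Only one of the three alternatives can participate in a match, so
    scanning the groups from the end finds the one holding the name.
    """
    m = NEW.match(line)
    if m is None:
        return None
    for group in reversed(m.groups()):
        if group:  # skips both None (unused group) and empty strings
            return group
    return None


for line in ['#include <vector>', '  #  include "a.h"', '@import Foundation;']:
    print(repr(line), '->', included_name(line))
```

With this pattern the name comes back including its delimiters or trailing `;`, for example `<vector>`, `"a.h"`, and `Foundation;`.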

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D121370
