[all-commits] [llvm/llvm-project] d46fa0: [clang-format] SortIncludes should support "@impor...
Konrad Kleine via All-commits
all-commits at lists.llvm.org
Wed Apr 20 00:22:09 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: d46fa023caa2db5a9f1e21dd038bcb626261d958
https://github.com/llvm/llvm-project/commit/d46fa023caa2db5a9f1e21dd038bcb626261d958
Author: Konrad Kleine <kkleine at redhat.com>
Date: 2022-04-20 (Wed, 20 Apr 2022)
Changed paths:
M clang/include/clang/Tooling/Inclusions/HeaderIncludes.h
M clang/lib/Format/Format.cpp
M clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
M clang/unittests/Format/SortIncludesTest.cpp
Log Message:
-----------
[clang-format] SortIncludes should support "@import" lines in Objective-C
Fixes [[ https://github.com/llvm/llvm-project/issues/38995 | #38995 ]]
This is an attempt to modify the regular expression to identify
`@import` and `import` alongside the regular `#include`. The challenging
part was not to support `@` in addition to `#` but how to handle
everything that comes after the `include|import` keywords. Previously
everything that wasn't `"` or `<` was consumed. But as you can see in
this example from the issue #38995, there is no `"` or `<` following the
keyword:
```
@import Foundation;
```
I experimented with a lot of fancy and useful expressions in [this
online regex tool](https://regex101.com) only to find out that some
things are simply not supported by the regex implementation in LLVM.
* For example the beginning `[\t\ ]*` should be replacable by the
horizontal whitespace character `\h*` but this will break the
`SortIncludesTest.LeadingWhitespace` test.
That's why I've chosen to come back to the basic building blocks.
The essential change in this patch is the change from this regular
expression:
```
^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">])
~ ~~~~~~~~~~~~~~
^ ^
| |
only support # prefix not @ |
only support "" and <> as
delimiters
no support for C++ modules and ;
ending. Also this allows for ">
or <" or "" or <> which all seems
either off or wrong.
```
to this:
```
^[\t\ ]*[@#][\t\ ]*(import|include)([^"]*("[^"]+")|[^<]*(<[^>]+>)|[\t\
]*([^;]+;))
~~~~ ~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~
~~~~~~~~~~~~~~
^ ^ ^ ^ ^
| | | | |
Now support @ and #. Clearly support "" and <> as
well as an
include name without enclosing
characters.
Allows for no mixture of "> or
<" or
empty include names.
```
Here is how I've tested this patch:
```
ninja clang-Format
ninja FormatTests
./tools/clang/unittests/Format/FormatTests
--gtest_filter=SortIncludesTest*
```
And if that worked I doubled checked that nothing else broke by running
all format checks:
```
./tools/clang/unittests/Format/FormatTests
```
One side effect of this change is it should partially support
[C++20 Module](https://en.cppreference.com/w/cpp/language/modules)
`import` lines without the optional `export` in front. Adding
this can be a change on its own that shouldn't be too hard. I say
partially because the `@` or `#` are currently *NOT* optional in the
regular expression.
I see an opportunity to optimized the matching to exclude `@include` for
example. But eventually these should be caught by the compiler, so...
With my change, the matching group is not at a fixed position any
longer. I decided to
choose the last match (group) that is not empty.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D121370
More information about the All-commits
mailing list