[llvm-dev] [cfe-dev] [RFC] Adding support for clang-format making further code modifying changes

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 9 12:54:40 PDT 2021


+some of the folks involved in the early development of clang-format who
might have more context on those design principles/choices made back then
(& perhaps on the include reordering change/tradeoff/choices made there)

Anyone got a link to the include reordering change/design discussion for
it? Might be informative to see if there was precedent/philosophy clearly
stated about how that was acceptable and what principles were applied &
could be reapplied to more recent proposals.

I tend towards the status quo (in general - that's just my personality) and
tend to prefer the fairly firm line clang-format currently has. But I don't
have a great rationale for "why include reordering but not const reordering
or dropping braces, etc" - marginally: Changing C++ syntax is more
complicated than reordering headers. Despite the fact that reordering
headers can have a big impact/breakage, it's hopefully fairly clear if/when
that happens that headers should be made order-agnostic. The subtleties of
what might happen when adding/removing braces if the heuristic
quick-parsing of clang-format fail seem likely to be more unfortunate, to
me at least. But maybe we're at a point where clang-format is widely
adopted/tested enough that that's less likely to have subtle mis-parse
breakages for terribly long? Not sure.

- Dave

On Mon, Aug 9, 2021 at 12:36 PM MyDeveloper Day via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi all,
>
> As a frequent user and maintainer of clang-format I would like to
> formalize a change to the "charter" of what clang-format can/could/should do
>
> Motivation
> ==========
>
> As you all know clang-format parses source code and can manipulate the
> tokens to produce consistently formatted code (most of the time), its
> become ubiquitous in the industry and its integration into
> popular editors & IDEs such as vim/visual studio/code mean it very simple
> for users to get setup and running producing good looking code.
>
> clang-format does not use semantic information, and as such it doesn't
> need includes, defines or compiler flags to interpret code. Clang-format is
> generally guessing that certain sequences of tokens from the lexer
> represent certain patterns. It's a good guess but it gets it wrong from
> time to time, hence trading correctness for performance.
>
> Because of this clang-format is fast (not maybe as fast as we'd like) but
> fast enough to be part of in a "save" operation such that the code is
> formatted as the ide saves your work.
>
> Mostly clang-format manipulates only whitespace, but over the years there
> have been a number of extremely useful features that have broken this rule,
> namely include sorting, namespace commenting to name a few.
>
> The usage scenario of clang-format has also changed slightly to include a
> non modifying advisory role identifying clang-format violations (as in our
> own llvm-premerge checks), which can greatly aid the code review process by
> removing the need to constantly ask for the code to be formatted correctly
> or follow the LLVM convention.
>
> Recently a number of new features have been proposed that would also alter
> code, insertion of braces, east/west const conversion that can be performed
> at "save" speeds.
>
> As no semantic information is present, this raises the question as to
> whether clang-format could actually break your code.
> This has actually always been the case especially since the introduction
> of include sorting, but also we all know that clang-format can break your
> code from the visual perspective too and hence the need for // clang-format
> off/on
>
> In the most part include sorting not only might break your code noisily
> such that it won't compile, but it can also break it silently,
> and because IncludeSorting is on by default this breakage could
> potentially go unnoticed.
>
> https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code
> https://travisdowns.github.io/blog/2019/11/19/toupper.html
>
> I don't think it can be in any doubt that IncludeSorting is a nice feature
> used by 100,000's of developers without too many issues, but there is some
> suggestion that its inclusion as "on by default" was potentially a mistake.
>
> Proposals for other new features that modify the code in a similar way are
> getting some push back for changing the "charter" of clang-format when it
> could be considered to have already changed.
> This is causing friction on the review of some features and so it was
> suggested to present an RFC to gain wider consensus on the concept of
> clang-format changing code.
>
> Mostly when a code modifying change is submitted the view is that this
> isn't clang-formats job but more clang-tidy, however clang-tidy requires
> full access to all include files and compiler definitions and only works on
> the preprocessor paths of the code you are analyzing for and its speed and
> hence its frequency of use is drastically reduced.
>
> Some clang-format based modifications can in theory be made with a
> relatively high level of confidence without paying the price and the
> configuration complexity of getting all the
> semantic information. https://reviews.llvm.org/D105701.
> There is potentially for clang-format to introduce breaking changes and
> whilst this could in theory cause noisy breakages they could also in theory
> produce silent ones.
>
> These new features want to be run at "reformat" speeds & frequency and
> benefit from the rich Ecosystem of inclusion and integration in IDEs and
> editors that clang-format enjoys.
>
> This RFC is to try to gain some consensus as to what clang-format can do
> and what the conditions/constraints should be if allowed to do so.
>
> Benefits
> ========
>
> The benefits are that clang-format can be used to further make code
> modifications to adhere to a coding convention (insertion/removal of
> braces),
> clang-format could be used to validate and correct coding convention
> (left/right const),  and could be used to remove unnecessary semicolons or
> potentially convert code to trailing return types all of which could be
> performed at "reformat" speeds.
>
> Whilst some of these capabilities are available in clang-tidy, it requires
> significant infrastructure to often perform these often relatively simple
> operations and it's unlikely
> that all users of clang-format are set up to perform these actions in
> clang-tidy.
>
> There are likely a number of clang-tidy modifications that could in theory
> be made at "reformat" speeds with such an approach. But there really needs
> some agreement that it's OK for clang-format to modify the code.
>
> Allowing these kinds of modification capabilities could lead to a new set
> of "Resharper" style capabilities being added to clang-format,
> capable of bringing source code quickly into line with coding conventions.
>
> Concerns
> ========
>
> Correctness is King, the concern is your formatting tool should not
> perform operations that could break your code. (this is already the case)
>
> It's perhaps not clang-format's job to do this.
>
> I should personally declare myself as being in favor of allowing
> clang-format to modify code, I think it only fair that I let others reply
> to the RFC with their own concerns.
>
> Constraints
> ===========
>
> To minimize the impact to existing users, We suggest that a number of
> constraints be generally considered good practice when submitting reviews
> for clang-format with modifying changes
>
> 1) Code Modifying Features should always be off by default
> The user should have to make a positive affirmative action to use such a
> feature
>
> 2) Code Modifying Features configuration options should be highlighted as
> such in the ClangFormatStyleOptions.rst such that its clear these are
> potentially code breaking options
>
> 3) Existing "Code Modifying Features" do not have to adhere to 1) but the
> documentation should be updated to adhere to 2)
>
> 4) Code Modifying Features should be conservative to be "correct first"
> before being "complete".
> i.e. If it's possible a change could be ambiguous it should tend towards
> not making the incorrect change at all rather than forcing an incorrect
> change. (This could cause some
>     cases to be missed)
>
> Request
> =======
>
> I would like to get some general agreement that it's OK for future reviews
> of code modification changes to become part of clang-format (as they are in
> IncludeSorting) assuming the best practices are
> followed to protect users from unwanted changes.
>
> Feedback would be appreciated
>
> MyDeveloperDay
>
>
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210809/1e3432dd/attachment-0001.html>


More information about the llvm-dev mailing list