[cfe-dev] [RFC] Adding support for clang-format making further code modifying changes

Andrew Tomazos via cfe-dev cfe-dev at lists.llvm.org
Mon Aug 9 13:44:28 PDT 2021


I'd just offer a few thoughts:

- Whether or not to adopt clang-format for a project is already quite
controversial in some places.  I think if we made it so that it also made
more extensive non-whitespace changes, I think it would reduce
its adoption, even if they were off by default.

- Non-whitespace changes are very dangerous.  A bug in clang-format could
more easily create bugs in the code it formats (at least that is the
perception).  For some risk-averse projects, this would be an
unacceptable risk.

- Perhaps an alternative idea would be to create a new program called
clang-reshape or something, that shared a common underlying framework with
clang-format, but clang-reshape contains the options for both the
non-whitespace changes and whitespace-changes, whereas clang-format only
has the whitespace changes.  That way folks could adopt clang-format and
ban clang-reshape if they only wanted whitespace changes.

HTH. -Andrew


On Mon, Aug 9, 2021 at 7:36 PM MyDeveloper Day via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi all,
>
> As a frequent user and maintainer of clang-format I would like to
> formalize a change to the "charter" of what clang-format can/could/should do
>
> Motivation
> ==========
>
> As you all know clang-format parses source code and can manipulate the
> tokens to produce consistently formatted code (most of the time), its
> become ubiquitous in the industry and its integration into
> popular editors & IDEs such as vim/visual studio/code mean it very simple
> for users to get setup and running producing good looking code.
>
> clang-format does not use semantic information, and as such it doesn't
> need includes, defines or compiler flags to interpret code. Clang-format is
> generally guessing that certain sequences of tokens from the lexer
> represent certain patterns. It's a good guess but it gets it wrong from
> time to time, hence trading correctness for performance.
>
> Because of this clang-format is fast (not maybe as fast as we'd like) but
> fast enough to be part of in a "save" operation such that the code is
> formatted as the ide saves your work.
>
> Mostly clang-format manipulates only whitespace, but over the years there
> have been a number of extremely useful features that have broken this rule,
> namely include sorting, namespace commenting to name a few.
>
> The usage scenario of clang-format has also changed slightly to include a
> non modifying advisory role identifying clang-format violations (as in our
> own llvm-premerge checks), which can greatly aid the code review process by
> removing the need to constantly ask for the code to be formatted correctly
> or follow the LLVM convention.
>
> Recently a number of new features have been proposed that would also alter
> code, insertion of braces, east/west const conversion that can be performed
> at "save" speeds.
>
> As no semantic information is present, this raises the question as to
> whether clang-format could actually break your code.
> This has actually always been the case especially since the introduction
> of include sorting, but also we all know that clang-format can break your
> code from the visual perspective too and hence the need for // clang-format
> off/on
>
> In the most part include sorting not only might break your code noisily
> such that it won't compile, but it can also break it silently,
> and because IncludeSorting is on by default this breakage could
> potentially go unnoticed.
>
> https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code
> https://travisdowns.github.io/blog/2019/11/19/toupper.html
>
> I don't think it can be in any doubt that IncludeSorting is a nice feature
> used by 100,000's of developers without too many issues, but there is some
> suggestion that its inclusion as "on by default" was potentially a mistake.
>
> Proposals for other new features that modify the code in a similar way are
> getting some push back for changing the "charter" of clang-format when it
> could be considered to have already changed.
> This is causing friction on the review of some features and so it was
> suggested to present an RFC to gain wider consensus on the concept of
> clang-format changing code.
>
> Mostly when a code modifying change is submitted the view is that this
> isn't clang-formats job but more clang-tidy, however clang-tidy requires
> full access to all include files and compiler definitions and only works on
> the preprocessor paths of the code you are analyzing for and its speed and
> hence its frequency of use is drastically reduced.
>
> Some clang-format based modifications can in theory be made with a
> relatively high level of confidence without paying the price and the
> configuration complexity of getting all the
> semantic information. https://reviews.llvm.org/D105701.
> There is potentially for clang-format to introduce breaking changes and
> whilst this could in theory cause noisy breakages they could also in theory
> produce silent ones.
>
> These new features want to be run at "reformat" speeds & frequency and
> benefit from the rich Ecosystem of inclusion and integration in IDEs and
> editors that clang-format enjoys.
>
> This RFC is to try to gain some consensus as to what clang-format can do
> and what the conditions/constraints should be if allowed to do so.
>
> Benefits
> ========
>
> The benefits are that clang-format can be used to further make code
> modifications to adhere to a coding convention (insertion/removal of
> braces),
> clang-format could be used to validate and correct coding convention
> (left/right const),  and could be used to remove unnecessary semicolons or
> potentially convert code to trailing return types all of which could be
> performed at "reformat" speeds.
>
> Whilst some of these capabilities are available in clang-tidy, it requires
> significant infrastructure to often perform these often relatively simple
> operations and it's unlikely
> that all users of clang-format are set up to perform these actions in
> clang-tidy.
>
> There are likely a number of clang-tidy modifications that could in theory
> be made at "reformat" speeds with such an approach. But there really needs
> some agreement that it's OK for clang-format to modify the code.
>
> Allowing these kinds of modification capabilities could lead to a new set
> of "Resharper" style capabilities being added to clang-format,
> capable of bringing source code quickly into line with coding conventions.
>
> Concerns
> ========
>
> Correctness is King, the concern is your formatting tool should not
> perform operations that could break your code. (this is already the case)
>
> It's perhaps not clang-format's job to do this.
>
> I should personally declare myself as being in favor of allowing
> clang-format to modify code, I think it only fair that I let others reply
> to the RFC with their own concerns.
>
> Constraints
> ===========
>
> To minimize the impact to existing users, We suggest that a number of
> constraints be generally considered good practice when submitting reviews
> for clang-format with modifying changes
>
> 1) Code Modifying Features should always be off by default
> The user should have to make a positive affirmative action to use such a
> feature
>
> 2) Code Modifying Features configuration options should be highlighted as
> such in the ClangFormatStyleOptions.rst such that its clear these are
> potentially code breaking options
>
> 3) Existing "Code Modifying Features" do not have to adhere to 1) but the
> documentation should be updated to adhere to 2)
>
> 4) Code Modifying Features should be conservative to be "correct first"
> before being "complete".
> i.e. If it's possible a change could be ambiguous it should tend towards
> not making the incorrect change at all rather than forcing an incorrect
> change. (This could cause some
>     cases to be missed)
>
> Request
> =======
>
> I would like to get some general agreement that it's OK for future reviews
> of code modification changes to become part of clang-format (as they are in
> IncludeSorting) assuming the best practices are
> followed to protect users from unwanted changes.
>
> Feedback would be appreciated
>
> MyDeveloperDay
>
>
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20210809/de7f0608/attachment.html>


More information about the cfe-dev mailing list