[cfe-dev] [RFC] Adding support for clang-format making further code modifying changes

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Mon Aug 9 13:53:29 PDT 2021

On Mon, Aug 9, 2021 at 1:44 PM Andrew Tomazos via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> I'd just offer a few thoughts:
> - Whether or not to adopt clang-format for a project is already quite
> controversial in some places.  I think if we made it so that it also made
> more extensive non-whitespace changes, I think it would reduce
> its adoption, even if they were off by default.

Yeah? I wouldn't've figured that.

- Non-whitespace changes are very dangerous.  A bug in clang-format could
> more easily create bugs in the code it formats (at least that is the
> perception).  For some risk-averse projects, this would be an
> unacceptable risk.
> - Perhaps an alternative idea would be to create a new program called
> clang-reshape or something, that shared a common underlying framework with
> clang-format, but clang-reshape contains the options for both the
> non-whitespace changes and whitespace-changes, whereas clang-format only
> has the whitespace changes.  That way folks could adopt clang-format and
> ban clang-reshape if they only wanted whitespace changes.

This crossed my mind too - but I expect a lot of the clang-format
integration may be more invasive than that. (hardcoded binary names, use of
clang-format as a library, etc) - adding new features to the existing
adopted clang-format may be especially valuable compared to providing a new
tool. (that said, I do also feel like grouping features in an existing tool
only because of its existing adoption is super great either)

- Dave

> HTH. -Andrew
> On Mon, Aug 9, 2021 at 7:36 PM MyDeveloper Day via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>> Hi all,
>> As a frequent user and maintainer of clang-format I would like to
>> formalize a change to the "charter" of what clang-format can/could/should do
>> Motivation
>> ==========
>> As you all know clang-format parses source code and can manipulate the
>> tokens to produce consistently formatted code (most of the time), its
>> become ubiquitous in the industry and its integration into
>> popular editors & IDEs such as vim/visual studio/code mean it very simple
>> for users to get setup and running producing good looking code.
>> clang-format does not use semantic information, and as such it doesn't
>> need includes, defines or compiler flags to interpret code. Clang-format is
>> generally guessing that certain sequences of tokens from the lexer
>> represent certain patterns. It's a good guess but it gets it wrong from
>> time to time, hence trading correctness for performance.
>> Because of this clang-format is fast (not maybe as fast as we'd like) but
>> fast enough to be part of in a "save" operation such that the code is
>> formatted as the ide saves your work.
>> Mostly clang-format manipulates only whitespace, but over the years there
>> have been a number of extremely useful features that have broken this rule,
>> namely include sorting, namespace commenting to name a few.
>> The usage scenario of clang-format has also changed slightly to include a
>> non modifying advisory role identifying clang-format violations (as in our
>> own llvm-premerge checks), which can greatly aid the code review process by
>> removing the need to constantly ask for the code to be formatted correctly
>> or follow the LLVM convention.
>> Recently a number of new features have been proposed that would also
>> alter code, insertion of braces, east/west const conversion that can be
>> performed at "save" speeds.
>> As no semantic information is present, this raises the question as to
>> whether clang-format could actually break your code.
>> This has actually always been the case especially since the introduction
>> of include sorting, but also we all know that clang-format can break your
>> code from the visual perspective too and hence the need for // clang-format
>> off/on
>> In the most part include sorting not only might break your code noisily
>> such that it won't compile, but it can also break it silently,
>> and because IncludeSorting is on by default this breakage could
>> potentially go unnoticed.
>> https://stackoverflow.com/questions/37927553/can-clang-format-break-my-code
>> https://travisdowns.github.io/blog/2019/11/19/toupper.html
>> I don't think it can be in any doubt that IncludeSorting is a nice
>> feature used by 100,000's of developers without too many issues, but there
>> is some suggestion that its inclusion as "on by default" was potentially a
>> mistake.
>> Proposals for other new features that modify the code in a similar way
>> are getting some push back for changing the "charter" of clang-format when
>> it could be considered to have already changed.
>> This is causing friction on the review of some features and so it was
>> suggested to present an RFC to gain wider consensus on the concept of
>> clang-format changing code.
>> Mostly when a code modifying change is submitted the view is that this
>> isn't clang-formats job but more clang-tidy, however clang-tidy requires
>> full access to all include files and compiler definitions and only works on
>> the preprocessor paths of the code you are analyzing for and its speed and
>> hence its frequency of use is drastically reduced.
>> Some clang-format based modifications can in theory be made with a
>> relatively high level of confidence without paying the price and the
>> configuration complexity of getting all the
>> semantic information. https://reviews.llvm.org/D105701.
>> There is potentially for clang-format to introduce breaking changes and
>> whilst this could in theory cause noisy breakages they could also in theory
>> produce silent ones.
>> These new features want to be run at "reformat" speeds & frequency and
>> benefit from the rich Ecosystem of inclusion and integration in IDEs and
>> editors that clang-format enjoys.
>> This RFC is to try to gain some consensus as to what clang-format can do
>> and what the conditions/constraints should be if allowed to do so.
>> Benefits
>> ========
>> The benefits are that clang-format can be used to further make code
>> modifications to adhere to a coding convention (insertion/removal of
>> braces),
>> clang-format could be used to validate and correct coding convention
>> (left/right const),  and could be used to remove unnecessary semicolons or
>> potentially convert code to trailing return types all of which could be
>> performed at "reformat" speeds.
>> Whilst some of these capabilities are available in clang-tidy, it
>> requires significant infrastructure to often perform these often relatively
>> simple operations and it's unlikely
>> that all users of clang-format are set up to perform these actions in
>> clang-tidy.
>> There are likely a number of clang-tidy modifications that could in
>> theory be made at "reformat" speeds with such an approach. But there really
>> needs some agreement that it's OK for clang-format to modify the code.
>> Allowing these kinds of modification capabilities could lead to a new set
>> of "Resharper" style capabilities being added to clang-format,
>> capable of bringing source code quickly into line with coding conventions.
>> Concerns
>> ========
>> Correctness is King, the concern is your formatting tool should not
>> perform operations that could break your code. (this is already the case)
>> It's perhaps not clang-format's job to do this.
>> I should personally declare myself as being in favor of allowing
>> clang-format to modify code, I think it only fair that I let others reply
>> to the RFC with their own concerns.
>> Constraints
>> ===========
>> To minimize the impact to existing users, We suggest that a number of
>> constraints be generally considered good practice when submitting reviews
>> for clang-format with modifying changes
>> 1) Code Modifying Features should always be off by default
>> The user should have to make a positive affirmative action to use such a
>> feature
>> 2) Code Modifying Features configuration options should be highlighted as
>> such in the ClangFormatStyleOptions.rst such that its clear these are
>> potentially code breaking options
>> 3) Existing "Code Modifying Features" do not have to adhere to 1) but the
>> documentation should be updated to adhere to 2)
>> 4) Code Modifying Features should be conservative to be "correct first"
>> before being "complete".
>> i.e. If it's possible a change could be ambiguous it should tend towards
>> not making the incorrect change at all rather than forcing an incorrect
>> change. (This could cause some
>>     cases to be missed)
>> Request
>> =======
>> I would like to get some general agreement that it's OK for future
>> reviews of code modification changes to become part of clang-format (as
>> they are in IncludeSorting) assuming the best practices are
>> followed to protect users from unwanted changes.
>> Feedback would be appreciated
>> MyDeveloperDay
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20210809/55978b1f/attachment-0001.html>

More information about the cfe-dev mailing list