[cfe-dev] Design: clang-format
Matthieu Monrocq
matthieu.monrocq at gmail.com
Fri May 11 10:28:11 PDT 2012
On Fri, May 11, 2012 at 6:27 PM, Manuel Klimek <klimek at google.com> wrote:
> Hi,
>
> we're working on the design of clang-format; there are quite a few open
> questions, but I'd rather get some early feedback to see whether we're
> completely off track somewhere.
>
> The main doc is on google docs:
>
> https://docs.google.com/document/d/1gpckL2U_6QuU9YW2L1ABsc4Fcogn5UngKk7fE5dDOoA/edit
>
> For those of you who prefer good old email, here is a copy of the current
> state. Feel free to add comments in either.
> *Design: clang-format
>
> This document contains a design proposal for a clang-format tool, which
> allows C++ developers to automatically format their code independently of
> their development environment.
>
> Context
>
> While many other languages have auto-formatters available, C++ is still
> lacking a tool that fits the needs of the majority of C++ programmers.
> Note that when we talk about formatting as part of this document, we mean
> both the problem of indentation (which has been largely solved
> independently by regexp-based implementations in editors / IDEs) and line
> breaking, which proves to be a harder problem.
>
> There are multiple challenges to formatting C++ code:
>
> - a vast number of different coding styles has evolved over time
> - many projects value consistency over conformance and dislike
> style-only changes, thus making it important to be able to work with code
> that is not written according to the most current style guide
> - macros need to be handled properly
> - it should be possible to format code that is not yet syntactically
> correct
>
> Goals
>
> - Format a whole file according to a configuration
> - Format a part of a file according to a configuration
> - Format a part of a file while being consistent as best as possible
> with the rest of the file, while falling back to a configuration for
> options that cannot be deduced from the current file
> - Integrating with editors so that you can just type away until you’re
> far past the column limit, and then hit a key and have the editor lay out
> the code for you, including placing the right line breaks
>
> Non-goals
>
> - Indenting code while you type; this is a much simpler problem, but
> has even stronger performance requirements - the current editors should be
> good enough, and we’ll allow new workflows that don’t ever require the user
> to break lines
> - The only lexical elements clang-format should touch are:
> whitespace, string literals and comments. Any other changes, ranging from
> ordering includes to removing superfluous parentheses, are not in the scope
> of this tool.
>
Oh...
I have 2 remarks here.
1. The position of `const` and `volatile` qualifiers.
C++ allows having them either before or after the type they qualify (at the
lower level). LLVM recommends putting them before (looks more English I
guess) while I have seen other guides (and I prefer) systematically putting
them after (for consistency, and I am French anyway!).
2. The addition/removal of braces around single-statement bodies
In C++, an `if`, `else`, `for`, `while` (not sure about `do` `while`) can
be followed either by a block (with {}) or a single statement. Once again,
purely a matter of style. LLVM, for example, recommends omitting them.
It seems to me that both would fit perfectly into a style formatter.
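To illustrate both remarks, here is a small sketch. The function names are made up for the example and nothing here is taken from an existing style guide; it only shows that the two spellings in each case mean the same thing to the compiler:

```cpp
#include <cassert>
#include <type_traits>

// Both spellings name the same type; only the qualifier position differs.
static_assert(std::is_same<const int, int const>::value,
              "leading and trailing const are equivalent");
static_assert(std::is_same<const int *, int const *>::value,
              "pointer to const int, in either spelling");
// Here const applies to the pointer itself, so this is a different type.
static_assert(!std::is_same<int const *, int *const>::value,
              "pointer-to-const differs from const pointer");

// The same function with braces around every single-statement body...
int clampBraced(int v, int lo, int hi) {
  if (v < lo) {
    return lo;
  }
  if (v > hi) {
    return hi;
  }
  return v;
}

// ...and without them.
int clampUnbraced(int v, int lo, int hi) {
  if (v < lo)
    return lo;
  if (v > hi)
    return hi;
  return v;
}
```

Rewriting one form into the other changes more than whitespace, which is why it falls outside the scope quoted above as currently written.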
>
> - Per-file configuration: be able to annotate a file with a style
> which it adheres to (?)
>
Perhaps a per-folder configuration file (naturally inheriting from the
parent folder if none is available). And the ability to specialize the style
for a few files within that configuration file, though it seems a bit
overkill to go down to that level of detail.
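A rough sketch of the parent-folder lookup, assuming C++17 for `std::filesystem`. The file name `format-style.yaml` and the function `findStyleFile` are hypothetical; no file name has been proposed in this thread. The existence check is passed in so the lookup can be tried without touching the disk:

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical file name, chosen only for this illustration.
constexpr const char *kStyleFileName = "format-style.yaml";

// Walks from the directory containing `source` up to the root and returns
// the first style file for which `exists` is true, or nullopt if none is.
std::optional<fs::path>
findStyleFile(const fs::path &source,
              const std::function<bool(const fs::path &)> &exists) {
  fs::path dir = source.parent_path();
  while (true) {
    fs::path candidate = dir / kStyleFileName;
    if (exists(candidate))
      return candidate;
    fs::path parent = dir.parent_path();
    if (parent == dir)  // reached the root (or an empty relative path)
      return std::nullopt;
    dir = parent;
  }
}
```

In real use the predicate would presumably be something like `[](const fs::path &p) { return fs::exists(p); }`.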
>
> Code location
>
> Clang-format is a very basic tool, so it might warrant living in clang
> mainstream. On the other hand it would also fit nicely with other clang
> refactoring tools. TODO: Where do we want clang-format to live?
>
> Parsing approach
>
> The key consideration is whether clang-format can be based purely on a
> lexer, or whether it needs type information, and we need the full AST.
>
> We believe that we will need the full AST information to correctly indent
> code, break lines, and fix whitespace within a line.
>
> Examples:
>
> AST-dependent indentation:
> callFunction(foo<something,
>                  ^ line up here, if foo is a template name
>              ^ line up here otherwise
>
> AST-dependent line breaking:
> Detecting that ‘*’ is a binary operator in this case requires parsing. If
> it is a binary operator, we want to line-break after it; if it is a unary
> operator, we want to prevent line breaking.
>
> result = variable1 * variable2;
>
> AST-dependent whitespace inside lines:
> a * b;
>   ^ Binary operator or pointer declaration?
> a & f();
>   ^ Binary operator or function declaration?
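The two ambiguous statements above can be reproduced in a small sketch. The namespaces and names below are invented for illustration; the point is only that the same tokens parse differently depending on what `a` names:

```cpp
#include <cassert>

namespace expression_case {
// Here 'a' and 'b' are variables and 'f' is a function.
int a = 6, b = 7;
int f() { return 3; }
int product() { return a * b; }   // '*' is binary multiplication
int masked() { return a & f(); }  // '&' is bitwise AND
}  // namespace expression_case

namespace declaration_case {
// Here 'a' is a type, so the same token sequences are declarations.
struct a { int value; };
a * b;    // declares b, a pointer to a (zero-initialized at namespace scope)
a & f();  // declares f, a function returning a reference to a
a object{5};
a & f() { return object; }
}  // namespace declaration_case
```

A formatter that only sees tokens cannot tell these apart, so it cannot know whether to write `a * b` or `a *b`.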
>
> Challenge: Preprocessor
> Not every line in a program is covered by the AST - for example, there are
> unused macro definitions, various preprocessor directives, #ifdef’ed out
> code, etc.
>
> We will at least need some form of lexing approach for the parts of a
> source file that cannot be correctly indented / line broken by looking at
> the AST.
>
> Algorithm
>
> Visit all nodes on the AST; for each node that is part of a macro
> expansion, consider all locations taking part in that macro expansion. If
> the location is within the range that needs to be indented, look at the
> code at the location, the rules around the node, and adjust whitespace as
> necessary. If the node starts a line, adjust the indent; if a node
> overflows the line, break the line. TODO: figure out what to do with the
> lines that are not visited that way.
> Configuration
>
> To support a majority of developers, being able to configure the desired
> style is key. We propose using a YAML configuration file, as there’s
> already a YAML parser readily available in LLVM. Proposals for more
> specific ideas welcome.
>
> Style deduction
>
> When changing the format of code that does not conform to a given style
> configuration, we will optionally try to deduce style options from the
> file first, and fall back to the configured layout when there was no clear
> style deducible from the context.
> TODO: Detailed design ideas.
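As one possible reading of "deduce, then fall back", here is a minimal sketch for a single option. The `Style` struct, the `useTabs` option, `deduceStyle` and the 80% threshold are all assumptions made for this example and are not part of the proposal:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical style with a single option, for illustration only.
struct Style {
  bool useTabs;
};

// Deduces whether the file indents with tabs or spaces. If neither clearly
// dominates (at least 80% of indented lines, an arbitrary threshold), the
// configured style is returned unchanged.
Style deduceStyle(const std::vector<std::string> &lines,
                  const Style &configured) {
  std::size_t tabs = 0, spaces = 0;
  for (const std::string &line : lines) {
    if (line.empty())
      continue;
    if (line[0] == '\t')
      ++tabs;
    else if (line[0] == ' ')
      ++spaces;
  }
  const std::size_t indented = tabs + spaces;
  if (indented == 0)
    return configured;  // nothing to deduce from
  if (tabs * 5 >= indented * 4)
    return Style{true};
  if (spaces * 5 >= indented * 4)
    return Style{false};
  return configured;  // mixed file: no clear style
}
```

The same shape would presumably apply per option, with each option deduced or defaulted independently.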
> Interface
>
> This is a strawman. Please shoot down.
>
> Command line interface:
> Command line interfaces allow easy integration with existing tools and
> editors.
>
> USAGE: clang-format <build-path> <source> [<column0> <line0> <length0>
> [...]] [-- list of command line arguments to the parser]
>
> <columnN> <lineN> <lengthN>: Specifies a code range to be reformatted; if
> no code range is given, assume the whole file.
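A minimal sketch of how the strawman argument list could be split into its parts. `CodeRange`, `Invocation` and `parseInvocation` are hypothetical names, not part of any existing tool, and the input is assumed to be the arguments that follow the program name:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types for illustration; the proposal does not define them.
struct CodeRange {
  unsigned long column;
  unsigned long line;
  unsigned long length;
};

struct Invocation {
  std::string buildPath;
  std::string source;
  std::vector<CodeRange> ranges;        // empty means "format the whole file"
  std::vector<std::string> parserArgs;  // everything after "--"
};

Invocation parseInvocation(const std::vector<std::string> &args) {
  if (args.size() < 2)
    throw std::invalid_argument("expected <build-path> <source>");

  Invocation inv;
  inv.buildPath = args[0];
  inv.source = args[1];

  std::size_t i = 2;
  // Consume <column> <line> <length> triples until "--" or the end.
  while (i < args.size() && args[i] != "--") {
    if (i + 2 >= args.size() || args[i + 1] == "--" || args[i + 2] == "--")
      throw std::invalid_argument("incomplete <column> <line> <length> triple");
    CodeRange range;
    range.column = std::stoul(args[i]);
    range.line = std::stoul(args[i + 1]);
    range.length = std::stoul(args[i + 2]);
    inv.ranges.push_back(range);
    i += 3;
  }

  // Skip the "--" separator, if present, and keep the rest for the parser.
  if (i < args.size())
    inv.parserArgs.assign(args.begin() + i + 1, args.end());
  return inv;
}
```

An empty `ranges` vector corresponds to the "assume the whole file" case described above.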
>
> Code level interface:
> Reformatting source code is also a prerequisite for automated refactoring
> tools. We want to be able to integrate the reformatting as a
> post-processing step on top of other code transformations to make sure as
> little human intervention is needed as possible.
> Competition
>
> TODO: List other formatting tools we’re aware of and how well they work
>
> - GNU indent - C only;
> - BCPP (http://invisible-island.net/bcpp/bcpp.html) - “it does (by
> design) not attempt to wrap long statements”; written in about 1995, since
> then had very few changes;
> - Artistic Style (http://astyle.sourceforge.net/) - one of the most
> frequently used, but “not perfect”;
> - Uncrustify (http://uncrustify.sourceforge.net/) - has lots of
> configuration options;
> - GreatCode (http://sourceforge.net/projects/gcgreatcode/) - not
> supported since 2005;
> - Style Revisor (http://style-revisor.com) - commercial; claims to
> understand C++, but it isn’t released yet, so no way to try; uses code
> snippets to specify rules.
>
>
> All of them except Style Revisor seem to have simplistic regexp-based
> C++ parsing.*
>