[cfe-dev] Design: clang-format
Manuel Klimek
klimek at google.com
Sun May 13 13:33:50 PDT 2012
On Fri, May 11, 2012 at 7:28 PM, Matthieu Monrocq <
matthieu.monrocq at gmail.com> wrote:
>
>
> On Fri, May 11, 2012 at 6:27 PM, Manuel Klimek <klimek at google.com> wrote:
>
>> Hi,
>>
>> we're working on the design of clang-format; there are quite a few open
>> questions, but I'd rather get some early feedback to see whether we're
>> completely off track somewhere.
>>
>> The main doc is on google docs:
>>
>> https://docs.google.com/document/d/1gpckL2U_6QuU9YW2L1ABsc4Fcogn5UngKk7fE5dDOoA/edit
>>
>> For those of you who prefer good old email, here is a copy of the current
>> state. Feel free to add comments in either.
>> *Design: clang-format This document contains a design proposal for a
>> clang-format tool, which allows C++ developers to automatically format
>> their code independently of their development environment.
>> ContextWhile many other languages have auto-formatters available, C++ is
>> still lacking a tool that fits the needs of the majority of C++
>> programmers. Note that when we talk about formatting as part of this
>> document, we mean both the problem of indentation (which has been
>> largely solved independently by regexp-based implementations in editors /
>> IDEs) and line breaking, which proves to be a harder problem.
>>
>> There are multiple challenges to formatting C++ code:
>>
>> - a vast number of different coding styles has evolved over time
>> - many projects value consistency over conformance and dislike
>> style-only changes, thus making it important to be able to work with code
>> that is not written according to the most current style guide
>> - macros need to be handled properly
>> - it should be possible to format code that is not yet syntactically
>> correct
>>
>> Goals
>>
>> - Format a whole file according to a configuration
>> - Format a part of a file according to a configuration
>> - Format a part of a file while being consistent as best as possible
>> with the rest of the file, while falling back to a configuration for
>> options that cannot be deduced from the current file
>> - Integrating with editors so that you can just type away until
>> you’re far past the column limit, and then hit a key and have the editor
>> layout the code for you, including placing the right line breaks
>>
>> Non-goals
>>
>> - Indenting code while you type; this is a much simpler problem, but
>> has even stronger performance requirements - the current editors should be
>> good enough, and we’ll allow new workflows that don’t ever require the user
>> to break lines
>> - The only lexical elements clang-format should touch are:
>> whitespaces, string-literals and comments. Any other changes ranging from
>> ordering includes to removing superfluous paranthesis are not in the scope
>> of this tool.
>>
>> *
>>
>
> Oh...
>
> I have 2 remarks here.
>
> 1. The position of `const` and `volatile` qualifiers.
>
> C++ allows having them either before or after the type they qualify (at
> the lower level). LLVM recommends putting them before (looks more English I
> guess) while I have seen other guides (and I prefer) systematically putting
> them after (for consistency, and I am French anyway!).
>
This one sounds like it's in scope, although I don't know how far up on the
prio list it will be ... (meaning: probably not very high ;)
> 2. The addition/removal of brackets for inline blocks
>
> In C++, an `if`, `else`, `for`, `while` (not sure about `do` `while`) can
> be followed either by a block (with {}) or a single statement. Once again,
> purely a matter of style. LLVM recommends not putting them for example.
>
I think this is out of scope, but it's definitely borderline. But I'm not
sure. Let's solve the core questions first and figure out details inside
the issue tracker later ;)
> It seems to me that both would fit perfectly into a style formatter.
>
>
> *
>>
>> - Per-file configuration: be able to annotate a file with a style
>> which it adheres to (?)
>>
>> *
>>
>
> Perhaps a per-folder configuration file (and naturally inheriting from the
> parent folder if none available). And the ability to specialize the style
> for a few files within that configuration file, though it seems a bit
> overkill to go down to that level of details.
>
Yea, a per-folder configuration definitely sounds like a good idea, as
we'll probably want to traverse the directory tree to find the
configuration anyway.
Thanks for your input!
/Manuel
>
>
>> *
>>
>>
>> Code locationClang-format is a very basic tool, so it might warrant
>> living in clang mainstream. On the other hand it would also fit nicely with
>> other clang refactoring tools. TODO: Where do we want clang-format to
>> live?
>> Parsing approachThe key consideration is whether clang-format can be
>> based purely on a lexer, or whether it needs type information, and we
>> need the full AST.
>>
>> We believe that we will need the full AST information to correctly indent
>> code, break lines, and fix whitespace within a line.
>>
>> Examples:
>>
>> AST-dependent indentation:
>> callFunction(foo<something,
>> ^ line up here, if foo is a template name
>> ^ line up here otherwise
>>
>> AST-dependent line breaking:
>> Detecting that ‘*’ is an binary operator in this case requires parsing;
>> if it is a binary operator, we want to line-break after it, if it is a
>> unary operator, we want to prevent line breaking
>>
>> result = variable1 * variable2;
>>
>> AST-dependent whitespace inside lines:
>> a * b;
>> ^ Binary operator or pointer declaration?
>> a & f();
>> ^ Binary operator or function declaration?
>>
>> Challenge: Preprocessor
>> Not every line in a program is covered by the AST - for example, there
>> are unused macro definitions, various preprocessor directives, #ifdef’ed
>> out code, etc.
>>
>> We will at least need some form of lexing approach for the parts of a
>> source file that cannot be correctly indented / line broken by looking at
>> the AST.
>>
>> Algorithm Visit all nodes on the AST; for each node that is part of a
>> macro expansion, consider all locations taking part in that macro
>> expansion. If the location is within the range that need to be indented,
>> look at the code at the location, the rules around the node, and adjust
>> whitespace as necessary. If the node starts a line, adjust the indent; if a
>> node overflows the line, break the line. TODO: figure out what to do with
>> the lines that are not visited that way.
>> ConfigurationTo support a majority of developers, being able to
>> configure the desired style is key. We propose using a YAML configuration
>> file, as there’s already a YAML parser readily available in LLVM. Proposals
>> for more specific ideas welcome.
>> Style deductionWhen changing the format of code that does not conform to
>> a given style configuration, we will optionally try to deduce style options
>> from the file first, and fall back to the configured layout when there was
>> no clear style deducible from the context.
>> TODO: Detailed design ideas.
>> Interface This is a strawman. Please shoot down.
>>
>> Command line interface:
>> Command line interfaces allow easy integration with existing tools and
>> editors.
>>
>> USAGE: clang-format <build-path> <source> [<column0> <line0> <length0>
>> [...]] [-- list of command line arguments to the parser]
>>
>> <columnN> <lineN> <lengthN>: Specifies a code range to be reformatted; if
>> no code range is given, assume the whole file.
>>
>> Code level interface:
>> Reformatting source code is also a prerequisite for automated refactoring
>> tools. We want to be able to integrate the reformatting as a
>> post-processing step on top of other code transformations to make sure as
>> little human intervention is needed as possible.
>> CompetitionTODO: List other formatting tools we’re aware of and how well
>> they work
>>
>> - GNU ident - C only;
>> - BCPP (http://invisible-island.net/bcpp/bcpp.html) - “it does (by
>> design) not attempt to wrap long statements”; written in about 1995, since
>> then had very few changes;
>> - Artistic Style (http://astyle.sourceforge.net/) - one of the most
>> frequently used, but “not perfect”;
>> - Uncrustify (http://uncrustify.sourceforge.net/) - has lots of
>> configuration options;
>> - GreatCode (http://sourceforge.net/projects/gcgreatcode/) - not
>> supported since 2005;
>> - Style Revisor (http://style-revisor.com) - commercial; claims to
>> understand C++, but it isn’t released yet, so no way to try; uses code
>> snippets to specify rules.
>>
>>
>> All of them except Style Revisitor seem to have simplistic regexp-based
>> c++ parsing.*
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120513/5116d90b/attachment.html>
More information about the cfe-dev
mailing list