[cfe-dev] Design: clang-format

Manuel Klimek klimek at google.com
Sun May 13 13:30:29 PDT 2012


On Fri, May 11, 2012 at 7:26 PM, Chris Lattner <clattner at apple.com> wrote:

>
> On May 11, 2012, at 9:27 AM, Manuel Klimek wrote:
>
> Hi,
>
> we're working on the design of clang-format; there are quite a few open
> questions, but I'd rather get some early feedback to see whether we're
> completely off track somewhere.
>
>
> Wow, having something like this would be great!
>
> For those of you who prefer good old email, here is a copy of the current
> state.
>
>
> +1 thanks :)
>
> *Context*
> *While many other languages have auto-formatters available, C++
> is still lacking a tool that fits the needs of the majority of C++
> programmers. Note that when we talk about formatting as part of this
> document, we mean both the problem of indentation (which has been largely
> solved independently by regexp-based implementations in editors / IDEs) and line
> breaking, which proves to be a harder problem.
> *
>
>
> Also variable naming, use of #includes, etc?  How much of
> http://llvm.org/docs/CodingStandards.html is realistically
> enforceable/detectable?
>

I think we want 2 tools:
1. The one proposed here, which deals with formatting; ordering includes is
a corner case, and doing static analysis to figure out iwyu or reverse-iwyu
style rules is a clear non-goal.
2. What I call a "lint-style" tool, which will start right where
clang-format leaves off and go deep into static analysis; the input format
here I'd imagine to be more like "patterns" or "rules", whereas for
clang-format I'm basically imagining a lot of bool values and some numbers
for the configuration.
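
To make the "bool values and some numbers" idea concrete, a minimal sketch
might look like the following. The struct, its field names, and the helper
are hypothetical and only illustrate the shape of such a configuration; none
of them come from the design document.

```cpp
#include <string>

// Hypothetical style configuration: a handful of numbers and booleans.
// All names here are made up for illustration.
struct StyleOptions {
  unsigned column_limit = 80;       // maximum line width
  unsigned indent_width = 2;        // spaces per indentation level
  bool use_tabs = false;            // indent with tabs instead of spaces
  bool brace_on_own_line = false;   // put '{' on a separate line
};

// Returns true if the line is longer than the configured column limit.
inline bool exceedsColumnLimit(const StyleOptions &style,
                               const std::string &line) {
  return line.size() > style.column_limit;
}
```

A lint-style tool, by contrast, would need something richer than this, such
as patterns or rules, which is why the two are worth keeping separate.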


>
> *Goals
>
>    - Format a whole file according to a configuration
>    - Format a part of a file according to a configuration
>    - Format a part of a file while being consistent as best as possible
>    with the rest of the file, while falling back to a configuration for
>    options that cannot be deduced from the current file
>    - Integrate with editors so that you can just type away until you’re
>    far past the column limit, and then hit a key and have the editor lay out
>    the code for you, including placing the right line breaks
>
> *
>
>
> Some wishlist items from me:
>   - An "enforcer" mode that could be used in a post-commit script to find
> violations of the style.
>

That should fall out naturally.


>   - A "scanner" mode that could be used to scan a corpus of existing code
> to find the dominant style, instead of having to manually configure a
> thousand arguments like indent.
>

Definitely an interesting idea, and something to keep in mind. I don't
know whether that would be one of the highest-priority goals, but we should
make it possible architecture-wise.
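
As a rough illustration of what a "scanner" could deduce, the sketch below
guesses the dominant indentation step of a piece of source text by counting
how far each line is indented relative to the previous one. The function name
and the heuristic are assumptions made for this example; a real scanner would
have to look at far more than leading spaces.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: guess the indent width (in spaces) used by `source`.
// Heuristic: every time a line is indented deeper than the previous
// non-blank line, the difference gets one vote; the most common wins.
inline unsigned guessIndentWidth(const std::string &source) {
  std::map<unsigned, unsigned> votes;  // indent step -> occurrences
  std::istringstream in(source);
  std::string line;
  unsigned previous = 0;
  while (std::getline(in, line)) {
    std::size_t first = line.find_first_not_of(' ');
    if (first == std::string::npos)
      continue;  // skip blank lines
    unsigned current = static_cast<unsigned>(first);
    if (current > previous)
      ++votes[current - previous];
    previous = current;
  }
  unsigned best = 0, best_count = 0;
  for (const auto &entry : votes) {
    if (entry.second > best_count) {
      best = entry.first;
      best_count = entry.second;
    }
  }
  return best;  // 0 means no indentation was observed
}
```

Run over a corpus, the per-file results could be tallied the same way to
pick a default, with the configuration covering whatever cannot be deduced.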


>
> *
>
>
> Non-goals
>
>    - Indenting code while you type; this is a much simpler problem, but
>    has even stronger performance requirements - the current editors should be
>    good enough, and we’ll allow new workflows that don’t ever require the user
>    to break lines
>
> *
>
>
> Makes sense, this is a different problem.
>
> *
>
>    - The only lexical elements clang-format should touch are
>    whitespace, string literals and comments. Any other changes, ranging from
>    ordering includes to removing superfluous parentheses, are not in the scope
>    of this tool.
>
> *
>
> *
>
>    - Per-file configuration: be able to annotate a file with a style
>    which it adheres to (?)
>
> *
>
>
> If successful, the tool will probably be feature crept to support these.
>  I think it is completely sensible to subset these out from any initial
> implementation though: best to solve some small problems well (and then
> grow in scope) than to try to solve all problems and never get to a point
> where it is useful.
>

I also think we can give sensible alternatives to some...


>
> *
>
>
> Code location
> Clang-format is a very basic tool, so it might warrant
> living in clang mainstream. On the other hand it would also fit nicely with
> other clang refactoring tools. TODO: Where do we want clang-format to
> live?
> *
>
>
> No strong feeling.
>
> * Parsing approach
> The key consideration is whether clang-format can be
> based purely on a lexer, or whether it needs type information and
> therefore the full AST.
>
> We believe that we will need the full AST information to correctly indent
> code, break lines, and fix whitespace within a line.
> *
>
>
> The major tradeoff here is that requiring an AST "requires" valid code and
> information on how to simulate the build.  If you can use just the lexer,
> then you can run on a random header file in isolation.
>
> Perhaps it is possible to subset and layer things so that some stuff works
> with just the lexer (e.g. 80 column detection) but other stuff requires
> more integration with AST and build info?
>

The very first draft I had that I didn't send to the list actually had this
one question at its core: how much can we do with the lexer only?
The problem is that I don't think a lexer-only approach can do sufficiently
better than standard regexp-based solutions to be worth the effort. Even
indenting needs types (as Richard pointed out) when templates are involved.
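
A small example of why the lexer alone is not enough: the token sequence
`a < b > c` is a variable declaration when `a` names a template, and a
comparison expression when `a` names a variable. The namespaces and function
names below are made up for the illustration; the two readings are kept in
separate namespaces so that both compile in one file.

```cpp
// Reading 1: `a` is a class template and `b` is a type,
// so `a<b> c;` declares a variable named c.
namespace as_declaration {
template <typename T> struct a { int value = 7; };
using b = int;
inline int run() {
  a<b> c;  // tokens: a < b > c ;
  return c.value;
}
}  // namespace as_declaration

// Reading 2: `a`, `b` and `c` are all ints,
// so `a < b > c` is the expression (a < b) > c.
namespace as_expression {
inline bool run() {
  int a = 1, b = 2, c = 0;
  return a < b > c;  // same tokens; compilers may warn about the chaining
}
}  // namespace as_expression
```

A formatter would want `a<b> c;` in the first case and `a < b > c` in the
second, and it can only tell them apart by knowing what `a` refers to.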


>
> *Configuration
> To support a majority of developers, being able to
> configure the desired style is key. We propose using a YAML configuration
> file, as there’s already a YAML parser readily available in LLVM. Proposals
> for more specific ideas welcome.
> *
>
>
> Makes sense to me.
>
> *Style deduction
> When changing the format of code that does not conform to
> a given style configuration, we will optionally try to deduce style options
> from the file first, and fall back to the configured layout when there was
> no clear style deducible from the context.
> *
>
>
> +100 :)
>

Thanks for your input!
/Manuel