[cfe-dev] How does clang-format parse snippets?

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Wed Aug 1 06:01:00 PDT 2018


clang-format is token-based: it sees the raw token stream, without even
running the preprocessor (which is why it doesn't need -I and -D flags).
clang's cc1 flag -dump-raw-tokens shows you what clang-format sees as
input. Note how the #if 0 gets printed instead of evaluated:

$ cat foo.cc
#if 0
asdf
#endif

int f();

$ bin/clang -c -Xclang -dump-raw-tokens foo.cc
hash '#' [StartOfLine] Loc=<foo.cc:1:1>
raw_identifier 'if' Loc=<foo.cc:1:2>
unknown ' ' Loc=<foo.cc:1:4>
numeric_constant '0' Loc=<foo.cc:1:5>
unknown '
' Loc=<foo.cc:1:6>
raw_identifier 'asdf' [StartOfLine] Loc=<foo.cc:2:1>
unknown '
' Loc=<foo.cc:2:5>
hash '#' [StartOfLine] Loc=<foo.cc:3:1>
raw_identifier 'endif' Loc=<foo.cc:3:2>
unknown '

' Loc=<foo.cc:3:7>
raw_identifier 'int' [StartOfLine] Loc=<foo.cc:5:1>
unknown ' ' Loc=<foo.cc:5:4>
raw_identifier 'f' Loc=<foo.cc:5:5>
l_paren '(' Loc=<foo.cc:5:6>
r_paren ')' Loc=<foo.cc:5:7>
semi ';' Loc=<foo.cc:5:8>
unknown '
' Loc=<foo.cc:5:9>
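
To make the "raw token stream" idea concrete, here is a minimal sketch of a
lexer that splits text into tokens without interpreting preprocessor
directives. It is not clang's lexer: rawTokens is a hypothetical helper made
up for illustration, it only knows identifier/number runs and
single-character punctuation, and unlike the dump above it drops whitespace
instead of reporting it as "unknown" tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of clang): split source text into raw
// tokens. Directives such as "#if 0" are never evaluated; '#' is just
// another punctuation token, so skipped-out code still shows up.
inline std::vector<std::string> rawTokens(const std::string &src) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < src.size()) {
    unsigned char c = static_cast<unsigned char>(src[i]);
    if (std::isspace(c)) {  // whitespace separates tokens and is dropped
      ++i;
      continue;
    }
    std::size_t start = i;
    if (std::isalnum(c) || c == '_') {
      // Consume a run of identifier or number characters.
      while (i < src.size() &&
             (std::isalnum(static_cast<unsigned char>(src[i])) ||
              src[i] == '_')) {
        ++i;
      }
    } else {
      ++i;  // single-character punctuation such as '#', '(' or ';'
    }
    out.push_back(src.substr(start, i - start));
  }
  return out;
}
```

Feeding it "#if 0\nasdf\n#endif\n\nint f();\n" yields the tokens #, if, 0,
asdf, #, endif, int, f, (, ), ; in that order, so asdf is still there even
though a preprocessor would have discarded it.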


clang-format then has a bunch of heuristics to decide if `a * b` is a
multiplication or a declaration, but since, as you say, it doesn't build an
AST, it doesn't know whether "a" in two different places refers to the same
variable. So in general it can't be used for most automated refactorings,
since those usually need an AST.
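
As an illustration of why `a * b` is ambiguous at the token level, here is a
small sketch in which the same `a * b` token sequence is a declaration in
one context and a multiplication in another. The namespace and function
names are made up for this example, and the declaration case carries an
initializer only to keep the code well-defined.

```cpp
#include <type_traits>

namespace as_declaration {
struct a {};  // here `a` names a type

inline bool b_is_pointer_to_a() {
  a * b = nullptr;  // parsed as a declaration: `b` is a pointer to `a`
  return std::is_same<decltype(b), a *>::value && b == nullptr;
}
}  // namespace as_declaration

namespace as_multiplication {
inline int product() {
  int a = 6;  // here `a` and `b` name variables
  int b = 7;
  return a * b;  // same tokens, parsed as a multiplication
}
}  // namespace as_multiplication
```

A compiler tells the two apart by looking up what `a` refers to. A tool that
only sees tokens has to guess from the surrounding context.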

(clang-format works great for formatting the output of an automated
refactoring though.)

On Wed, Aug 1, 2018 at 5:25 AM Stuart Thomson via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi,
>
> I’m interested in using clang to refactor snippets of C++ for which I
> can’t produce an AST. AFAIK this precludes the use of clang tools like
> clang-check and I wondered if clang-format could be used instead as it
> doesn’t seem to require the production of an AST. I don’t quite understand
> how clang-format works and have a couple of questions:
>
>    1. Is it possible to somehow use clang-format for refactoring C++
>    according to custom rules? These refactors would be larger scale things
>    than it seems to usually be used for.
>    2. How does clang-format parse C++ without e.g. parsing the includes?
>
> Thanks,
>
> Stuart