[cfe-dev] Repeated clang-format'ting keeps changing code / use of clang-format in CI

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Mon Aug 5 07:49:51 PDT 2019


On Sat, Aug 3, 2019 at 5:55 PM Sebastian Pipping <sebastian at pipping.org>
wrote:

> Hello Nico,
>
>
> On 29.07.19 15:16, Nico Weber wrote:
> > As you can see by the weird "default : break" at the end, clang-format
> > gets confused about the whole switch statement. It looks like the macros
> > confuse it. Just adding semicolons after it is sufficient to unconfuse
> > it:
> >
> > [..]
> >   LEAD_CASE(2); LEAD_CASE(3); LEAD_CASE(4);
> > [..]
> >
> > If you're able to change expat's source, this might be a good approach.
>
> interesting idea!
>
>
> > Else, you can explicitly tell clang-format the name of macros that
> > should be handled as statements in your .clang-format file like so:
> >
> > StatementMacros: ['LEAD_CASE']
>
> Good to know!
>
>
> > You can fix this by doing `XCS("foo" "bar")` (with a linebreak in
> > between foo and bar) instead of `XCS("foo") XCS("bar")`. Semantically
> > they do the same thing as far as I can tell, but clang-format does much
> > better with the first form.
>
> Things get interesting here because XCS is either
>
>   # define XCS(s) _XCS(s)
>   # define _XCS(s) L ## s
>
> or
>
>   # define XCS(s) s
>
> depending on macro XML_UNICODE_WCHAR_T, to turn string literals into
> wide or narrow string literals.  For the first version, XCS("foo" "bar")
> results in invalid C syntax, mixing wide L"foo" with narrow "bar".  As a
> result, I needed to turn off BreakStringLiterals altogether for now.
>

FWIW, C++11 added "If one string-literal has no encoding-prefix, it is
treated as a string-literal of the same encoding-prefix as the other
operand." to [lex.string]p13, so in C++11 `L"foo" "bar"` is the same as
`L"foo" L"bar"`. But granted, the behavior is undefined in C++ before
C++11.

The C standard says "If any of the tokens are wide string literal tokens,
the resulting multibyte character sequence is treated as a wide string
literal; otherwise, it is treated as a character string literal." in 6.4.5
String Literals p4. C89 still said "If a character string literal token is
adjacent to a wide string literal token, the behavior is undefined"; I
think this changed in C99.
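
To make the C99 rule concrete, here is a minimal sketch, assuming a compiler
that implements C99 string literal concatenation. XCS_IMPL is a stand-in name
for the _XCS helper quoted above, and mixed_concat_matches is a hypothetical
function that exists only for this illustration.

```c
#include <assert.h>
#include <wchar.h>

/* Wide variant of the XCS macro quoted above. XCS_IMPL is a stand-in name
   for the original _XCS helper. */
#define XCS_IMPL(s) L##s
#define XCS(s) XCS_IMPL(s)

/* Hypothetical helper: returns 1 if the mixed form and the per-literal form
   yield the same wide string, 0 otherwise. */
int mixed_concat_matches(void) {
  /* ## pastes L onto the first token only, so this is L"foo" "bar". */
  const wchar_t *mixed = XCS("foo" "bar");
  /* This is L"foo" L"bar". */
  const wchar_t *each = XCS("foo") XCS("bar");
  return wcscmp(mixed, each) == 0 && wcslen(mixed) == 6;
}
```

Under C99 rules the function returns 1. A C89 compiler is allowed to do
anything with the mixed form, including rejecting it.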

So if you need to support compilers that don't implement the C99 behavior
(which you likely do need to do :) ), then you're right.
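
For such compilers, the form that stays portable is the one that wraps each
literal separately, so every piece gets the same prefix in both
configurations. A rough sketch, assuming the two XCS definitions quoted
above; xcs_char, xcs_len, message and message_length are names invented for
this example and are not taken from expat.

```c
#include <string.h>
#include <wchar.h>

#ifdef XML_UNICODE_WCHAR_T
/* Wide build: every wrapped literal gets an L prefix. */
#  define XCS_IMPL(s) L##s
#  define XCS(s) XCS_IMPL(s)
typedef wchar_t xcs_char;   /* hypothetical name */
#  define xcs_len wcslen    /* hypothetical name */
#else
/* Narrow build: XCS leaves the literal unchanged. */
#  define XCS(s) s
typedef char xcs_char;      /* hypothetical name */
#  define xcs_len strlen    /* hypothetical name */
#endif

/* Each piece is wrapped on its own, so adjacent literals always share one
   encoding prefix and no mixed concatenation occurs. */
static const xcs_char message[] = XCS("out of ") XCS("memory");

/* Hypothetical helper: length of the concatenated message in characters. */
size_t message_length(void) { return xcs_len(message); }
```

The cost is exactly the pattern that clang-format currently formats poorly,
which is why BreakStringLiterals had to be turned off.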

We should probably file a bug asking clang-format to treat string literals
wrapped in a macro call (with no trailing semicolon, and with the macro not
listed in StatementMacros) the same as bare string literals.


>
> Best
>
>
>
> Sebastian
>