[llvm-dev] [RFC] Semi-Automatic clang-format of files with low frequency

Tue Jun 30 13:56:22 PDT 2020

On Tue, Jun 30, 2020 at 8:19 PM MyDeveloper Day via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I 100% get that we might not like the decisions clang-format is making, but how does one overcome this when adding new code? The pre-merge checks enforce clang-formatting before commit and that's a common review comment anyway for those who didn't join the pre-merge checking group. I'm just wondering are we not all following the same guidelines?
>
> Concerns of clang-format not being good enough are fair enough, but that's the area I'm focusing my development efforts on (as that's where I've been trying to make a small contribution). This effort was driven out of a need in my view to have an extensive test suite to validate new changes to clang-format don't break the formatting of pre-formatted code. This isn't that possible with LLVM at the moment because large parts are not formatted.
>
> However checking a code base which is in high flux but one that maintains a 100% clang-format clean status, is a near perfect test-suite. Especially one that I assume will have unit tests for the latest C++ features.
>
> I'm not bored ;-) and whilst I understand this feels somewhat periphery to the seemingly much more pressing task of developing the next best compiler, I think we have to remember that clang-format is extensively used. (In my view by many more people than those actually using clang). GitHub reports 32,000 YAML files with the BasedOnStyle: LLVM alone that is present in the .clang-format file

My comment there was more targeted against those who would take your
idea and push it to unnecessary and counterproductive extremes. I do
appreciate that you are trying to solve an actual problem _for the
development of clang-format_ :)

Since you're interested in thoughts about clang-format's capabilities,
I agree with Matt that the strictness of the approach is a weakness
that can frequently make code *less* readable. In addition to what he
mentions, here's an example of a bad change that clang-format wants to
make that I found after half a minute of scanning our AMDGPU backend
code:

   // We only support LOAD/STORE and vector manipulation ops for vectors
   // with > 4 elements.
-  for (MVT VT : { MVT::v8i32, MVT::v8f32, MVT::v16i32, MVT::v16f32,
-                  MVT::v2i64, MVT::v2f64, MVT::v4i16, MVT::v4f16,
-                  MVT::v4i64, MVT::v4f64, MVT::v8i64, MVT::v8f64,
-                  MVT::v16i64, MVT::v16f64, MVT::v32i32, MVT::v32f32 }) {
+  for (MVT VT :
+       {MVT::v8i32, MVT::v8f32, MVT::v16i32, MVT::v16f32, MVT::v2i64,
+        MVT::v2f64, MVT::v4i16, MVT::v4f16, MVT::v4i64, MVT::v4f64, MVT::v8i64,
+        MVT::v8f64, MVT::v16i64, MVT::v16f64, MVT::v32i32, MVT::v32f32}) {
     for (unsigned Op = 0; Op < ISD::BUILTIN_OP_END; ++Op) {
       switch (Op) {

I don't particularly mind that clang-format puts that initializer list
onto a new line, or that it changes the whitespace around braces. What
I do mind: the original code lays the initializer list out carefully
such that integer and floating point types always come in pairs on the
same line: v8[if]32 and v16[if]32 on the first line, v2[if]64 and
v4[if]64 on the second line, and so on. clang-format mindlessly mushes
the initializer list together in a way that makes it harder to see at
a glance what is going on.

(Note that this isn't code that I wrote, so I'm not emotionally
attached to it or anything. But I've run into this kind of problem
many times during development.)

I believe the common theme here is that clang-format ought to have a
mode in which it is more accepting of different layouts of lists
co-existing within the same code base. If that feature was available,
I would be a strong proponent for adopting it in LLVM itself.

Cheers,
Nicolai

> I realize it feels like unnecessary churn which is why I suggested this be in code which hasn't been touched in some time (I'd be fine if that time is 1+ years), but more often than not this is quite small basic style issues, similar to the ones below.
>
> MyDeveloperDay
>
> -  void HandleTranslationUnit(ASTContext& context) override {
> +  void HandleTranslationUnit(ASTContext &context) override {
>      if (!Instance.getLangOpts().DelayedTemplateParsing)
>        return;
>
> -      std::set<FunctionDecl*> LateParsedDecls;
> +      std::set<FunctionDecl *> LateParsedDecls;
>
> On Tue, Jun 30, 2020 at 6:55 PM Matt Arsenault <arsenm2 at gmail.com> wrote:
>>
>>
>>
>> On Jun 28, 2020, at 11:30, MyDeveloper Day via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> I’m a contributor to clang-format and would like to see LLVM 100% clang formatted so we can use LLVM as a massive test-suite for clang-format when we make changes.
>>
>>
>>
>> My main issue with this would be that clang-format does things that I don’t believe are stated in the LLVM style guide and I also disagree with. There’s a whole set of cases where it makes unwelcome changes to put things that were on separate lines on a single line. Whenever I run clang-format, I typically git checkout -p to discard all of these.
>>
>> For example, it likes to take member initializer lists and pack them into as few lines as possible. This is just asking for merge conflicts (e.g. AMDGPUSubtarget has a ton of fields in it, and out of tree code is constantly adding new fields for unreleased subtargets). It also mangles BuildMI calls, where I believe every .add() should be on its own line, and stringing this into BuildMI().addFoo().addBar() is way less readable.
>>
>> I also believe it’s 4 space indent on line wraps differs from the stated 2 space indent level (and this also disagrees with emacs llvm-style)
>>
>> -Matt
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Lerne, wie die Welt wirklich ist,
aber vergiss niemals, wie sie sein sollte.