[clang-tools-extra] dccebdd - Finally formalise our de facto line-ending policy

Luke Drummond via cfe-commits cfe-commits at lists.llvm.org
Thu Oct 17 06:49:42 PDT 2024


Author: Luke Drummond
Date: 2024-10-17T14:47:54+01:00
New Revision: dccebddb3b802c4c1fe287222e454b63f850f012

URL: https://github.com/llvm/llvm-project/commit/dccebddb3b802c4c1fe287222e454b63f850f012
DIFF: https://github.com/llvm/llvm-project/commit/dccebddb3b802c4c1fe287222e454b63f850f012.diff

LOG: Finally formalise our de facto line-ending policy

Historically, we've not automatically enforced how git tracks line
endings, but there are many, many commits that "undo" unintended CRLFs
getting into history.

`git log --pretty=oneline --grep=CRLF` shows nearly 100 commits
involving reverts of CRLF making its way into the index and then
history. As far as I can tell, there are none the other way round except
for specific cases like `.bat` files or tests for parsers that need to
accept such sequences.

Of note, one of the earliest of those listed in that output is:

```
  commit 9795860250734e5c2a879546c534e35d9edd5944
  Author: NAKAMURA Takumi <geek4civic at gmail.com>
  Date:   Thu Feb 3 11:41:27 2011 +0000

      cmake/*: Add svn:eol-style=native and fix CRLF.

      llvm-svn: 124793
```

...which introduced such a de facto policy for subversion.

With old versions of git, it's been a bit of a crapshoot whether
enforcing how line endings are stored in the history will upset
checkouts on machines where CRLF line endings are the norm. Indeed, many
users have forced git to check out the working copy according to a
global or per-user config via `core.eol` or `core.autocrlf`.

For ~8 years now[1], however, git has supported the ability to "do as
the Romans do" on checkout, but internally store subsets of text files
with line-endings specified via a system of patterns in the
`.gitattributes` file. Since we now have this ability, and we've been
specifying attributes for various binary files, I think it makes sense
to rid us of all that work converting things "back", and just let git
handle the local checkout. Thus the new top-level policy here is:

    * text=auto

In simple terms this means "unless otherwise specified, convert all
files considered 'text' files to LF in the project history, but check
them out as expected on the local machine". What is "expected on the
local machine" depends on configuration and defaults.

For those files in the repository that *do* need CRLF endings, I've
adopted a policy of `eol=crlf` which means that git will store them in
history with LF, but regardless of user config, they'll be checked out
in tree with CRLF.
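
As a rough illustration of the two policies above, here is a minimal
Python sketch of the round trip for a file that git already treats as
text. The function names `to_history` and `to_worktree` are
hypothetical, and this is only a conceptual model: git's real conversion
also decides whether a file is text at all and applies further
safeguards that are not reproduced here.

```python
# Conceptual model only. It assumes the file is already known to be text
# and that the copy in history contains LF line endings only.


def to_history(worktree_bytes: bytes) -> bytes:
    """Model of commit-time normalisation: CRLF becomes LF."""
    return worktree_bytes.replace(b"\r\n", b"\n")


def to_worktree(history_bytes: bytes, eol: bytes) -> bytes:
    """Model of checkout: LF in history becomes the requested ending.

    For `* text=auto` the ending depends on the platform and the user's
    configuration; for `eol=crlf` it is always b"\r\n".
    """
    if eol == b"\n":
        return history_bytes
    return history_bytes.replace(b"\n", eol)


# A .bat file edited on Windows, stored with LF, checked out with CRLF.
windows_edit = b"@echo off\r\necho hi\r\n"
stored = to_history(windows_edit)
assert stored == b"@echo off\necho hi\n"
assert to_worktree(stored, b"\r\n") == windows_edit
assert to_worktree(stored, b"\n") == stored
```

In this model the bytes in history are the same whichever platform made
the edit; only the checked-out copy differs.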

Finally, existing files have been "corrected" in history via `git add
--renormalize .`
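
To get a feel for which working-tree files still carry CRLF, something
like the following Python sketch could be used. The helper name
`find_crlf_files` is hypothetical, and the logic is a crude stand-in: it
does not consult `.gitattributes` and it treats any file containing a
NUL byte as binary, which is an assumption rather than git's actual
rule. `git ls-files --eol` gives git's own view.

```python
from __future__ import annotations

from pathlib import Path


def find_crlf_files(root: str) -> list[str]:
    """Return relative paths of files under root that contain CRLF.

    Files containing a NUL byte are assumed to be binary and skipped.
    Anything inside a .git directory is ignored.
    """
    base = Path(root)
    hits = []
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        if ".git" in path.relative_to(base).parts:
            continue
        data = path.read_bytes()
        if b"\0" in data:
            continue  # assumed binary, leave it alone
        if b"\r\n" in data:
            hits.append(path.relative_to(base).as_posix())
    return hits
```

Calling `find_crlf_files(".")` from the top of a checkout lists the
candidates; whether each one should be normalised or pinned with
`eol=crlf` is still a per-file decision.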

End users should *not* need to adjust their local git config or
workflow.

[1]: git 2.10 was released with fixed support for fine-grained
line-ending tracking that respects user-config *and* repo policy. This
can be considered the point at which git will respect both the user's
local working tree preference *and* the history as specified by the
maintainers. See
https://github.com/git/git/blob/master/Documentation/RelNotes/2.10.0.txt#L248
for the release note.

Added: 
    clang-tools-extra/clangd/test/.gitattributes
    clang/test/.gitattributes
    llvm/test/FileCheck/.gitattributes
    llvm/test/tools/llvm-ar/Inputs/.gitattributes
    llvm/utils/lit/tests/Inputs/shtest-shell/.gitattributes

Modified: 
    .gitattributes
    llvm/docs/TestingGuide.rst

Removed: 
    


################################################################################
diff  --git a/.gitattributes b/.gitattributes
index 6b281f33f737db..aced01d485c181 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,3 +1,10 @@
+# Checkout as native, commit as LF except in specific circumstances
+* text=auto
+*.bat text eol=crlf
+*.rc text eol=crlf
+*.sln text eol=crlf
+*.natvis text eol=crlf
+
 libcxx/src/**/*.cpp     merge=libcxx-reformat
 libcxx/include/**/*.h   merge=libcxx-reformat
 

diff  --git a/clang-tools-extra/clangd/test/.gitattributes b/clang-tools-extra/clangd/test/.gitattributes
new file mode 100644
index 00000000000000..20971adc2b5d03
--- /dev/null
+++ b/clang-tools-extra/clangd/test/.gitattributes
@@ -0,0 +1,3 @@
+input-mirror.test text eol=crlf
+too_large.test text eol=crlf
+protocol.test text eol=crlf

diff  --git a/clang/test/.gitattributes b/clang/test/.gitattributes
new file mode 100644
index 00000000000000..160fc6cf561751
--- /dev/null
+++ b/clang/test/.gitattributes
@@ -0,0 +1,4 @@
+FixIt/fixit-newline-style.c text eol=crlf
+Frontend/system-header-line-directive-ms-lineendings.c text eol=crlf
+Frontend/rewrite-includes-mixed-eol-crlf.* text eol=crlf
+Frontend/rewrite-includes-mixed-eol-lf.h text eol=lf

diff  --git a/llvm/docs/TestingGuide.rst b/llvm/docs/TestingGuide.rst
index 08617933519fdb..344a295226f6ae 100644
--- a/llvm/docs/TestingGuide.rst
+++ b/llvm/docs/TestingGuide.rst
@@ -360,6 +360,12 @@ Best practices for regression tests
 - Try to give values (including variables, blocks and functions) meaningful
   names, and avoid retaining complex names generated by the optimization
   pipeline (such as ``%foo.0.0.0.0.0.0``).
+- If your tests depend on specific input file encodings, beware of line-ending
+  issues across different platforms, and in the project's history. Before you
+  commit tests that depend on explicit encodings, consider adding filetype or
+  specific line-ending annotations to a `.gitattributes
+  <https://git-scm.com/docs/gitattributes#_effects>`_ file in the appropriate
+  directory in the repository.
 
 Extra files
 -----------

diff  --git a/llvm/test/FileCheck/.gitattributes b/llvm/test/FileCheck/.gitattributes
new file mode 100644
index 00000000000000..ba27d7fad76d50
--- /dev/null
+++ b/llvm/test/FileCheck/.gitattributes
@@ -0,0 +1 @@
+dos-style-eol.txt text eol=crlf

diff  --git a/llvm/test/tools/llvm-ar/Inputs/.gitattributes b/llvm/test/tools/llvm-ar/Inputs/.gitattributes
new file mode 100644
index 00000000000000..6c8a26285daf7f
--- /dev/null
+++ b/llvm/test/tools/llvm-ar/Inputs/.gitattributes
@@ -0,0 +1 @@
+mri-crlf.mri text eol=crlf

diff  --git a/llvm/utils/lit/tests/Inputs/shtest-shell/.gitattributes b/llvm/utils/lit/tests/Inputs/shtest-shell/.gitattributes
new file mode 100644
index 00000000000000..2df17345df5b87
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/shtest-shell/.gitattributes
@@ -0,0 +1 @@
+*.dos text eol=crlf


        

