[llvm] r329648 - [CachePruning] Fix comment about ext4 per-directory file limit. NFC

Peter Collingbourne via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 11 10:45:35 PDT 2018


Okay. So can you please update the comment then?

Peter

On Wed, Apr 11, 2018 at 10:36 AM, Fāng-ruì Sòng <maskray at google.com> wrote:

> truncate -s 100G 100G; mkfs.ext4 -N 30000000 100G
> mkdir ext4; mount 100G ext4
>
> For a directory using htree without turning on large_dir:
>
> 508   root node entries (root_limit)
> 510   internal node entries (node_limit)
>
> For a filename with 40 bytes, its sizeof(ext4_dir_entry_2) = 48, a linear
> directory (a leaf node of the htree) can contain at most floor(4096/48)=85
> of them.
> The real per-directory entry limit should be 508*510*85 = 22021800
> The limit varies with the average length of filenames.
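
The arithmetic above can be reproduced with a minimal sketch. The root_limit and node_limit values are simply the figures quoted above (4096-byte blocks, no large_dir) and are not derived here; the entry size assumes an 8-byte ext4_dir_entry_2 header plus the name, rounded up to a multiple of 4. The function names are made up for illustration.

```python
BLOCK_SIZE = 4096   # assumed block size, matching the 4096 used above
ROOT_LIMIT = 508    # root node entries, as quoted above
NODE_LIMIT = 510    # internal node entries, as quoted above


def dir_rec_len(name_len: int) -> int:
    """Bytes used by one directory entry: 8-byte header + name, rounded up to 4."""
    return (name_len + 8 + 3) & ~3


def htree_capacity(name_len: int) -> int:
    """Upper bound on entries in a two-level htree if every leaf block were full."""
    entries_per_leaf = BLOCK_SIZE // dir_rec_len(name_len)
    return ROOT_LIMIT * NODE_LIMIT * entries_per_leaf


print(dir_rec_len(40))     # 48
print(htree_capacity(40))  # 22021800
```

This is only an upper bound: as described next, leaves fill unevenly, so creation can fail well before that many entries exist.
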
>
> However, the code does not try rebalancing the htree, so we will not be
> able to create filenames in a full leaf node. This is demonstrated with the
> following example, certain filenames cannot be used while others can:
>
> % touch d/0000000000000000000000000000000000816a6f
> touch: cannot touch 'd/0000000000000000000000000000000000816a6f': No
> space left on device
> % touch d/0000000000000000000000000000000000816a70
> # succeeded
>
> Another issue is that the size used by htree cannot shrink. I believe we
> can exceed that limit if we use latest mkfs.ext4 with -O large_dir, see
> http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.44.0
>
> On 2018-04-10, Peter Collingbourne wrote:
>
>> Here is what I did.
>>
>> $ mkfs.ext4 -N 200000000 100G
>> $ sudo mount /path/to/100G d3
>> $ ./make
>> [...]
>> d3/d/000000000000000000000000900000
>> d3/d/000000000000000000000000910000
>> d3/d/000000000000000000000000920000
>> d3/d/000000000000000000000000930000
>> d3/d/000000000000000000000000940000
>> d3/d/000000000000000000000000950000
>> d3/d/000000000000000000000000960000
>> d3/d/000000000000000000000000970000
>> d3/d/000000000000000000000000980000
>> d3/d/000000000000000000000000990000
>> d3/d/0000000000000000000000009a0000
>> d3/d/0000000000000000000000009b0000
>> open: No space left on device
>> $ vi make.c
>> [ changed %030lx to %029lx ]
>> $ make make
>> $ ./make
>> d3/d/00000000000000000000000000000
>> open: No space left on device
>> $ vi make.c
>> [ changed d3/d to d3/d4 ]
>> $ make make
>> $ ./make
>> d3/d4/00000000000000000000000000000
>> d3/d4/00000000000000000000000010000
>> d3/d4/00000000000000000000000020000
>> d3/d4/00000000000000000000000030000
>> d3/d4/00000000000000000000000040000
>> d3/d4/00000000000000000000000050000
>> d3/d4/00000000000000000000000060000
>> d3/d4/00000000000000000000000070000
>> d3/d4/00000000000000000000000080000
>> ^C
>>
>> I found this in dmesg which seemed relevant:
>>
>> [3552607.241532] EXT4-fs warning (device loop0): ext4_dx_add_entry:2228: inode #79051521: comm make: Directory index full!
>>
>> Searching the kernel source code for that error message led me to this
>> commit:
>> https://github.com/torvalds/linux/commit/e08ac99fa2a25626f573cfa377ef3ddedf2cfe8f
>> So there does indeed appear to be a default limit of 10M entries per
>> directory.
>>
>> Peter
>>
>> On Tue, Apr 10, 2018 at 4:14 PM, Fāng-ruì Sòng <maskray at google.com>
>> wrote:
>>
>>    You may create a sparse filesystem in tmpfs as I did in a previous
>>    reply.
>>
>>    truncate -s 100G 100G
>>    mkfs.ext4 100G
>>
>>    I am pretty sure you have hit some other limit (and in your case likely
>>    the disk space, as you only got 4% left), rather than the hypothetical
>>    "per-directory file limit". Does `dmesg` report anything suspicious?
>>
>>    On Tue, Apr 10, 2018 at 3:55 PM Peter Collingbourne <peter at pcc.me.uk>
>>    wrote:
>>
>>        On Tue, Apr 10, 2018 at 3:23 PM, Fangrui Song <maskray at google.com>
>>        wrote:
>>
>>            On 2018-04-10, Peter Collingbourne wrote:
>>                As I said, that wasn't the limit I was hitting. Here is the
>>                program that I wrote:
>>
>>                #include <stdint.h>
>>                #include <errno.h>
>>                #include <stdio.h>
>>                #include <sys/types.h>
>>                #include <sys/stat.h>
>>                #include <fcntl.h>
>>                #include <unistd.h>
>>                #include <string.h>
>>
>>                int main() {
>>                  for (uint64_t i = 0;; ++i) {
>>                    char buf[256];
>>                    snprintf(buf, 256, "d/%032lx", i);
>>                    if (i % 65536 == 0)
>>                      puts(buf);
>>                    // O_CREAT needs a mode argument; omitting it leaves the file mode unspecified.
>>                    int fd = open(buf, O_CREAT, 0644);
>>                    if (fd == -1) {
>>                      printf("open: %s\n", strerror(errno));
>>                      return 1;
>>                    }
>>                    close(fd);
>>                  }
>>                }
>>
>>                The output ends in:
>>
>>                d/00000000000000000000000000980000
>>                d/00000000000000000000000000990000
>>                d/000000000000000000000000009a0000
>>                d/000000000000000000000000009b0000
>>                d/000000000000000000000000009c0000
>>                d/000000000000000000000000009d0000
>>                d/000000000000000000000000009e0000
>>                d/000000000000000000000000009f0000
>>                open: No space left on device
>>
>>                df:
>>                Filesystem 1K-blocks      Used Available Use% Mounted on
>>                [redacted] 856678580 780069796  33068920  96% [redacted]
>>
>>                df -i:
>>                Filesystem   Inodes    IUsed    IFree IUse% Mounted on
>>                [redacted] 54411264 18824333 35586931   35% [redacted]
>>
>>                Peter
>>
>>            I suspect your case triggered a hashed btree split that would
>>            consume more disk space. Can you try again on a newly created
>>            ext4 filesystem (ensuring it has sufficient space left)?
>>
>>            Your example works well on my machine: I cannot create files in
>>            other directories either. dumpe2fs tells me the inodes are used
>>            up.
>>
>>        I don't have enough disk space on my machine right now. Maybe you
>>        can try creating the file system with -N (some large number)?
>>
>>        Peter
>>
>>                On Tue, Apr 10, 2018 at 1:40 PM, Fangrui Song <maskray at google.com> wrote:
>>
>>                   su
>>                   truncate -s 100G 100G
>>                   mkfs.ext4 100G
>>                   mkdir ext4
>>                   mount 100G ext4
>>                   cd ext4
>>
>>                   mkdir p
>>                   cd p
>>                   python3 -c 'for i in range(6600000): open(str(i),"w").close()'
>>
>>                   It runs out of inodes with some message like:
>>
>>                   OSError: [Errno 28] No space left on device: '6553587'
>>
>>                   umount ext4; dumpe2fs 100G # says the inodes are used up
>>                   ...
>>                   Free inodes:              0
>>                   ...
>>
>>                   On 2018-04-10, Peter Collingbourne wrote:
>>
>>                       No, these were empty files. It wasn't an inode limit
>>                       because I could still create files in other directories.
>>
>>                       Peter
>>
>>                       On Tue, Apr 10, 2018 at 10:35 AM, Fangrui Song <maskray at google.com> wrote:
>>
>>                          On 2018-04-09, Peter Collingbourne wrote:
>>
>>                              Are you sure about that? I'm pretty sure that before
>>                              writing that comment I wrote a small program that
>>                              created lots of files (not subdirectories) in a
>>                              directory until it started getting error messages,
>>                              which started happening at around 6000000 files.
>>
>>                              Peter
>>
>>                          I guess you created a file of 100GiB. The number of
>>                          inodes is roughly 6553600.
>>
>>                          100*1024*1024*1024 / 16384 = 6553600.0 where 16384 is
>>                          the default bytes-per-inode (man mke2fs).
>>
>>                          % truncate -s 100G 100G
>>                          % mkfs.ext4 100G
>>                          % dumpe2fs 100G
>>                          .....
>>                          Inode count:              6553600
>>                          .....
>>
>>                          Each file consumes one inode and the number of files
>>                          in that directory is limited by this factor.
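
As a small worked check of the inode estimate above, here is a sketch that assumes only the default bytes-per-inode ratio of 16384 quoted from the mke2fs man page. The function name is made up for illustration, and real mke2fs may round the count to fit block-group boundaries.

```python
GIB = 1024 ** 3
DEFAULT_BYTES_PER_INODE = 16384  # default ratio quoted above (man mke2fs)


def estimated_inode_count(fs_bytes: int,
                          bytes_per_inode: int = DEFAULT_BYTES_PER_INODE) -> int:
    """Rough inode count: one inode per bytes_per_inode bytes of filesystem."""
    return fs_bytes // bytes_per_inode


print(estimated_inode_count(100 * GIB))  # 6553600
```

The result matches the "Inode count" line in the dumpe2fs output above, which is why a loop creating about 6.5 million files on a default 100G filesystem runs out of inodes.
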
>>
>>                              On Mon, Apr 9, 2018 at 5:12 PM, Fangrui Song via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>
>>                                 Author: maskray
>>                                 Date: Mon Apr  9 17:12:28 2018
>>                                 New Revision: 329648
>>
>>                                 URL: http://llvm.org/viewvc/llvm-project?rev=329648&view=rev
>>                                 Log:
>>                                 [CachePruning] Fix comment about ext4 per-directory file limit. NFC
>>
>>                                 There is a limit on number of subdirectories if dir_nlinks is not
>>                                 enabled (31998), but per-directory number of files is not limited.
>>
>>                                 Modified:
>>                                     llvm/trunk/include/llvm/Support/CachePruning.h
>>
>>                                 Modified: llvm/trunk/include/llvm/Support/CachePruning.h
>>                                 URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/CachePruning.h?rev=329648&r1=329647&r2=329648&view=diff
>>                                 ==============================================================================
>>                                 --- llvm/trunk/include/llvm/Support/CachePruning.h (original)
>>                                 +++ llvm/trunk/include/llvm/Support/CachePruning.h Mon Apr  9 17:12:28 2018
>>                                 @@ -52,9 +52,8 @@ struct CachePruningPolicy {
>>                                    /// the number of files based pruning.
>>                                    ///
>>                                    /// This defaults to 1000000 because with that many files there are
>>                                 -  /// diminishing returns on the effectiveness of the cache, and some file
>>                                 -  /// systems have a limit on how many files can be contained in a directory
>>                                 -  /// (notably ext4, which is limited to around 6000000 files).
>>                                 +  /// diminishing returns on the effectiveness of the cache, and file
>>                                 +  /// systems have a limit on total number of files.
>>                                    uint64_t MaxSizeFiles = 1000000;
>>                                  };
>>
>>                                 _______________________________________________
>>                                 llvm-commits mailing list
>>                                 llvm-commits at lists.llvm.org
>>                                 http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>                              --
>>                              --
>>                              Peter
>>
>>                          --
>>                          宋方睿
>>
>>                       --
>>                       --
>>                       Peter
>>
>>                   --
>>                   宋方睿
>>
>>                --
>>                --
>>                Peter
>>
>>            --
>>            宋方睿
>>
>>        --
>>        --
>>        Peter
>>
>>    --
>>    宋方睿
>>
>> --
>> --
>> Peter
>>
>
> --
> 宋方睿
>



-- 
-- 
Peter