<div dir="ltr">Okay. So can you please update the comment then?<div><br></div><div>Peter</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Apr 11, 2018 at 10:36 AM, Fāng-ruì Sòng <span dir="ltr"><<a href="mailto:maskray@google.com" target="_blank">maskray@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">truncate -s 100G 100G; mkfs.ext4 -N 30000000 100G<span class=""><br>
mkdir ext4; mount 100G ext4<br>
<br></span>
For a directory using htree without turning on large_dir:<br>
<br>
508   root node entries (root_limit)<br>
510   internal node entries (node_limit)<br>
<br>
For a 40-byte filename, the ext4_dir_entry_2 record takes 48 bytes, so a linear directory block (a leaf node of the htree) can hold at most floor(4096/48) = 85 of them.<br>
The real per-directory entry limit should therefore be 508*510*85 = 22021800.<br>
The limit varies with the average filename length.<br>
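<br>
The arithmetic above can be checked with a short Python sketch. The 508 and 510 fan-out values and the 4096-byte block are the figures quoted in this message; the 8-byte record header and 4-byte alignment are assumptions about the ext4_dir_entry_2 layout, and the function names are made up for illustration.<br>

```python
BLOCK_SIZE = 4096   # bytes per directory block, as used in the message
ROOT_LIMIT = 508    # root node entries quoted above
NODE_LIMIT = 510    # internal node entries quoted above


def dirent_size(name_len):
    """Size of one directory record: assumed 8-byte header plus the name,
    rounded up to a multiple of 4 bytes."""
    return (8 + name_len + 3) // 4 * 4


def htree_capacity(name_len):
    """Upper bound on entries in a two-level htree when every leaf is full."""
    per_leaf = BLOCK_SIZE // dirent_size(name_len)
    return ROOT_LIMIT * NODE_LIMIT * per_leaf


print(dirent_size(40))     # 48
print(htree_capacity(40))  # 22021800
```

This is only an upper bound: it assumes every leaf block is completely full, which the next paragraph explains is not what happens in practice.<br>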
<br>
However, the code does not rebalance the htree, so we cannot create a filename that falls into a leaf node that is already full. The following example demonstrates this; some filenames cannot be created while others can:<br>
<br>
% touch d/0000000000000000000000000000<wbr>000000816a6f<br>
touch: cannot touch 'd/000000000000000000000000000<wbr>0000000816a6f': No space left on device<br>
% touch d/0000000000000000000000000000<wbr>000000816a70<br>
# succeeded<br>
<br>
Another issue is that the space used by the htree never shrinks. I believe we can exceed that limit by using the latest mkfs.ext4 with -O large_dir; see <a href="http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.44.0" rel="noreferrer" target="_blank">http://e2fsprogs.sourceforge.n<wbr>et/e2fsprogs-release.html#1.44<wbr>.0</a><span class="im HOEnZb"><br>
<br>
On 2018-04-10, Peter Collingbourne wrote:<br>
</span><div class="HOEnZb"><div class="h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Here is what I did.<br>
<br>
$ mkfs.ext4 -N 200000000 100G<br>
$ sudo mount /path/to/100G d3<br>
$ ./make<br>
[...]<br>
d3/d/0000000000000000000000009<wbr>00000<br>
d3/d/0000000000000000000000009<wbr>10000<br>
d3/d/0000000000000000000000009<wbr>20000<br>
d3/d/0000000000000000000000009<wbr>30000<br>
d3/d/0000000000000000000000009<wbr>40000<br>
d3/d/0000000000000000000000009<wbr>50000<br>
d3/d/0000000000000000000000009<wbr>60000<br>
d3/d/0000000000000000000000009<wbr>70000<br>
d3/d/0000000000000000000000009<wbr>80000<br>
d3/d/0000000000000000000000009<wbr>90000<br>
d3/d/0000000000000000000000009<wbr>a0000<br>
d3/d/0000000000000000000000009<wbr>b0000<br>
open: No space left on device<br>
$ vi make.c<br>
[ changed %030lx to %029lx ]<br>
$ make make<br>
$ ./make<br>
d3/d/0000000000000000000000000<wbr>0000<br>
open: No space left on device<br>
$ vi make.c<br>
[ changed d3/d to d3/d4 ]<br>
$ make make<br>
$ ./make <br>
d3/d4/000000000000000000000000<wbr>00000<br>
d3/d4/000000000000000000000000<wbr>10000<br>
d3/d4/000000000000000000000000<wbr>20000<br>
d3/d4/000000000000000000000000<wbr>30000<br>
d3/d4/000000000000000000000000<wbr>40000<br>
d3/d4/000000000000000000000000<wbr>50000<br>
d3/d4/000000000000000000000000<wbr>60000<br>
d3/d4/000000000000000000000000<wbr>70000<br>
d3/d4/000000000000000000000000<wbr>80000<br>
^C<br>
<br>
I found this in dmesg which seemed relevant:<br>
<br>
[3552607.241532] EXT4-fs warning (device loop0): ext4_dx_add_entry:2228: inode<br>
#79051521: comm make: Directory index full!<br>
<br>
Searching the kernel source code for that error message led me to this commit:<br>
<a href="https://github.com/torvalds/linux/commit/e08ac99fa2a25626f573cfa377ef3ddedf2cfe8f" rel="noreferrer" target="_blank">https://github.com/torvalds/linux/commit/e08ac99fa2a25626f573cfa377ef3ddedf2cfe8f</a><br>
So there does indeed appear to be a default limit of 10M entries per directory.<br>
<br>
Peter<br>
<br>
On Tue, Apr 10, 2018 at 4:14 PM, Fāng-ruì Sòng <<a href="mailto:maskray@google.com" target="_blank">maskray@google.com</a>> wrote:<br>
<br>
   You may create a sparse filesystem in tmpfs as I did in a previous reply.<br>
<br>
   truncate -s 100G 100G<br>
   mkfs.ext4 100G<br>
<br>
   I am pretty sure you have hit some other limit (in your case, likely<br>
   disk space, since you have only 4% left) rather than the hypothetical<br>
   "per-directory file limit". Does `dmesg` report anything suspicious?<br>
<br>
   On Tue, Apr 10, 2018 at 3:55 PM Peter Collingbourne <<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>><br>
   wrote:<br>
<br>
       On Tue, Apr 10, 2018 at 3:23 PM, Fangrui Song <<a href="mailto:maskray@google.com" target="_blank">maskray@google.com</a>><br>
       wrote:<br>
<br>
           On 2018-04-10, Peter Collingbourne wrote:<br>
               As I said, that wasn't the limit I was hitting. Here is the program that I wrote:<br>
<br>
               #include <stdint.h><br>
               #include <errno.h><br>
               #include <stdio.h><br>
               #include <sys/types.h><br>
               #include <sys/stat.h><br>
               #include <fcntl.h><br>
               #include <unistd.h><br>
               #include <string.h><br>
<br>
               int main() {<br>
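                 for (uint64_t i = 0;; ++i) {<br>
                   char buf[256];<br>
                   /* 32-digit zero-padded hex name inside directory d/ */<br>
                   snprintf(buf, sizeof buf, "d/%032lx", (unsigned long)i);<br>
                   if (i % 65536 == 0)<br>
                     puts(buf);<br>
                   /* O_CREAT requires an explicit mode argument */<br>
                   int fd = open(buf, O_CREAT | O_WRONLY, 0644);<br>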
                   if (fd == -1) {<br>
                     printf("open: %s\n", strerror(errno));<br>
                     return 1;<br>
                   }<br>
                   close(fd);<br>
                 }<br>
               }<br>
<br>
               The output ends in:<br>
<br>
               d/000000000000000000000000009<wbr>80000<br>
               d/000000000000000000000000009<wbr>90000<br>
               d/000000000000000000000000009<wbr>a0000<br>
               d/000000000000000000000000009<wbr>b0000<br>
               d/000000000000000000000000009<wbr>c0000<br>
               d/000000000000000000000000009<wbr>d0000<br>
               d/000000000000000000000000009<wbr>e0000<br>
               d/000000000000000000000000009<wbr>f0000<br>
               open: No space left on device<br>
<br>
               df:<br>
               Filesystem  1K-blocks       Used  Available  Use%  Mounted on<br>
               [redacted]  856678580  780069796   33068920   96%  [redacted]<br>
<br>
               df -i:<br>
               Filesystem    Inodes     IUsed     IFree  IUse%  Mounted on<br>
               [redacted]  54411264  18824333  35586931    35%  [redacted]<br>
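<br>
               The two df reports above can also be gathered from a script, which helps tell an exhausted filesystem apart from a per-directory limit. A minimal sketch using os.statvfs; the function names and the wording of the three outcomes are hypothetical, not part of any tool mentioned in this thread.<br>

```python
import os


def free_space_report(path):
    """Return free bytes and inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        "free_bytes": st.f_bavail * st.f_frsize,  # space available to non-root users
        "free_inodes": st.f_favail,               # inodes available to non-root users
        "total_inodes": st.f_files,
    }


def likely_cause(report):
    """Rough guess at why a create failed with ENOSPC (illustrative only)."""
    if report["free_inodes"] == 0:
        return "inodes exhausted"
    if report["free_bytes"] == 0:
        return "blocks exhausted"
    return "neither; suspect a per-directory limit"


print(likely_cause(free_space_report(".")))
```

               With the numbers shown above (space and inodes both still free), this sketch would point at a per-directory limit rather than at the filesystem as a whole.<br>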
<br>
               Peter<br>
<br>
           I suspect your case triggered a hashed btree split, which would<br>
           consume more disk space. Can you try again on a newly created ext4<br>
           filesystem (ensuring it has sufficient space left)?<br>
<br>
           Your example works well on my machine: I cannot create files in<br>
           other directories either. dumpe2fs tells me the inodes are used up.<br>
<br>
       I don't have enough disk space on my machine right now. Maybe you can<br>
       try creating the file system with -N (some large number)?<br>
<br>
       Peter<br>
<br>
               On Tue, Apr 10, 2018 at 1:40 PM, Fangrui Song <<br>
               <a href="mailto:maskray@google.com" target="_blank">maskray@google.com</a>> wrote:<br>
<br>
                  su<br>
                  truncate -s 100G 100G<br>
                  mkfs.ext4 100G<br>
                  mkdir ext4<br>
                  mount 100G ext4<br>
                  cd ext4<br>
<br>
                  mkdir p<br>
                  cd p<br>
                  python3 -c 'for i in range(6600000): open(str(i), "w").close()'<br>
<br>
                  It runs out of inodes with some message like:<br>
<br>
                  OSError: [Errno 28] No space left on device: '6553587'<br>
<br>
                  umount ext4; dumpe2fs 100G # says the inodes are used up<br>
                  ...<br>
                  Free inodes:              0<br>
                  ...<br>
<br>
                  On 2018-04-10, Peter Collingbourne wrote:<br>
<br>
                      No, these were empty files. It wasn't an inode limit because I could<br>
                      still create files in other directories.<br>
<br>
                      Peter<br>
<br>
                      On Tue, Apr 10, 2018 at 10:35 AM, Fangrui Song <<br>
               <a href="mailto:maskray@google.com" target="_blank">maskray@google.com</a>><br>
                      wrote:<br>
<br>
                         On 2018-04-09, Peter Collingbourne wrote:<br>
<br>
                             Are you sure about that? I'm pretty sure that before writing that<br>
                             comment I wrote a small program that created lots of files (not<br>
                             subdirectories) in a directory until it started getting error<br>
                             messages, which started happening at around 6000000 files.<br>
<br>
                             Peter<br>
<br>
                         I guess you created a 100 GiB filesystem image. The number of<br>
                         inodes is roughly 6553600:<br>
<br>
                         100*1024*1024*1024 / 16384 = 6553600, where 16384 is the default<br>
                         bytes-per-inode (man mke2fs).<br>
<br>
                         % truncate -s 100G 100G<br>
                         % mkfs.ext4 100G<br>
                         % dumpe2fs 100G<br>
                         .....<br>
                         Inode count:              6553600<br>
                         .....<br>
<br>
                         Each file consumes one inode, so the number of files in that<br>
                         directory is limited by the inode count.<br>
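<br>
                         The inode estimate above can be reproduced as follows. A minimal sketch, assuming the default bytes-per-inode ratio of 16384 cited from man mke2fs; the helper name is hypothetical.<br>

```python
GIB = 1024 ** 3


def default_inode_count(fs_bytes, bytes_per_inode=16384):
    """Estimate the inode count for a filesystem of fs_bytes bytes,
    assuming one inode is allocated per bytes_per_inode bytes."""
    return fs_bytes // bytes_per_inode


print(default_inode_count(100 * GIB))  # 6553600
```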
<br>
                             On Mon, Apr 9, 2018 at 5:12 PM, Fangrui Song via llvm-commits <<br>
                             <a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>> wrote:<br>
<br>
                                Author: maskray<br>
                                Date: Mon Apr  9 17:12:28 2018<br>
                                New Revision: 329648<br>
<br>
                                URL: <a href="http://llvm.org/viewvc/llvm-project?rev=329648&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=329648&view=rev</a><br>
                                Log:<br>
                                [CachePruning] Fix comment about ext4 per-directory file limit. NFC<br>
<br>
                                There is a limit on number of subdirectories if dir_nlinks is not<br>
                                enabled (31998), but per-directory number of files is not limited.<br>
<br>
                                Modified:<br>
                                    llvm/trunk/include/llvm/Support/CachePruning.h<br>
<br>
                                Modified: llvm/trunk/include/llvm/Support/CachePruning.h<br>
                                URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/CachePruning.h?rev=329648&r1=329647&r2=329648&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Support/CachePruning.h?rev=329648&r1=329647&r2=329648&view=diff</a><br>
                                ==============================================================================<br>
                                --- llvm/trunk/include/llvm/Support/CachePruning.h (original)<br>
                                +++ llvm/trunk/include/llvm/Support/CachePruning.h Mon Apr  9 17:12:28 2018<br>
                                @@ -52,9 +52,8 @@ struct CachePruningPolicy {<br>
                                   /// the number of files based pruning.<br>
                                   ///<br>
                                   /// This defaults to 1000000 because with that many files there are<br>
                                -  /// diminishing returns on the effectiveness of the cache, and some file<br>
                                -  /// systems have a limit on how many files can be contained in a directory<br>
                                -  /// (notably ext4, which is limited to around 6000000 files).<br>
                                +  /// diminishing returns on the effectiveness of the cache, and file<br>
                                +  /// systems have a limit on total number of files.<br>
                                   uint64_t MaxSizeFiles = 1000000;<br>
                                 };<br>
<br>
                                _______________________________________________<br>
                                llvm-commits mailing list<br>
                                <a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a><br>
                                <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br>
<br>
</blockquote>
<br></div></div><span class="HOEnZb"><font color="#888888">
-- <br>
宋方睿<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">-- <div>Peter</div></div></div>
</div>