[llvm-dev] monorepo: bad performance when using gitk / git log
    James Y Knight via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Wed Mar 27 12:37:43 PDT 2019
    
    
  
The problem here seems to be due to the combination of specifying
--parents and specifying a pathname to filter by. I can certainly
reproduce a _remarkable_ slowness with that combination.
On my machine:
$ time git log --parents --oneline origin/master > /dev/null
real    0m4.001s
$ time git log origin/master -- llvm/test/CodeGen/Generic/bswap.ll > /dev/null
real    0m5.332s
$ time git log --parents --oneline origin/master -- llvm/test/CodeGen/Generic/bswap.ll > /dev/null
real    2m48.944s
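
A minimal sketch for repeating the three measurements above in one go. The REPO, REV and FILE variables and the time_cmd helper are hypothetical names introduced here for illustration, not part of git; the git options are the ones used in the commands above, plus `git -C` to select the checkout. Timing uses whole seconds from `date +%s`, which is coarse but enough to tell seconds from minutes.

```shell
#!/bin/sh
# Sketch: time `git log` with --parents, with a path filter, and with both.
# REPO, REV and FILE are placeholders; adjust them for your checkout.
REPO=${REPO:-.}
REV=${REV:-origin/master}
FILE=${FILE:-llvm/test/CodeGen/Generic/bswap.ll}

# Run a command, discard its output, and print "<label>: <N>s" (whole seconds).
time_cmd() {
    label=$1
    shift
    start=$(date +%s)
    "$@" > /dev/null 2>&1 || echo "warning: '$label' exited non-zero" >&2
    end=$(date +%s)
    printf '%s: %ss\n' "$label" "$((end - start))"
}

# Only run the comparison when REPO really is a git work tree.
if git -C "$REPO" rev-parse --git-dir > /dev/null 2>&1; then
    time_cmd "parents only"     git -C "$REPO" log --parents --oneline "$REV"
    time_cmd "path only"        git -C "$REPO" log "$REV" -- "$FILE"
    time_cmd "parents and path" git -C "$REPO" log --parents --oneline "$REV" -- "$FILE"
fi
```

For example, `REPO=~/src/llvm-project sh compare.sh` (the script name and path are assumptions) should print one line per case, making it easy to see whether only the last combination is slow on a given git version.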
That said, I use gitk frequently, and had not noticed performance issues.
But, I'd never tried invoking it with a path on the command-line, only with
ref names, so it's not hitting the bad case.
Nor have I noticed issues with git log, but again, I've never run it with
--parents, so I don't hit this bad case.
Maybe worth reporting as a possible bug to git? Surely whatever algorithm
it's using shouldn't be _this_ slow.
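
To put a number on "this slow", here is a small sketch that converts the three `real` timings from my machine into seconds and computes the slowdown of the combined case. The to_seconds and ratio helpers are hypothetical names made up for this example; the input values are the ones reported above.

```shell
# Convert a `time`-style "XmY.YYYs" value to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations given in seconds, to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

both=$(to_seconds 2m48.944s)         # --parents plus a path
parents_only=$(to_seconds 0m4.001s)  # --parents, no path
path_only=$(to_seconds 0m5.332s)     # path, no --parents

echo "vs --parents only: $(ratio "$both" "$parents_only")x"
echo "vs path only:      $(ratio "$both" "$path_only")x"
```

With these inputs the combined case comes out at roughly 42 times slower than --parents alone and roughly 32 times slower than the path filter alone.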
On Wed, Mar 27, 2019 at 9:23 AM Björn Pettersson A via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi!
>
> Anyone else experiencing performance problems when using the new monorepo?
>
> My experience is that the performance of gitk (and git log) is sometimes
> really bad when working in the monorepo.
>
> I’ve mainly seen it when using gitk on specific files/directories, but
> since gitk seems to be using “git log --no-color -z --pretty=raw
> --show-notes --parents --boundary HEAD -- <file>”, it is possible to observe
> the same thing when using git log.
>
> The problem can be seen when creating a brand new commit (with a new file):
>
> bash-4.1$ git clone https://github.com/llvm/llvm-project.git llvm-project
> bash-4.1$ cd llvm-project
> bash-4.1$ touch dummy
> bash-4.1$ git add dummy
> bash-4.1$ git commit -m "test"
> [master 6539b74dd0e] test
> 1 file changed, 0 insertions(+), 0 deletions(-)
> create mode 100644 llvm/dummy
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- dummy > /dev/null
> 198.37user 0.40system 3:18.67elapsed 100%CPU (0avgtext+0avgdata 696456maxresident)k
> 0inputs+0outputs (0major+175765minor)pagefaults 0swaps
>
> The slowdown also shows up when examining older files. Here are some tests
> using the monorepo:
>
> bash-4.1$ git clone https://github.com/llvm/llvm-project.git llvm-project
> bash-4.1$ cd llvm-project
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD > /dev/null
> 5.15user 0.26system 0:05.42elapsed 99%CPU (0avgtext+0avgdata 220344maxresident)k
> 0inputs+0outputs (0major+56131minor)pagefaults 0swaps
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- README.md > /dev/null
> 155.20user 0.34system 2:35.45elapsed 100%CPU (0avgtext+0avgdata 636744maxresident)k
> 0inputs+0outputs (0major+160862minor)pagefaults 0swaps
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- llvm/CODE_OWNERS.TXT > /dev/null
> 55.48user 0.34system 0:55.80elapsed 100%CPU (0avgtext+0avgdata 690124maxresident)k
> 0inputs+0outputs (0major+174196minor)pagefaults 0swaps
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- llvm/test/CodeGen/Generic/bswap.ll > /dev/null
> 192.97user 0.33system 3:13.19elapsed 100%CPU (0avgtext+0avgdata 696496maxresident)k
> 0inputs+0outputs (0major+176003minor)pagefaults 0swaps
>
> Here are the same tests using the old llvm repo (it has no README.md, so I
> skipped that test):
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD > /dev/null
> 2.72user 0.12system 0:02.84elapsed 99%CPU (0avgtext+0avgdata 136628maxresident)k
> 0inputs+0outputs (0major+36354minor)pagefaults 0swaps
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- CODE_OWNERS.TXT > /dev/null
> 2.74user 0.19system 0:02.93elapsed 99%CPU (0avgtext+0avgdata 344756maxresident)k
> 0inputs+0outputs (0major+88975minor)pagefaults 0swaps
>
> bash-4.1$ /usr/bin/time git log --no-color -z --pretty=raw --show-notes --parents --boundary HEAD -- test/CodeGen/Generic/bswap.ll > /dev/null
> 3.76user 0.19system 0:03.96elapsed 99%CPU (0avgtext+0avgdata 380416maxresident)k
> 0inputs+0outputs (0major+98218minor)pagefaults 0swaps
>
> The example with test/CodeGen/Generic/bswap.ll indicates that it can take
> roughly 193/4 ≈ 48 times longer to open gitk (or run git log) on a file when
> using the monorepo(!?!?).
>
> I’m not so familiar with the inner details of git. Could this be caused by
> a bad repack of the llvm-project repo or something?
>
> Or is it just that we now squeeze so many commits into the same repo that
> I should expect the performance to get even worse in the future?
>
> The figures above are from git 2.14.1, but I’ve also tried 2.20.0
> with similar results.
>
> Regards,
> Björn