This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Cygwin multithreading performance


Kacper Michajlow wrote:
Thanks for the reply, and sorry for not being specific enough before. 'git
gc' is a driver which runs various git commands to clean up the
repository, though I'm mostly concerned about the code I linked.
Instead of 'git gc' it is better to test 'git repack -a -f' directly,
preferably on a repository where it takes some time.
'git://sourceware.org/git/newlib-cygwin.git' is a good test case.
Although the performance hit is bigger with bigger repositories, this is
a good example to see what's going on.

I appreciate the more specific info on how you experience the issue.

I'm well aware that forking on Windows is problematic, but I'm
explicitly interested in the parallelized part of execution. I don't care
about forks; while they slow things down too, they are not used in the
compression process, which is parallelized over all CPU threads.
Each command is indeed forked, but I'm only interested in the
pack-objects part, hence the code I linked.

OK, we're on the same page now :).

$ strace --mask=debug+syscall+thread -o git.strace git repack -a -f
Counting objects: 156690, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (154730/154730), done.
Writing objects: 100% (156690/156690), done.
Total 156690 (delta 123449), reused 33146 (delta 0)

$ grep "fork(" git.strace
   559   53728 [main] git 24340 fork: 24368 = fork()
   465   54022 [main] git 24368 fork: 0 = fork()

Only two forks were created, while during compression only 25% CPU was
used (on a big repo like the Linux kernel it doesn't exceed 8%). With native
git the same workload easily uses 95-100% CPU and is therefore a lot
faster.

I was able to reproduce your issue using a cloned newlib-cygwin repo. On a 6-CPU machine I saw max 36% CPU utilization during the compression phase. ProcessExplorer showed all 6 threads were getting CPU time (to varying degrees) and when suspended they were always trying to acquire a mutex. I'd like to run some more straces and perhaps investigate with some other tools before saying more. This may take a while.

What I've done so far is install the git-debuginfo and cygwin-debuginfo packages so that I can convert hex RIP addresses to line numbers. I've run the testcase under gdb so I can interrupt at random times and poke around. The straces from this testcase are ginormous, so I hope I can figure out a better way to see why the compression threads aren't CPU-bound like they should be. If you don't already know, 'strace --help' shows the available mask values. The threads are each writing to disk, so I wonder if there's some unintentional serialization going on somewhere, but I don't know yet how I could verify that theory.
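
One way to cut a huge strace log down to size might be to tally its lines per thread and per function before reading any of it. The sketch below is only an illustration in Python, not something that was run for this report. It assumes every log line has the same shape as the two fork lines shown above (two numeric columns, a bracketed thread name, program name, pid, then "function:"), and that the first column is the time delta in microseconds since the previous log line. The name summarize is made up for this example.

```python
import re
import sys
from collections import Counter

# Assumed line shape, taken from the fork lines quoted above:
#   <delta> <elapsed> [<thread>] <program> <pid> <function>: <message>
LINE_RE = re.compile(
    r"^\s*(?P<delta>\d+)\s+(?P<elapsed>\d+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<prog>\S+)\s+(?P<pid>\d+)\s+(?P<func>\S+?):(?:\s|$)"
)


def summarize(lines):
    """Return (line counts, summed first-column values) keyed by (thread, function).

    Lines that do not match the assumed shape are skipped silently.
    """
    counts = Counter()
    delta_us = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        key = (m.group("thread"), m.group("func"))
        counts[key] += 1
        # Assumption: column one is the delta since the previous log line
        # (from any thread), so this sum is a rough hint, not real CPU time.
        delta_us[key] += int(m.group("delta"))
    return counts, delta_us


if __name__ == "__main__":
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as fh:
            counts, delta_us = summarize(fh)
        # Show the 20 busiest (thread, function) pairs by line count.
        for key, n in counts.most_common(20):
            print(f"{n:8d} {delta_us[key]:12d} [{key[0]}] {key[1]}")
```

Saved under any name and run with git.strace as its only argument, it prints the 20 most frequent (thread, function) pairs. If one lock-related function dominated every compression thread, that would be consistent with the serialization theory, though it would not prove it.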

..mark



