This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: slow find . -type d


Pavol, Larry,

My guess (yeah, I know) would be the "stat" call. Find without any file 
type-specific constraints like "-type d" or "-type f" only has to read 
directory entries. Nor do constraints such as "-name nameGlob" or even 
"-regex nameRegex" necessitate a stat call on every directory entry found, 
but "-type typeCode" does.
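
To make the distinction concrete, here is a minimal Python sketch (not find's actual implementation) contrasting a name test, which can be answered from the directory listing alone, with a type test, which needs one lstat() per entry. The function names and the demo tree are made up for illustration.

```python
import fnmatch
import os
import stat
import tempfile


def names_matching(directory, pattern):
    """Name test: needs only the names returned by reading the directory."""
    return sorted(n for n in os.listdir(directory) if fnmatch.fnmatch(n, pattern))


def subdirectories(directory):
    """Type test: needs one lstat() per entry to learn the file type."""
    result = []
    for name in os.listdir(directory):
        info = os.lstat(os.path.join(directory, name))  # extra call per entry
        if stat.S_ISDIR(info.st_mode):
            result.append(name)
    return sorted(result)


def build_demo_tree():
    """Create a small throwaway tree: two directories and three files."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "src"))
    os.mkdir(os.path.join(root, "doc"))
    for name in ("a.txt", "b.txt", "c.log"):
        with open(os.path.join(root, name), "w") as handle:
            handle.write("x\n")
    return root
```

Calling subdirectories() on a large tree issues one lstat() per entry, so whatever a single stat costs on the platform is multiplied by the number of entries.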

Stat in Cygwin is not blindingly fast, presumably (yeah, I know) in part because 
of the multiple native system calls needed to get all the file information 
required to construct the POSIX-style "inode" information that you get in a 
single system call in a system with a native stat(2) call.

The worst part seems to be the need to read the first couple of bytes of 
the file to look for a "#!" header in order to synthesize an execute bit. 
If my understanding is correct, this only happens on FAT file system volumes 
and on NTFS when the CYGWIN variable does not include "ntsec".
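
As a rough illustration of why that check is costly, the sketch below (Python, and only an assumption about the general idea, not Cygwin's actual code) opens each file and reads its first two bytes to decide whether to report it as executable. The helper names are hypothetical.

```python
import os
import tempfile


def looks_executable(path):
    """Guess an execute bit by reading the first two bytes of the file."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"#!"


def write_temp(content):
    """Write bytes to a fresh temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(content)
    return path
```

The point is that a metadata query turns into an open, a read and a close on every regular file, which is far more work than looking up attributes alone.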

The odd thing is that on my system, comparing a FAT32 volume with an NTFS 
volume (on identical disk drives attached via the same SCSI adaptor) 
containing two identical hierarchies (a directory hierarchy that contains 
207 files in 10 directories--admittedly not big enough to make a very good 
test), the FAT32 volume consistently outperforms the same hierarchy on an 
NTFS volume by a small margin. Based on the "Size on disk" reported in the 
Properties dialog in Windows Explorer, both volumes are using the same 
allocation granularity (though the FAT32 volume is much smaller). Since I 
made the FAT32 copy in a formerly empty volume (that I use as a staging 
area for burning CDs), there is presumably more fragmentation in the NTFS 
volume, which gets heavy use and hasn't been de-fragmented for about a month.

Performance analysis is never simple.

Anyway, I doubt there's a bug afoot here nor is it likely there's any 
culpably poor programming involved. It's probably not going to be easy ever 
to match the native ("dir /ad/b/s") command's performance.


By the way, the non-built-in time (i.e., /bin/time) routinely shows CPU 
utilization anywhere from 101% to 116% (as well as some more reasonable 
numbers like 96%).
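
For reference, here is a small Python sketch of how such a percentage is commonly derived, assuming it is simply (user + system) / elapsed. That formula is an assumption about the timing tool, not something confirmed in this thread. The first set of figures is taken from the "find . -type d" run quoted below; the second is invented to show how the ratio can exceed 100%.

```python
def cpu_percent(user_s, system_s, elapsed_s):
    """CPU utilization as (user + system) / elapsed, in percent."""
    return 100.0 * (user_s + system_s) / elapsed_s


# Figures from the quoted "find . -type d" run:
# 0.39s user, 1.12s system, 1.617s total.
type_d = cpu_percent(0.39, 1.12, 1.617)

# Invented figures: CPU time measured slightly higher than elapsed time.
over = cpu_percent(0.50, 0.55, 1.0)

print(f"{type_d:.1f}% {over:.1f}%")
```

Under this formula, any reading above 100% on a single CPU means the measured CPU time exceeded the measured elapsed time, which would point at the measurements (for example, clocks with different granularity) rather than at the program being timed. That explanation is a guess.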


Randall Schulz
Mountain View, CA USA


At 20:22 2002-03-23, Larry Hall (RFK Partners, Inc) wrote:
>At 11:08 PM 3/23/2002, Pavol Juhas wrote:
> >Hello,
> >
> >I am using find 4.1 under Cygwin 1.3.10 / WinXP.  I have observed that
> >     find . -type d
> >is about 3 times slower than find . and about 100 times slower than
> >     cmd.exe /C dir /B/S/AD
> >
> >$ time find . -type d > /dev/null
> >   0.39s user 1.12s system 93% cpu 1.617 total
> >$ time find . > /dev/null
> >   0.20s user 0.47s system 105% cpu 0.632 total
> >$ time CMD.EXE /c dir /ad/b/s  > /dev/null
> >   0.01s user 0.01s system 48% cpu 0.062 total
> >
> >Any ideas what is going on?
>
>
>Not specifically, no.  But there hasn't been a lot of general performance
>analysis done on Cygwin.  If you can localize the area of Cygwin which is
>causing much of the delay, I'm sure the list will be interested in the
>results and even more so in a patch! ;-)
>
>
>
>Larry Hall                              lhall@rfk.com
>RFK Partners, Inc.                      http://www.rfk.com
>838 Washington Street                   (508) 893-9779 - RFK Office
>Holliston, MA 01746                     (508) 893-9889 - FAX
>
>
>--
>Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>Bug reporting:         http://cygwin.com/bugs.html
>Documentation:         http://cygwin.com/docs.html
>FAQ:                   http://cygwin.com/faq/



