This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Slow stat(2) performance on ClearCase MVFS
- From: Earl Chew <earl_chew at agilent dot com>
- To: cygwin-developers at cygwin dot com
- Date: Sat, 18 Apr 2009 12:03:26 -0700
- Subject: Slow stat(2) performance on ClearCase MVFS
For quite a while now I've seen noticeably poor cygwin performance
on ClearCase MVFS drives with recursive commands like:
o grep -r
o find .
o rm -r
For example, executing 'find . -name "*.exe"' on a particular
MVFS directory tree here takes 8 mins (480 secs), but using
the strategy outlined in result 6 below reduces the time to 32 secs.
Some digging on 1.5.25-15 narrowed the issue down to the
performance of stat(2).
Some questions:
o Does it make sense to replace GetFileAttributes() with
FindFirstFile() in all cases ?
o Is it possible for fhandler_base::fstat_fs() to always
use fstat_by_name() only, and avoid using open_fs() and
fstat_by_handle() ?
Here are timings using some simple benchmarking programs. Each
program has a simple 10000 iteration loop:
  GetFileAttributes   Perform GetFileAttributes(argv[1])
  FindFirstFile       Perform FindFirstFile(argv[1]), FindClose()
  stat                Perform stat(argv[1])
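
A minimal sketch of what the stat loop might look like, for anyone who
wants to reproduce the measurement. This is not the original benchmark
source: the function name time_stat_loop() is made up for illustration,
and it times itself with clock_gettime() (assuming the platform provides
it) rather than relying on an external time(1).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: call stat() on `path` `iterations` times.
   Returns the elapsed wall-clock time in seconds, or -1.0 if any
   stat() call fails (for example, because the path does not exist). */
double time_stat_loop(const char *path, int iterations)
{
    struct timespec start, end;
    struct stat sb;

    assert(path != NULL && iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; ++i) {
        if (stat(path, &sb) != 0)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_stat_loop(argv[1], 10000) from main() would correspond to
the 10000 iteration loop described above.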
The results are measured in elapsed seconds using cygwin time(1)
on the following files:
  NTFS   c:/WINDOWS/system32/drivers/etc/hosts
  MVFS   v:/cerberus/daytona/lib/Makefile.mk
                              NTFS    MVFS
  1. GetFileAttributes        0.66    10.5
  2. FindFirstFile            0.33     1.2
  3. stat(MSVC)               0.37     1.2
  4. stat(CYGWIN-1.5.25)      1.47    20.3
  5. stat(no open)            2.4     11.5
  6. stat(no attr, open)      2.0      2.3
Results 2 and 3 show that the Win32 and MSVC functions perform
well, but that ClearCase MVFS can be expected to be three to four
times slower than native NTFS.
Result 1 shows that GetFileAttributes is nearly nine times
slower than FindFirstFile on MVFS, and twice as slow on NTFS.
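
The ratios behind these statements can be checked with a few lines of C.
This is only a sketch: the timings are copied from results 1 to 3 above,
while the names slowdown(), print_ratios() and struct timing are
invented here for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Elapsed seconds for 10000 iterations, copied from results 1 to 3. */
struct timing {
    const char *name;
    double ntfs_secs;
    double mvfs_secs;
};

static const struct timing timings[] = {
    { "GetFileAttributes", 0.66, 10.5 },
    { "FindFirstFile",     0.33,  1.2 },
    { "stat(MSVC)",        0.37,  1.2 },
};

/* How many times longer `slow_secs` is than `fast_secs`. */
double slowdown(double slow_secs, double fast_secs)
{
    assert(fast_secs > 0.0);
    return slow_secs / fast_secs;
}

/* Print the MVFS/NTFS ratio for each call, then compare
   GetFileAttributes against FindFirstFile on each filesystem. */
void print_ratios(void)
{
    size_t n = sizeof timings / sizeof timings[0];

    for (size_t i = 0; i < n; ++i)
        printf("%-18s MVFS/NTFS = %.2f\n", timings[i].name,
               slowdown(timings[i].mvfs_secs, timings[i].ntfs_secs));

    printf("GetFileAttributes/FindFirstFile on NTFS = %.2f\n",
           slowdown(timings[0].ntfs_secs, timings[1].ntfs_secs));
    printf("GetFileAttributes/FindFirstFile on MVFS = %.2f\n",
           slowdown(timings[0].mvfs_secs, timings[1].mvfs_secs));
}
```

The same helper applies to the stat(2) rows: slowdown(20.3, 11.5)
compares result 4 with result 5 on MVFS, and slowdown(20.3, 2.3)
compares result 4 with result 6.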
Result 4 gives a baseline performance for stat(2) on a vanilla
1.5.25-15 system.
Result 5 shows MVFS performance nearly doubling over result 4 by forcing
fstat_by_name() instead of fstat_by_handle():
--- fhandler_disk_file.cc.orig 2009-04-18 10:26:34.937500000 -0700
+++ fhandler_disk_file.cc 2009-04-18 10:27:04.484375000 -0700
@@ -356,7 +356,7 @@
return fstat_by_name (buf);
query_open (query_stat_control);
}
- if (!(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)
+ if ((oret = 0) && !(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)
Result 6 shows a nearly ninefold improvement in MVFS performance over
result 4 by forcing fstat_by_name() and also replacing
GetFileAttributes() with a version built on FindFirstFile():
--- path.cc.orig 2009-04-18 11:18:49.812500000 -0700
+++ path.cc 2009-04-18 11:19:01.625000000 -0700
@@ -4299,3 +4299,24 @@
strcpy (bs, ".");
return buf;
}
+
+extern "C"
+DWORD GetFileAttributes (const TCHAR* path)
+{
+ for (const TCHAR* p = path; *p; ++p)
+ if (*p == '*' || *p == '?')
+ return INVALID_FILE_ATTRIBUTES;
+
+ WIN32_FIND_DATA findbuf;
+
+ HANDLE findhandle = FindFirstFile(path, &findbuf);
+
+ if (findhandle != INVALID_HANDLE_VALUE)
+ {
+ FindClose(findhandle);
+
+ return findbuf.dwFileAttributes;
+ }
+
+ return INVALID_FILE_ATTRIBUTES;
+}
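
The wildcard check at the top of the replacement matters because
FindFirstFile() treats '*' and '?' in its argument as a search pattern,
so without the guard a wildcard path would return the attributes of
whichever entry matched first. A standalone sketch of just that guard,
using plain char strings and a made-up name has_wildcard() (neither is
part of the patch), looks like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the guard loop in the patch:
   returns true if `path` contains a character that FindFirstFile()
   would interpret as a wildcard. */
bool has_wildcard(const char *path)
{
    assert(path != NULL);

    for (const char *p = path; *p != '\0'; ++p)
        if (*p == '*' || *p == '?')
            return true;

    return false;
}
```

One caveat relevant to the first question above: FindFirstFile() is
documented not to accept root directories or paths ending in a trailing
backslash, so a replacement used "in all cases" would presumably need
to handle those paths separately.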