This is the mail archive of the
cygwin-patches
mailing list for the Cygwin project.
Re: memmem issues
- From: Eric Blake <ebb9 at byu dot net>
- To: cygwin-patches at cygwin dot com
- Date: Wed, 19 Dec 2007 21:01:17 -0700
- Subject: Re: memmem issues
- References: <loom.20071219T210928-910@post.gmane.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 12/19/2007 2:22 PM:
> memmem isn't standardized, but even so, it has some bugs based on comparison
> with strstr.
>
> memmem(haystack,len,"",0) should return haystack, not NULL. This should be an
> easy one-line fix.
>
> memmem currently has O(m*n) worst-case complexity (quadratic, when haystack and
> needle are both long). But with the Knuth-Morris-Pratt algorithm and a memory
> allocation of size n, this could be trimmed to O(m+n) (linear).
Here's a patch:
2007-12-20 Eric Blake <ebb9@byu.net>
* libc/memmem.cc (memmem): Fix bug when searching for empty
string. Document O(n^2) bug.
- --
Don't work too hard, make some time for fun as well!
Eric Blake ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHaekM84KuGfSFAYARArgXAJ9lz/Rk3GnD4i5q4Jy557TzVQoGoQCfb61v
4TUb18LiAQIdCasBx9nwFdw=
=5Vbo
-----END PGP SIGNATURE-----
Index: libc/memmem.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/libc/memmem.cc,v
retrieving revision 1.1
diff -u -p -r1.1 memmem.cc
--- libc/memmem.cc 8 Nov 2005 22:08:39 -0000 1.1
+++ libc/memmem.cc 20 Dec 2007 03:56:35 -0000
@@ -45,8 +45,8 @@ memmem (const void *l, size_t l_len,
const char *cs = (const char *)s;
/* we need something to compare */
- if (l_len == 0 || s_len == 0)
- return NULL;
+ if (s_len == 0)
+ return l;
/* "s" must be smaller or equal to "l" */
if (l_len < s_len)
@@ -59,6 +59,11 @@ memmem (const void *l, size_t l_len,
/* the last position where its possible to find "s" in "l" */
last = (char *) cl + l_len - s_len;
+ /* FIXME - this algorithm is worst-case O(l_len*s_len), but using
+ Knuth-Morris-Pratt would be O(l_len+s_len) at the expense of a
+ memory allocation of s_len bytes. Consider rewriting this to
+ swap over the KMP if the first few iterations fail, and back to
+ this if KMP can't allocate enough memory. */
for (cur = (char *) cl; cur <= last; cur++)
if (cur[0] == cs[0] && memcmp (cur, cs, s_len) == 0)
return cur;