This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: key64_t? ino64_t?

From: Robert Collins <rbcollins at cygwin dot com>
To: Charles Wilson <cygwin at cwilson dot fastmail dot fm>
Cc: cygwin-developers at cygwin dot com
Date: 15 May 2003 09:41:50 +1000
Subject: Re: key64_t? ino64_t?
Organization:
References: <20030511035035.73D475C400@smtp.us2.messagingengine.com> <1052626034.907.67.camel@localhost> <20030514063644.A48805D6AD@smtp.us2.messagingengine.com> <1052903363.824.22.camel@localhost> <20030514232540.DB72C5C81C@smtp.us2.messagingengine.com>

On Thu, 2003-05-15 at 09:25, Charles Wilson wrote:

>  (Naturally,
> there are different mechanisms for key lookup if the keys are stored in a
> tree structure, which don't involve '<', but that's another topic)

Hmm. Trees depend on '<'. Its the minimum needed to build a tree
structure. Tries don't depend on '<', but they do depend on a bit-string
as a key.

> > > But that kills packing (e.g. a key_t object is no longer '72 bits of id
> > > data') because key_t now has vtables, ctor, dtor, etc etc. 
> > 
> > No, none of the above apply.
> 
> Correct.  Unless I'm right about '<'

No - vtables, ctor and dtor don't apply *at all* unless we deliberately
use those features. And operator < doesn't require any of vtables, ctor
or dtor.

> > Aliasing isn't bad. However: we *must* prevent clashes. Probability has
> > nothing to do with it. I can't comment on the Linux implementation: I
> > haven't reviewed it.
> 
> Why "must" we? 

Because if we have a conflict, and we don't resolve it do deliver a
unique key to the client, then a client may ask for a shm segment X, and
get Y instead. Limits to the number of keys are less destructive than
having a shm area that *cannot* be accessed because of a collision.

>  I agree that clashes are bad -- but "guaranteed unique
> keys"?  Does SuS2 really require that?  Because, if you give me a finite
> key bitlength, I can always create a scenario that breaks the
> "guarantee".  I do not think it is possible to always eliminate the
> possibility of clashing -- ALL you can ever do is make it as unlikely an
> event as possible.

You can make it impossible for a collision to occur. My psuedo code does
exactly that.

> I don't like this.  Without the full implementation, I'm just whistling
> in the dark here -- but there are a lot of gotchas in this code.  All of
> them can be coded around, within the Tree/Trie implementation.  But more
> code == more complexity.

What gotchas? Limits to number of keys - sure. What other ones?

> * very low: you make an extremely unlikely event even more unlikely --
> but still not impossible.  I *can* construct a scenario in which this
> tree-based thing fails.

Please, do so. AFAIK such a scenario doesn't exist within the bounds of
valid (according to SuSv2) parameters to ftok().

> Is the value worth the cost (complexity)?
> 
> I don't think so, but that's just my opinion.

Well, I've also put forward my opinion. Until I get time to dig up
SUSv2's reference... 

> > Should give a reasonably performing implementation with space for
> > 16.7Million ftok() calls on unique paths. It will use space linear to
> > the path length of each unique path presented to ftok().
> > 
> > This is what I meant by a lookaside table.
> 
> See, to me this is a tree-based name hash.  You say 'lookaside' and I
> think L2 cache... <g> 

Firstly, neither a trie or a tree is a hash. hash's have collisions and
approaches to resolving that. Storing a hash in a user-visible value
means that the user has to resolve collisions: but they can't, a key is
opaque to the user.

Secondly, this is one implementation of a lookaside table. It's not the
only possible one - you can do a lookaside table using a hash if you
want to. The point is that the key handed out is *not* dependent on the
node in in the table, but on data within that node.

Finally, it's different from a cache: We're not optimising performance
for data available elsewhere, we're creating and  storing unique
associations.

Rob
-- 
GPG key available at: <http://users.bigpond.net.au/robertc/keys.txt>.

Attachment: signature.asc
Description: This is a digitally signed message part

Follow-Ups:
- Re: key64_t? ino64_t?
  - From: Charles Wilson

References:
- Re: key64_t? ino64_t?
  - From: Charles Wilson
- Re: key64_t? ino64_t?
  - From: Robert Collins
- Re: key64_t? ino64_t?
  - From: Charles Wilson
- Re: key64_t? ino64_t?
  - From: Robert Collins
- Re: key64_t? ino64_t?
  - From: Charles Wilson

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]