This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Signals and the such-like


[Sorry: Long email]

About a week ago I discovered a race condition in the UNIX domain
socket emulation in cygwin.  I've got a patch for this that works
(and fixes several other small problems) bar one *minor* issue and
since I'm out of ideas, I hope someone else out there has got some
advice for me (even if it's only "don't do that!").

Here goes.  I've put together a new UNIX domain handshake
protocol, but somewhere it's got to pause long enough for the
server to pick up the client's half of the protocol, since with a
socket a client can get a connection, write some data and close the
socket before the server has accepted the connection (the
connection's just sitting on the pending queue).

So, I've got a piece of code in the fhandler_socket::close method
that only closes the client's secret event once the client has
received the server's okay signal *or* a (Unix) signal arrives
*or* the server closes its end of the connection (i.e. the server
exits w/o ever accepting the connection).
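
To make the wait in close concrete, here is a minimal sketch of the
"wake on whichever of three events comes first" logic.  It is not the
patch itself: the real code would presumably wait on Win32 event
handles (e.g. with WaitForMultipleObjects), whereas this uses only the
standard C++ library so it can be run anywhere.  The names Wake and
CloseGate are made up for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-ins for the three things the close path waits on.
enum class Wake { None, ServerOkay, SignalArrived, ServerClosed };

// Hypothetical helper: records the first event raised and lets a
// waiter block until one has been raised.
class CloseGate {
public:
    // Called by whichever party raises an event; the first one wins.
    void raise(Wake w) {
        std::lock_guard<std::mutex> lock(m_);
        if (reason_ == Wake::None)
            reason_ = w;
        cv_.notify_all();
    }

    // Blocks until some event has been raised, then reports which.
    // If nothing ever calls raise(), this never returns, which is
    // the hang described in this mail.
    Wake wait_any() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return reason_ != Wake::None; });
        return reason_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    Wake reason_ = Wake::None;
};
```

In this sketch the client's close would call wait_any() and only then
release its secret event; the server side, the signal path, or the
connection teardown would each call raise() with the matching value.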

This is all fine and dandy except for two situations: if the
client receives an unhandled signal that should cause it to die
*or* if the client exits w/o closing the socket.  At this point,
if the server is blocked itself and not accepting the connection,
the client will not exit and can't be ctrl-c'd either.  The
problems in the two situations are caused by the same issue:

*) If the client receives an unhandled signal, e.g. SIGINT, the
do_exit function is called, which then calls close_all_files.  But
it does this w/o setting the 'signal_arrived' event, so none of
the events are set that the fhandler_socket::close method is
waiting on (at least, not in the particular circumstances
mentioned here).

*) If the client exits w/o closing the socket, again it gets stuck
in fhandler_socket::close since no events are going to be raised.

Alternatives (AFAICT):

*) Just put a timeout in the fhandler_socket::close routine (as
was effectively the case in the previous protocol).

*) In do_exit, set a global flag that the close routine can pick
up.  There is already such a flag: exit_already in "exceptions.cc"
but this is static and so inaccessible.  Or is there an existing
mechanism that I'm missing?

*) A partial solution (and one that might be worth doing
regardless of any other solution) would be to set the
'signal_arrived' event before calling the do_exit function when
dying from a signal's arrival.  I've tried this and it seems to
cause no problems, but is only a partial solution to the problem.
(Unless it's always set on exit . . . yuck?)

*) It would be okay perhaps to let the client block in this way,
*if* it could still be killed by a signal whilst blocked.  *But*
the do_exit code in "dcrt0.cc" ignores a slew of signals, so if a
process does get blocked while exiting, it can't then be (easily)
killed.  [You can still 'kill -9' it at this point.]  Has someone

*) Or am I worrying too much?  Don't worry about it much, bung in
a timeout, it'll hardly ever happen, relax?
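
For what it's worth, the timeout and the "exiting" flag alternatives
above can be combined.  The following is only a sketch under my own
assumptions, again in portable C++ rather than Win32: begin_exit()
stands in for whatever do_exit would set, acknowledge() for the
server's okay signal, and the class name and the time limit are
invented for the example.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical outcome of waiting in the close path.
enum class CloseResult { Acknowledged, Exiting, TimedOut };

// Hypothetical helper: a wait that ends on the server's okay, on the
// process starting to exit, or after a time limit, whichever is first.
class BoundedCloseWait {
public:
    // Stand-in for the server's okay signal arriving.
    void acknowledge() {
        std::lock_guard<std::mutex> lock(m_);
        acked_ = true;
        cv_.notify_all();
    }

    // Stand-in for a flag that do_exit would set before closing files.
    void begin_exit() {
        std::lock_guard<std::mutex> lock(m_);
        exiting_ = true;
        cv_.notify_all();
    }

    // Waits at most 'limit'; never blocks forever.
    CloseResult wait(std::chrono::milliseconds limit) {
        std::unique_lock<std::mutex> lock(m_);
        bool woke = cv_.wait_for(lock, limit,
                                 [this] { return acked_ || exiting_; });
        if (!woke)
            return CloseResult::TimedOut;
        return acked_ ? CloseResult::Acknowledged : CloseResult::Exiting;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool acked_ = false;
    bool exiting_ = false;
};
```

With something like this, a client that is exiting returns from the
wait straight away, and even if the flag is never set the timeout
bounds how long close can block.  The cost is the one the old protocol
had: a slow server may miss the client's half of the handshake.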

// Conrad



