This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: cygwin.dll: bug with select on Windows console


Sorry about forgetting the testcase (which was good, actually, that
version had a bug).  I've attached it here.  This takes a single
argument, the timeout value for select() in microseconds.  A value of
1000000 is fine for testing this issue.  Type at it, or just bang on the
keyboard with random keystrokes, and it will report the results from
select() and the characters read.  On a Windows console, it should
report an error within a few seconds.  Moving the mouse cursor over the
Windows console window while typing seems to provoke the error as well. 
On mintty or any other pty, the error never occurs.

I've been unable to build Cygwin from Git on my 32-bit install on Win7,
thanks to various header mismatches and missing packages-- I think the
latest was a mingw-gcc header, but I can't check that machine now.  The
FAQ entry on how to build Cygwin is quite incomplete.  Are there any
better directions for building Cygwin?

regards,

  --jh

On 01/07/2016 12:30, Corinna Vinschen wrote:
> Hi John,
>
> On Dec 27 20:49, john hood wrote:
>> Hi all,
>>
>> I'm one of the Mosh maintainers.  Recently a user reported a problem
>> where Mosh exits suddenly soon after startup while he is typing at it,
>> see <https://github.com/mobile-shell/mosh/issues/705>.
>>
>> The problem turns out to be that occasionally, select() times out,
>> returns 0 as it should, but does *not* clear the fd_sets() passed in--
>> they are left unaltered.  Mosh doesn't pay any attention to the count of
>> ready fds and relies on the returned fd_sets being accurate.  Mosh sets
>> the read fd_set and the error fd_set, and when it encounters this
>> situation, it exits quietly when an error is found on one of the files
>> involved (rather poor error handling on our part).
>>
>> This only seems to happen on Windows Console.  select() seems to operate
>> reliably when used on a pty, whether from mintty, xterm, or sshd.
>>
>> My read of the POSIX standard is that select() should always set the
>> fd_sets on a successful return (rv >= 0).  There is a bit of ambiguity
>> around this point, but given that Cygwin is inconsistent with itself (on
>> ptys) and with every other Unix platform, I think it's a bug.
>>
>> I've attached a little demo program for the bug.  Compile, run as
>> "socket-t 1000" (the argument is the number of microseconds select()
>> should wait), and mash keys on the keyboard for a little while.  It
>> should report errors within 100 keystrokes.  I think there might be a
>> dependency on the length of the wait passed to select(), I don't see the
>> problem happening with the wait set to 100 seconds.
> Unfortunately you missed to attach the testcase.  I checked Cygwin's
> select and I can see what might be the reason for what you're observing.
> I might even have a fix.  But what bugs me is that I don't see how this
> condition could be met at all, especially not in case of selecting on
> the console.  Your testcase would be greatly appreciated.
>
> For further discussion I attached the patch I'm proposing.  AFAICS,
> copying the content of r,w,e over to readfds,writefds,exceptfds doesn't
> really make sense, unless r,w,e are still unchanged (i.e. zeroed out).
> However, *if* r,w,e have set bits, sel.wait should have returned
> select_ok, not select_set_zero.  I don't see how this might occur.
> That's what your testcase hopefully helps to uncover.
>
> Another hiccup with copying r,w,e to readfds,writefds,exceptfds and then
> calling sel.poll is this:  The number of set bits is *only* determined
> by sel.poll.  If there are any other bits set in r,w,e, the result of
> sel.poll might not reflect this and the return value is not equal to the
> number of bits set in the records.  Obviously this is not a problem in
> your case since you don't check the return value, but other applications
> might.
>
>
> Thanks,
> Corinna
>
>
> diff --git a/winsup/cygwin/select.cc b/winsup/cygwin/select.cc
> index 524e578..647c758 100644
> --- a/winsup/cygwin/select.cc
> +++ b/winsup/cygwin/select.cc
> @@ -77,8 +77,6 @@ details. */
>    (fd_set *) __res; \
>  })
>  
> -#define copyfd_set(to, from, n) memcpy (to, from, sizeof_fd_set (n));
> -
>  #define set_handle_or_return_if_not_open(h, s) \
>    h = (s)->fh->get_io_handle_cyg (); \
>    if (cygheap->fdtab.not_open ((s)->fd)) \
> @@ -132,6 +130,7 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
>  	DWORD ms)
>  {
>    int res = select_stuff::select_loop;
> +  int ret = 0;
>  
>    /* Record the current time for later use. */
>    LONGLONG start_time = gtod.msecs ();
> @@ -187,14 +186,15 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
>        select_printf ("res %d", res);
>        if (res >= 0)
>  	{
> -	  copyfd_set (readfds, r, maxfds);
> -	  copyfd_set (writefds, w, maxfds);
> -	  copyfd_set (exceptfds, e, maxfds);
> -	  if (res == select_stuff::select_set_zero)
> -	    res = 0;
> -	  else
> -	    /* Set the bit mask from sel records */
> -	    res = sel.poll (readfds, writefds, exceptfds) ?: select_stuff::select_loop;
> +	  UNIX_FD_ZERO (readfds, maxfds);
> +	  UNIX_FD_ZERO (writefds, maxfds);
> +	  UNIX_FD_ZERO (writefds, maxfds);
> +	  /* Set bit mask from sel records, even in case of a timeout.  This
> +	     also sets ret to the right value >= 0, matching the number of
> +	     bits set in the fds records. */
> +	  ret = sel.poll (readfds, writefds, exceptfds);
> +	  if (!ret && res != select_stuff::select_set_zero)
> +	    res = select_stuff::select_loop;
>  	}
>        /* Always clean up everything here.  If we're looping then build it
>  	 all up again.  */
> @@ -219,9 +219,9 @@ select (int maxfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
>  	}
>      }
>  
> -  if (res < -1)
> -    res = -1;
> -  return res;
> +  if (res < 0)
> +    ret = -1;
> +  return ret;
>  }
>  
>  extern "C" int
>

#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <termios.h>
#include <sys/select.h>

int
main(int ac, char **av)
{
  if (ac != 2) return 1;
  int time;
  time = atoi(av[1]);
  if (setvbuf(stdin, NULL, _IONBF, 0) != 0) {
    err(1, "setvbuf");
  }
  struct termios o, n;
  tcgetattr(0, &o);
  tcgetattr(0, &n);
  cfmakeraw(&n);
  tcsetattr(0, TCSADRAIN, &n);

  while (1) {
    while (1) {
      fd_set reads, errors;
      FD_ZERO(&reads);
      FD_ZERO(&errors);
      FD_SET(0, &reads);
      FD_SET(0, &errors);
      struct timeval tv;
      tv.tv_sec = time / 1000000;
      tv.tv_usec = time % 1000000;
      printf("select: ");
      int rv = select(1, &reads, NULL, &errors, &tv);
      printf("rv = %d, errno = %s\r\n", rv, strerror(errno));
      if (rv == 0) {
	if (FD_ISSET(0, &reads)) printf ("unexpected read\r\n");
	if (FD_ISSET(0, &errors)) printf ("unexpected error\r\n");
	if (FD_ISSET(0, &reads) || FD_ISSET(0, &errors)) break;
	// otherwise continue
      } else if (rv == 1) {
	if (FD_ISSET(0, &reads)) {
	// select found something normally, go read it
	  break;
	} else if (FD_ISSET(0, &errors)) {
	  printf("error on stdin\r\n");
	  break;
	} else {
	  printf("no ready fd found\r\n");
	  break;
	}
      } else if (rv > 1) {
	printf("unexpected count = %d\r\n", rv);
	break;
      } else {
	printf("rv %d, error %s\r\n", rv, strerror(errno));
	break;
      }
    }
    printf("reading char: ");
    char c = getc(stdin);
    if (c == 0x7f) break;
    printf("got 0x%02x\r\n", c);
  }
  tcsetattr(0, TCSADRAIN, &o);
  return 0;
}

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]