This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Clearing O_NONBLOCK from a pipe may lose data


On 02/20/2015 01:21 PM, Lasse Collin wrote:
> 
> For example, if xz is modified to leave O_NONBLOCK enabled on stdout,
> 
>     ( xz -dc smallfile.xz ; cat bigfile ) | ( sleep 1 ; wc -c )
> 
> will make cat print
> 
>     cat: write error: Resource temporarily unavailable
> 
> on GNU/Linux and most of bigfile won't be seen by wc. However, on
> Cygwin the above works because the modifications to the file status
> flags in xz aren't seen by cat. That is, stdout in cat is in blocking
> mode even though xz left it in non-blocking mode.

That's a bug in cygwin, although fixing it may be difficult.  In POSIX
terminology, the blocking flag of an open file description is shared
among all the file descriptors visiting that file description, whether
the extra file descriptors were created by dup() or by fork(); and it is
one of the few bits of information where a child can affect the state
visible in the parent.  But if Windows doesn't share blocking status of
multiple handles visiting the same resource, then I'm not sure how we
can emulate it.
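
To make the sharing rule concrete, here is a minimal sketch in Python (not part of the original exchange; the function names are made up for illustration). A forked child sets O_NONBLOCK on an inherited pipe, and the parent then checks its own descriptor, a dup() of it, and what a write to the full pipe reports. On a system that follows POSIX here, such as GNU/Linux, I would expect the parent to see the flag; per the report above, Cygwin at the time presumably would not.

```python
import fcntl
import os


def nonblocking(fd):
    """Return True if O_NONBLOCK is set in the file status flags seen via fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def fill_until_error(fd):
    """Write to a non-blocking pipe until it is full; return the errno raised."""
    try:
        while True:
            os.write(fd, b"x" * 4096)
    except BlockingIOError as exc:
        return exc.errno


def child_sets_nonblock():
    """Let a forked child set O_NONBLOCK, then report what the parent sees."""
    read_end, write_end = os.pipe()
    before = nonblocking(write_end)

    pid = os.fork()
    if pid == 0:
        # Child: change the file status flags through its inherited descriptor.
        flags = fcntl.fcntl(write_end, fcntl.F_GETFL)
        fcntl.fcntl(write_end, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        os._exit(0)
    os.waitpid(pid, 0)

    after = nonblocking(write_end)
    alias = os.dup(write_end)
    via_dup = nonblocking(alias)

    # Only try to fill the pipe if the flag really is visible here;
    # otherwise the write would block forever because nobody reads.
    write_errno = fill_until_error(write_end) if after else None

    for fd in (read_end, write_end, alias):
        os.close(fd)
    return before, after, via_dup, write_errno
```

The errno returned by the last step is the one behind cat's "Resource temporarily unavailable" message (EAGAIN).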

[various side notes: Another such bit of information shared between
processes is the file offset of a shared seekable file description, and
Windows DOES support that.  On the other hand, things like the cloexec
flag are associated with a file descriptor [fcntl F_GETFD] rather than a
file description [fcntl F_GETFL], and so are NOT shared across dup() or
fork().  Also, note that two consecutive calls to open() with the same
parameters produce two separate file descriptions (the two file
descriptors do not share state); it is only dup() or fork() that can
create two file descriptors sharing a file description.
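
A rough Python sketch of those distinctions (again not from the original message; compare_sharing is a hypothetical helper, and the temporary file is only scaffolding). It reads through one descriptor and checks the offset seen through a dup() and through a second open(), then clears FD_CLOEXEC on one descriptor and checks the dup(). It relies on Python's default of creating descriptors with FD_CLOEXEC set.

```python
import fcntl
import os
import tempfile


def cloexec(fd):
    """Return True if FD_CLOEXEC is set in the descriptor flags of fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def compare_sharing():
    """Show what is per-description (offset) and per-descriptor (cloexec)."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()

        first = os.open(tmp.name, os.O_RDONLY)
        alias = os.dup(first)                    # same file description
        second = os.open(tmp.name, os.O_RDONLY)  # separate file description

        os.read(first, 4)  # moves the offset of first's description to 4
        alias_offset = os.lseek(alias, 0, os.SEEK_CUR)
        second_offset = os.lseek(second, 0, os.SEEK_CUR)

        # Python opens and dups descriptors with FD_CLOEXEC set by default;
        # clear it on `first` only and see whether `alias` follows.
        flags = fcntl.fcntl(first, fcntl.F_GETFD)
        fcntl.fcntl(first, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)
        result = {
            "alias_offset": alias_offset,
            "second_offset": second_offset,
            "first_cloexec": cloexec(first),
            "alias_cloexec": cloexec(alias),
        }
        for fd in (first, alias, second):
            os.close(fd)
        return result
```

The dup() should follow the offset while the second open() stays at 0, and the cloexec change should stay confined to the one descriptor it was made on.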

Where it gets really weird is with flock() vs. fcntl(F_SETLK)/lockf()
locking: flock() is per-description, but fcntl() and lockf() locks are
per-process and per-inode, which means a second user of the file in the
same process, going through a distinct open(), can wipe out the lock of
the first description just by closing its descriptor - making lockf()
very painful to use; but flock() lacks byte ranges, so it is not ideal
either.  The
Linux kernel is pioneering a new lock, and POSIX is considering
standardizing it, called fcntl(F_OFD_SETLK) which has all the benefits
of per-description locking (the best of flock) and range locking (the
best of lockf) - eventually, Cygwin should probably implement that as
well.  We already have quite the hacks in place to get flock/lockf
coordination from child back to parent, which would be the sort of code
to borrow from if we have to figure out how to get nonblocking status
propagated in the same direction.
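
The lockf() pitfall can be sketched as follows, in Python and under the assumption of Linux-like semantics on a local file system (the helper names are hypothetical, and this is my reading of the hazard described above, not code from the thread). The parent takes a lock, then opens and closes the same file through an unrelated descriptor; a forked child probes whether the lock is still held before and after.

```python
import fcntl
import os
import tempfile


def take_lock(fd, use_flock, extra=0):
    """Take an exclusive lock on fd with flock() or with lockf()/fcntl()."""
    if use_flock:
        fcntl.flock(fd, fcntl.LOCK_EX | extra)
    else:
        fcntl.lockf(fd, fcntl.LOCK_EX | extra)


def child_can_lock(path, use_flock):
    """Fork a child that tries a non-blocking lock; True if it got one."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            take_lock(fd, use_flock, fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def lock_survives_unrelated_close(use_flock):
    """Return (held_before, held_after) around an unrelated open()/close()."""
    with tempfile.NamedTemporaryFile() as tmp:
        holder = os.open(tmp.name, os.O_RDWR)
        take_lock(holder, use_flock)
        held_before = not child_can_lock(tmp.name, use_flock)

        # Some other code in the same process touches the same file.
        other = os.open(tmp.name, os.O_RDONLY)
        os.close(other)

        held_after = not child_can_lock(tmp.name, use_flock)
        os.close(holder)
        return held_before, held_after
```

With use_flock=False the lock should be gone after the unrelated close(), while with use_flock=True it should still be held, which is the per-description behavior that F_OFD_SETLK is meant to combine with byte ranges.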

hmm - my side notes took more space than my real response - is that a
good thing?]

> 
> The above Cygwin behavior would make it very easy to add a workaround
> to xz for the pipe-data-loss problem: simply don't clear O_NONBLOCK on
> stdout. However, I wonder if it is a bug in Cygwin that the changes to
> file status flags aren't seen via other file descriptors that refer to
> the same file description. If it is a bug, then the workaround idea will
> cause trouble in the future when the bug in Cygwin gets fixed.

Yeah, the fact that cygwin is buggy with respect to POSIX may break any
workaround you add if cygwin is later patched.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
