This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: bash-3.1-7 BUG


Christopher Faylor <cgf-no-personal-reply-please <at> cygwin.com> writes:

> Is bash assuming that it can read N characters and then subtract M
> characters from the current position to get back to the beginning of a
> line?  If so, hmm.  I guess this explains why it was reading a byte at a
> time before.  It must be counting characters rather than calling lseek
> to figure out where it is.

Yes, indeed, and it seems like reasonable semantics to expect as well 
(never mind that it means that text mode on a seekable file involves a lot more 
processing, to consistently present the user with character count instead of 
byte offset).  When a file is seekable, bash reads a buffer at a time for 
speed, but then must reseek to the offset where it last processed input before 
invoking any subprocesses, since POSIX requires that seekable files be left in 
a consistent state when swapping between multiple handles to the same 
underlying file description (even if the multiple handles exist in separate 
processes).  When using stdio (such as fread and fseek), this works due to code 
in newlib (see __SCLE in stdio.h).  But bash uses low-level Unix I/O, and does 
not benefit from newlib's approach.  In a binary mount, seeking backwards by 
the number of characters between the point bash has processed and the end of 
the buffer it has read just works.  It is only in a text mount, where lseek 
reports the binary offset within the file rather than the character offset, 
that this causes problems.  So I will probably end up reinstating a form of 
the previous #ifdef __CYGWIN__ check for is_seekable in bash 3.1-8 to check 
whether a file is in text mode, in which case it is treated as non-seekable; 
that is certainly a faster solution than waiting for cygwin to change lseek 
on a text file to consistently use a character offset.  But I intend that on 
binary files, the \r of a \r\n line ending will be treated as part of the 
line, so at least binary mounts won't suffer from the speed impact of 
treating a file as unseekable the way bash 3.1-6 does.
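
To make the problem concrete, here is a rough sketch in C of the read-then-reseek pattern and of where the counted seek lands once \r\n is collapsed on read.  None of this is bash's or cygwin's actual code: read_line_and_reseek, demo_offset_after_first_line and text_mode_landing are hypothetical names, the text-mode half is only a simulation over an in-memory string, and for simplicity it assumes the whole file fits in one read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not bash's code: read a buffer, keep the first
   line, then seek back over the unread tail so the descriptor's offset
   sits just past the newline.  Returns the line length, 0 at EOF, -1 on
   error. */
static ssize_t read_line_and_reseek(int fd, char *line, size_t cap)
{
    char buf[128];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return n;

    char *nl = memchr(buf, '\n', (size_t)n);
    size_t used = nl ? (size_t)(nl - buf) + 1 : (size_t)n;
    assert(used <= (size_t)n);
    if (used >= cap)
        return -1;
    memcpy(line, buf, used);
    line[used] = '\0';

    /* Counted seek: back up by the number of unprocessed characters. */
    off_t unread = (off_t)((size_t)n - used);
    if (lseek(fd, -unread, SEEK_CUR) == (off_t)-1)
        return -1;
    return (ssize_t)used;
}

/* Binary case: offset of the descriptor after the first line of a
   temporary file holding two \n-terminated lines. */
static off_t demo_offset_after_first_line(void)
{
    static const char text[] = "echo one\necho two\n";
    char line[64];
    off_t pos = (off_t)-1;
    FILE *tmp = tmpfile();
    if (!tmp)
        return pos;
    int fd = fileno(tmp);
    if (write(fd, text, sizeof text - 1) == (ssize_t)(sizeof text - 1)
        && lseek(fd, 0, SEEK_SET) == 0
        && read_line_and_reseek(fd, line, sizeof line) > 0)
        pos = lseek(fd, 0, SEEK_CUR);
    fclose(tmp);
    return pos;
}

/* Simulated text mount: the read collapses "\r\n" to "\n", but the
   offset still counts raw bytes.  Returns the raw byte offset at which
   the counted seek lands after the first line has been processed. */
static size_t text_mode_landing(const char *raw, size_t len)
{
    size_t pos = 0, chars = 0, first_line = 0;
    while (pos < len) {
        char c = raw[pos++];
        if (c == '\r' && pos < len && raw[pos] == '\n')
            c = raw[pos++];          /* two raw bytes, one character */
        chars++;
        if (c == '\n' && first_line == 0)
            first_line = chars;      /* characters in the first line */
    }
    if (first_line == 0)
        first_line = chars;
    /* Raw offset after the read, minus the unread character count. */
    return pos - (chars - first_line);
}
```

With \n line endings the counted seek lands exactly on the start of the second line.  With "echo one\r\necho two\r\n" the second line starts at byte 10, but the simulated seek lands on byte 11, so the next read would begin at "cho two": one byte too far for every \r\n that was collapsed in the unread part of the buffer.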

-- 
Eric Blake




