This is the mail archive of the cygwin mailing list for the Cygwin project.

Re: [ANNOUNCEMENT] Updated [test]: sed-4.4-1

On 02/13/2017 09:53 AM, cyg Simple wrote:
> On 2/13/2017 9:14 AM, Nellis, Kenneth (Conduent) wrote:
>> From: Steven Penny  
>>> Perhaps I am missing something, but can't all that be said about sed too? I
>>> just can't see a situation where we are justified changing one and not the
>>> other. They should either both strip carriage returns or neither.
>> How about grep?
>> $ printf 'hello\r\nworld\r\n' | grep hello | od -An -tcx1
>>    h   e   l   l   o  \n
>>   68  65  6c  6c  6f  0a
>> $
>> Are there others?
>> (BTW, I support the change.)
> All pipe handles should be binary, or at least there should be an option
> to make them so.  The file handles should be bound to the mounted mode.

I'm in favor of reducing special cases of FORCED text mode. Text mode is
great on text mounts, but text mounts are discouraged for a reason (slower
computing, surprising results when seeking), and I recently patched bash
to quit forcing text mode (bash 4.3.42-4).

Pipes are indeed binary mode by default (and should stay that way), and
that works fine even in a long pipeline chain:

cmd1 < file_in | cmd2 | cmd3 | cmd4 > file_out

If file_in and file_out are on text mounts, then cmd1 won't see any
carriage returns, so neither will cmd2, cmd3, or cmd4, and finally cmd4
writes in text mode back to file_out.

But when you are operating on a binary mount, and WANT carriage returns
to be preserved, forcing text mode at any point in the chain corrupts
all later points in the chain.
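
To make the pipeline argument concrete, here is a minimal sketch that counts
carriage returns before and after a chain of pass-through stages. The helper
name count_cr is made up for illustration, and cat merely stands in for
cmd2/cmd3 on the assumption that it does not translate line endings:

```shell
#!/bin/sh
# Hypothetical helper: count the carriage returns arriving on stdin.
count_cr() {
    # Delete everything except CR, then count the bytes that are left.
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two CRLF-terminated lines, measured directly ...
before=$(printf 'hello\r\nworld\r\n' | count_cr)

# ... and again after passing through two binary-mode stages.
# 'cat' is an assumed stand-in for any stage that leaves bytes alone.
after=$(printf 'hello\r\nworld\r\n' | cat | cat | count_cr)

echo "CRs before: $before, after: $after"
```

If every stage leaves the bytes alone, both counts match. A stage that forces
text mode would show up as a lower "after" count, and every later stage would
inherit the loss.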

There's a big difference between using "rt" to force text mode (which is
what I killed in this sed release), using "rb" to force binary mode
(which is what I use in tar, because tar MUST preserve binary data), and
using "r" (which is what sed now uses) to let the mount point decide
whether CRs are important.
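
The open mode a tool uses is not visible from the shell, but its effect is.
As a rough sketch (not part of any package), the probe below feeds a CRLF
sample through a filter and compares the output byte for byte with the
input. The function name preserves_cr is hypothetical, mktemp is assumed to
be available, and tr -d '\r' is used only to imitate what a tool that forces
text mode does to its input:

```shell
#!/bin/sh
# Hypothetical probe: succeed (exit 0) if the given filter command passes
# CRLF input through unchanged, fail (exit 1) if the bytes differ.
preserves_cr() {
    sample=$(mktemp) || return 2
    printf 'hello\r\nworld\r\n' > "$sample"
    # Run the filter on the sample and compare its output with the original.
    "$@" < "$sample" | cmp -s - "$sample"
    status=$?
    rm -f "$sample"
    return $status
}

# 'cat' is assumed to leave bytes alone, like a tool opened in "rb" mode.
if preserves_cr cat; then
    echo "cat: CRs preserved"
else
    echo "cat: CRs stripped"
fi

# 'tr -d' imitates the effect of forced text mode by deleting every CR.
if preserves_cr tr -d '\r'; then
    echo "tr -d: CRs preserved"
else
    echo "tr -d: CRs stripped"
fi
```

The same probe can be pointed at an identity-style invocation of another
tool, for example preserves_cr sed -n p, to see whether that build strips
carriage returns on the mount in question.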

So I'd be in favor of a patch to awk dropping forced text mode on binary
mounts.

And I'll look into fixing grep to quit misbehaving as well.

Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library
