This is the mail archive of the cygwin mailing list for the Cygwin project.

RE: Grepping Unicode files?

From: "Nellis, Kenneth" <Kenneth dot Nellis at xerox dot com>
To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
Date: Thu, 14 May 2015 16:13:12 +0000
Subject: RE: Grepping Unicode files?
Authentication-results: sourceware.org; auth=none
References: <3C280897-291A-4A8C-8C3F-46D1D9BEFCFE at solidrocksystems dot com>

> Does Cygwin's grep support Unicode files? The output from a SQL Server SQL
> Agent job is a Unicode file, i.e. if you look at it in a hex editor every
> other character is 00 because each character is taking up two bytes. The
> filename itself is fine, it's the contents that is Unicode. I can't get
> grep to work on it, either with or without -a.
> 
> This may not be a Cygwin-specific question, but I haven't been able to
> find anything after several Google searches, including the archives, and
> neither --help nor the man page for grep references Unicode.
> 
> By default I have neither LC_ALL nor LC_COLLATE set.
> 
> A pointer to a better search or a website that explains this would be
> great, or if it can't currently be done, that's OK, too.
> 
> Thanks for your help!

If you don't have iconv, install the libiconv package.

Then, if what you're searching for is in the ASCII character set,
then the following should work:

iconv -f utf16 -t utf8 {your file} | grep {your RE}
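
To see the difference for yourself, here is a rough sketch that builds a
small UTF-16 sample and greps it both ways. The temporary file and the
"job started / job failed" text are made up for illustration, and I'm
assuming the real file is little-endian with a byte order mark, which is
what Windows tools usually write.

```shell
# Build a small UTF-16LE sample with a byte order mark.
# (Hypothetical stand-in for the SQL Agent output file.)
sample=$(mktemp)
printf '\377\376' > "$sample"    # FF FE = UTF-16LE byte order mark
printf 'job started\njob failed\n' | iconv -f UTF-8 -t UTF-16LE >> "$sample"

# Plain grep sees a NUL byte between every character, so an
# ordinary pattern does not match; -c reports the match count.
plain=$(grep -c 'failed' "$sample" || true)

# Converting to UTF-8 first lets grep match as usual.
converted=$(iconv -f UTF-16 -t UTF-8 "$sample" | grep 'failed')

echo "plain grep matches: $plain"
echo "after iconv: $converted"
rm -f "$sample"
```

If the file turns out to have no byte order mark, naming the byte order
explicitly (for example -f UTF-16LE) may be needed, since iconv
implementations differ in what they assume for plain UTF-16.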

--Ken Nellis

References:
- Grepping Unicode files?
  - From: Vince Rice
