This is the mail archive of the cygwin mailing list for the Cygwin project.
RE: Grepping Unicode files?
- From: "Nellis, Kenneth" <Kenneth dot Nellis at xerox dot com>
- To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
- Date: Thu, 14 May 2015 16:13:12 +0000
- Subject: RE: Grepping Unicode files?
- References: <3C280897-291A-4A8C-8C3F-46D1D9BEFCFE at solidrocksystems dot com>
> Does Cygwin's grep support Unicode files? The output from a SQL Server SQL
> Agent job is a Unicode file, i.e. if you look at it in a hex editor every
> other character is 00 because each character is taking up two bytes. The
> filename itself is fine, it's the contents that is Unicode. I can't get
> grep to work on it, either with or without -a.
>
> This may not be a Cygwin-specific question, but I haven't been able to
> find anything after several Google searches, including the archives, and
> neither --help nor the man page for grep references Unicode.
>
> By default I have neither LC_ALL nor LC_COLLATE set.
>
> A pointer to a better search or a website that explains this would be
> great, or if it can't currently be done, that's OK, too.
>
> Thanks for your help!
If you don't have iconv, install the libiconv package.
Then, if what you're searching for is in the ASCII character set,
the following should work:
iconv -f utf16 -t utf8 {your file} | grep {your RE}
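To try this end to end, here is a minimal sketch. It assumes the file begins with a byte-order mark so that iconv can detect the byte order on its own. The sample file and the grep_utf16 helper are made up for illustration and are not part of any package.

```shell
# Build a small UTF-16 sample file (hypothetical content) to test against.
sample=$(mktemp)
printf 'Job started\nStep 1 failed\nJob finished\n' |
    iconv -f UTF-8 -t UTF-16 > "$sample"

# Hypothetical helper: convert a UTF-16 file to UTF-8 on the fly, then grep it.
#   $1 = regular expression, $2 = file to search
grep_utf16() {
    iconv -f UTF-16 -t UTF-8 "$2" | grep -- "$1"
}

grep_utf16 'failed' "$sample"    # prints: Step 1 failed
```

If the file has no byte-order mark, naming the byte order explicitly (for example -f UTF-16LE, which is only my guess at what SQL Server writes) may be needed instead of plain UTF-16.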
--Ken Nellis