Index: faq-using.xml
===================================================================
RCS file: /cvs/src/src/winsup/doc/faq-using.xml,v
retrieving revision 1.28
diff -u -r1.28 faq-using.xml
--- faq-using.xml 15 Jan 2010 21:41:47 -0000 1.28
+++ faq-using.xml 24 Jan 2010 21:19:34 -0000
@@ -362,13 +362,8 @@
 section Internationalization
-To get UTF-8 support you must set the LANG, LC_ALL, or LC_CTYPE
-environment variables. To get UTF-8 support you can set, for instance,
-$LANG to "en_US.UTF-8". This will give you support for the UTF-8 character
-set. Note that the language part has to contain a valid language specifier,
-but is otherwise so far ignored. There's no support for correct
-language-specific collation, monetary or date/time-related
-string handling. This is planned for a later release, though.
+ Cygwin uses UTF-8 by default. To use a different character set, you
+need to set the LC_ALL, LC_CTYPE or LANG environment variables.
 To type international characters (£äö) in bash, check if the following settings are available in
@@ -400,10 +395,10 @@
 My application prints international characters but I only see gray boxes
-Very likely you didn't set your Console character set to the preferred
+Very likely you didn't set your console character set to the preferred
 character set before the first Cygwin application was started in the console. To make sure the console is using the desired character set,
-maile sure that one of the internationalization environment variables
+make sure that one of the internationalization environment variables
 LC_ALL, LC_CTYPE, or LANG is set before the first Cygwin process starts. You can do that, for instance, by setting the variable in your Cygwin.bat file from which you start your Cygwin shell.
Index: new-features.sgml
===================================================================
RCS file: /cvs/src/src/winsup/doc/new-features.sgml,v
retrieving revision 1.25
diff -u -r1.25 new-features.sgml
--- new-features.sgml 24 Jan 2010 15:08:01 -0000 1.25
+++ new-features.sgml 24 Jan 2010 21:19:35 -0000
@@ -10,15 +10,15 @@
 Cygwin now handles locales using the underlying Windows locale
- support. The locale must exists in Windows to be recognized.
+ support. The locale must exist in Windows to be recognized.
- New tool "getlocale" to fetch valid locale values from Windows.
+ New tool "getlocale" to fetch valid locale identifiers from Windows.
- Default charset for locales without explicit charset is now choosen
+ Default charset for locales without explicit charset is now chosen
 from a list of Linux-compatible charsets.
@@ -32,7 +32,7 @@
 Default charset in the "C" or "POSIX" locale has been changed back
- from UTF-8 to ASCII, to circumvent problems with applications
+ from UTF-8 to ASCII, to avoid problems with applications
 expecting a singlebyte charset in the "C"/"POSIX" locale. Still use UTF-8 internally for filename conversion in this case.
@@ -50,6 +50,10 @@
 New strfmon(3) call.
+
+ The console's backspace keycode can be changed using 'stty erase'.
+
+
@@ -110,7 +114,7 @@
 character will be converted to a sequence Ctrl-X + UTF-8 representation of the character. This allows to access all files, even those not having a valid representation of their filename in the current character
-set (codepage). To always have a valid string, use the UTF-8 charset by
+set. To always have a valid string, use the UTF-8 charset by
 setting the environment variable $LANG, $LC_ALL, or $LC_CTYPE to a valid POSIX value, for instance in Cygwin.bat like this:
@@ -159,8 +163,8 @@
 File names are case sensitive if the OS and the underlying file system supports it. Works on NTFS and NFS. Does not work on FAT and Samba
-shares. Requires to change a registry key (see the user's guide). Can
-be switched off on a per-mount base.
+shares. Requires to change a registry key (see the User's Guide). Can
+be switched off on a per-mount basis.
@@ -302,7 +306,7 @@
-IPv6 support. New API getaddrinfo, getnameinfo, freeaddrinfo,
+IPv6 support. New APIs getaddrinfo, getnameinfo, freeaddrinfo,
 gai_strerror, in6addr_any, in6addr_loopback. On IPv6-less systems, replacement functions are available for IPv4. On systems with IPv6 enabled, the underlying WinSock functions are used. While I tried hard
@@ -410,8 +414,7 @@
 The setting of the environment variables $LANG, $LC_ALL or $LC_CTYPE will be used. For instance, setting $LANG to "de_DE.ISO-8859-15" before starting a Cygwin session will use the ISO-8859-15 character set in the
-entire session. The default charset is "UTF-8", even in the default
-locale "C". The default locale in the absence of one of the
+entire session. The default locale in the absence of one of the
 aforementioned environment variables is "C.UTF-8".
@@ -420,9 +423,7 @@
 in 1-16, except 12, "UTF-8", Windows codepages "CPxxx", with xxx in (437, 720, 737, 775, 850, 852, 855, 857, 858, 862, 866, 874, 1125, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258), "KOI8-R", "KOI8-U",
-"SJIS", "GBK", "eucJP", "eucKR", and "Big5". The leading language and
-territory part (en_US, for instance) is not used by Cygwin yet, but is
-required for POSIX compatibility.
+"SJIS", "GBK", "eucJP", "eucKR", and "Big5".
Index: pathnames.sgml
===================================================================
RCS file: /cvs/src/src/winsup/doc/pathnames.sgml,v
retrieving revision 1.50
diff -u -r1.50 pathnames.sgml
--- pathnames.sgml 11 Jan 2010 18:00:14 -0000 1.50
+++ pathnames.sgml 24 Jan 2010 21:19:35 -0000
@@ -403,22 +403,12 @@
 Filenames with unusual (foreign) characters
- Windows filesystems use the Unicode character set in the UTF-16
-encoding to store filename information. If you don't use the UTF-8
+ Windows filesystems use Unicode encoded as UTF-16
+to store filename information. If you don't use the UTF-8
 character set (see ) then there's a chance that a filename is using one or more characters which have no representation in the character set you're using.
-For instance, there are no Chinese characters in the ISO-8859-1
-character set. So, converting a filename containing a Chinese character
-to ISO-8859-1 leaves you with a wrongly converted filename, for instance,
-containing a question mark '?' as replacement for the unconvertable
-character. When trying to access the file, Cygwin has to convert the
-filename back to UTF-16. However, this doesn't result in the original
-filename because the question mark will not translate back to the original
-Chinese character, but to a simple question mark instead. This in turn
-results in strange "File not found" messages.
-
 In the default "C" locale, Cygwin creates filenames using the UTF-8 charset. This will always result in some valid filename by default, but again might impose problems when switching to a non-"C"