This is the mail archive of the
cygwin-developers
mailing list for the Cygwin project.
Re: Console codepage setting via chcp?
2009/9/25 Corinna Vinschen:
>> - System objects will always be translated using UTF-8. This includes
>> file names, user names, and initial environment variables (and
>> probably more I'm not aware of).
>
> More than 10 minutes later I'm still thinking that this is the best
> solution in the long run. There will be no situation in which any
> process running on the system has a different idea of a system object
> than any other process. That could also help to avoid interoperability
> issues in client/server applications.
Yes, there's a lot to be said for keeping such complications to a
minimum. Here are some further deliberations on the topic:
http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html#utf8
The downside, of course, is that non-ASCII filenames created in a
non-UTF8 locale won't show up correctly in Windows, and vice versa.
But that's the same on Linux if the global setting is UTF-8 while the
terminal is set to something else. And the stock answer to any
complaints will be: Use UTF-8!
In any case, the DCxx scheme will ensure that things work correctly
within any particular locale.
And I guess the ^N scheme can go (or be disabled)?
>> - The "C" locale's charset will be UTF-8.
>Yes.
>> - There'll be language-neutral "C.<charset>" locales.
>Yes.
>> - The user's ANSI codepage will remain the default charset for
>> "language_TERRITORY" locales.
>Yes.
Thanks, this gives me something to work with for mintty. Luckily, due
to the everything-is-UTF-8 approach, no mingw wrapper is actually
needed after all, as it wouldn't make a difference to anything anyway.
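
As a rough sketch of the charset rules agreed above (not Cygwin's actual code): the function name `charset_for_locale` and the `CP1252` default are hypothetical, standing in for whatever the user's ANSI codepage happens to be.

```python
def charset_for_locale(locale_name, ansi_codepage="CP1252"):
    """Return the charset implied by a locale name under the rules above.

    ansi_codepage is a placeholder for the user's ANSI codepage;
    "CP1252" is only an illustrative default.
    """
    # Drop any "@modifier" suffix, e.g. "de_DE.ISO-8859-15@euro".
    base = locale_name.split("@", 1)[0]
    name, dot, charset = base.partition(".")
    if dot and charset:
        # Explicit charset: "C.<charset>" or "language_TERRITORY.<charset>".
        return charset
    if name == "C":
        # The plain "C" locale's charset is UTF-8.
        return "UTF-8"
    # "language_TERRITORY" without a charset: fall back to the ANSI codepage.
    return ansi_codepage
```

For example, `charset_for_locale("C")` gives `"UTF-8"`, while `charset_for_locale("de_DE", ansi_codepage="CP1250")` gives `"CP1250"`.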
>> - The console charset will be set according to LC_ALL/LC_CTYPE/LANG
>> when cygwin1.dll is initialised. (Or will 'setcons' be needed for
>> that?)
>
> Hmm. Unsure. I know that Thomas dislikes the idea and you are not
> overly convinced either. One of Thomas arguments is the non-standard
> tool necessary to switch the terminal charset. I think that's not a
> valid argument. There is no standard how to switch the charset used by
> a terminal.
As far as I know, xterm, rxvt, gnome-terminal and konsole all respect
the locale variables unless a program-specific option is used.
> So, utilizing the initial setting of LC_ALL/ff. is as good
> as defaulting to UTF-8 and allowing to switch via a setcons tool.
'setcons' requires a wrapper script, whereas the variables don't
necessarily, as they can be set in the Windows environment. This would
allow programs to be invoked directly from a shortcut and still
pick up the user's setting.
Also, one of the locale variables needs to be set anyway if one wants
to use something other than the default locale.
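
To illustrate how the variables would be consulted, here is a minimal sketch of the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG). The function name `effective_ctype_locale` is hypothetical, and returning "C" when nothing is set is an assumption about what the default locale would be.

```python
def effective_ctype_locale(env):
    """Pick the locale name that governs the charset from an environment mapping."""
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # Unset and empty variables are both skipped.
            return value
    # Assumption: fall back to the "C" locale when none of them is set.
    return "C"
```

Passing `os.environ` would cover the case where the variables are set in the Windows environment and the program is started from a shortcut.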
> I have
> found an easy way to allow a setcons tool which only switches the charset
> used by Cygwin. It doesn't affect the setting in cmd, or made by chcp.
That's a good idea. I've come round to thinking that 'setcons' is
worth having in addition to the initial setting from the environment.
>> - setlocale() will have no effects beyond what's expected in Linux.
>
> Well... probably. I'm not saying yes without asking a lawyer first.
:) I put that a bit too probingly, didn't I?
Andy