This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: More about charsets


On Mar 27 16:11, Andy Koppe wrote:
> Corinna Vinschen:
> > while looking into the GB18030 issue once again, I found that we still
> > may have two holes which might be important to support.
> >
> > - GB2312 aka EUC-CN
> >
> >  We already support GBK, codepage 936.  GB2312/EUC-CN is a subset
> >  of GBK and apparently GBK is often used while still labeled as
> >  GB2312.  See the discussion here:
> >  http://www.mail-archive.com/unicode@unicode.org/msg03516.html
> >
> >  So the question is, should we just allow GB2312 and EUC-CN as
> >  codeset names, but use the GBK conversion functions for them?
> 
> Might as well. As you saw, mintty already does that. Thomas Wolff's
> mined goes even further and handles both GB2312 and GBK with its
> GB18030 codec, because GBK is a subset of GB18030.

I think I'll opt for GBK for now, given that GB18030 support doesn't exist yet.
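
As a rough sketch of what such aliasing could look like: the table
layout and the function name codeset_to_codepage below are made up for
illustration and are not taken from Cygwin's sources. The only thing
assumed from this thread is that GB2312 and EUC-CN get handled by the
GBK conversion, i.e. codepage 936.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Hypothetical alias table, for illustration only. */
struct codeset_alias {
    const char *name;       /* codeset name as given in the locale */
    unsigned int codepage;  /* Windows codepage used for conversion */
};

static const struct codeset_alias codeset_aliases[] = {
    { "GBK",    936 },
    { "CP936",  936 },
    { "GB2312", 936 },  /* subset of GBK, so the GBK conversion is used */
    { "EUC-CN", 936 },  /* same treatment as GB2312 */
};

/* Return the codepage for a codeset name, or 0 if the name is unknown.
   The comparison ignores case. */
unsigned int codeset_to_codepage(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof codeset_aliases / sizeof codeset_aliases[0]; i++)
        if (strcasecmp(name, codeset_aliases[i].name) == 0)
            return codeset_aliases[i].codepage;
    return 0;
}
```

With a table like this, "GB2312" and "EUC-CN" both resolve to 936, while
a name without an entry, such as "EUC-TW", yields 0 and can be rejected
by the caller.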

> >  Otherwise, there's also a codepage 51936, which is called EUC-CN
> >  in the list at
> >  http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx
> >  I didn't test it, but it appears to be the real GB2312.  I don't
> >  know if it really makes sense to make the difference, though.
> 
> Also, it isn't available on any Windows I've tried.
> 
> 
> > - EUC-TW
> >
> >  There's a codepage 51950 which appears to be something like EUC-TW.
> >  I just found this, though:
> >  http://code.google.com/p/mintty/source/detail?r=738
> >
> >  Andy, is that a general rule?  Or did you test on XP and the codepage
> >  was just not installed, by any chance?
> 
> It doesn't show up as an option on XP, and I've just tried it again on
> Windows 7, where codepages are no longer optional. Doesn't work. I
> think I'd read somewhere that 51950 is only available for .Net
> programs, but unfortunately I can't find that again. I guess it's
> possible that Chinese Windows versions do support it anyway, although
> Wikipedia describes EUC-TW as "rarely used".

If only the MSDN documentation would tell us in which environment
which codepage exists and is usable...

The term "rarely used" is quite fortunate for us.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

