This is the mail archive of the
mailing list for the Cygwin project.
Re: Unicode width data inconsistent/outdated
On 05.08.2017 at 22:24, Brian Inglis wrote:
On 2017-08-05 13:06, Thomas Wolff wrote:
Which other platforms do actually use newlib?
Many historical uPs and current uCs used in embedded systems supporting gcc not
using Linux, including RTEMS, devKits for Nintendo and Sony game systems, some
Android, Google NaCl.
Do they all treat wchar_t as being encoded locale-specifically? I doubt that.
particularly points out Solaris and FreeBSD, no others.
So maybe the conversion can call jisx0201_to_ucs4 etc. from there, and
also the back-conversion for towupper/lower is available.
But then the stuff is still broken for the other reasons. I could map
the _l functions properly, if that's really desired, but how to handle
other encodings and on which platforms?
Issue 3 is the special conversion jp2uc which seems to be half-bred;
there is no such handling for Chinese or Korean.
This shouldn't matter to you, just keep it in place. It's a historical, low
footprint conversion for Japanese characters without pulling in the Unicode
stuff. Not used on Cygwin, so just ignore it.
I had noticed meanwhile that this is not active in Cygwin, but it's broken
anyway for multiple reasons:
* platforms for which wchar_t is not Unicode should be explicitly listed
* if used, the transformation needs to be applied to all non-Unicode locales
(also Chinese, Korean, and even 8-bit locales such as *.CP1252)
* for towupper and towlower, the result must be back-transformed into the
respective locale encoding
* particularly, the locale-specific _l functions inconsistently do not use the
transformation but have this note:
We're using a locale-independent representation of upper/lower case based
on Unicode data. Thus, the locale doesn't matter.
So I'd suggest dropping that stuff unless someone would like to fix it.
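
To make the round trip from the list above concrete, here is a minimal sketch in C. It is not newlib code: every name in it (toy_to_ucs4, toy_from_ucs4, unicode_towupper, toy_towupper) is hypothetical, the "locale" is a toy one whose wchar_t values are assumed to be CP1252 byte values, and the case table is a tiny stand-in for real Unicode data. It only illustrates the order of steps: map the locale-specific value to Unicode, apply the Unicode case mapping, then map the result back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only: none of these names exist in newlib.
   Assumption: a toy locale whose wide-character values are CP1252 byte
   values rather than Unicode code points. */
typedef struct { uint32_t local; uint32_t ucs4; } toy_map_entry;

/* Small excerpt of CP1252 positions whose value differs from Unicode. */
static const toy_map_entry toy_map[] = {
    { 0x8A, 0x0160 }, /* LATIN CAPITAL LETTER S WITH CARON */
    { 0x9A, 0x0161 }, /* LATIN SMALL LETTER S WITH CARON */
    { 0x8C, 0x0152 }, /* LATIN CAPITAL LIGATURE OE */
    { 0x9C, 0x0153 }, /* LATIN SMALL LIGATURE OE */
};
#define TOY_MAP_LEN (sizeof toy_map / sizeof toy_map[0])

/* Locale-specific value -> Unicode. Unlisted values pass through unchanged,
   which is a simplification of this toy, not a property of CP1252. */
static uint32_t toy_to_ucs4(uint32_t wc)
{
    for (size_t i = 0; i < TOY_MAP_LEN; i++)
        if (toy_map[i].local == wc)
            return toy_map[i].ucs4;
    return wc;
}

/* Unicode -> locale-specific value (the back-transformation). */
static uint32_t toy_from_ucs4(uint32_t uc)
{
    for (size_t i = 0; i < TOY_MAP_LEN; i++)
        if (toy_map[i].ucs4 == uc)
            return toy_map[i].local;
    return uc;
}

/* Stand-in for a Unicode-based case table: ASCII plus two extra pairs. */
static uint32_t unicode_towupper(uint32_t uc)
{
    if (uc >= 'a' && uc <= 'z')
        return uc - ('a' - 'A');
    if (uc == 0x0161)
        return 0x0160;
    if (uc == 0x0153)
        return 0x0152;
    return uc;
}

/* Convert to Unicode, map case there, convert the result back. */
static uint32_t toy_towupper(uint32_t wc)
{
    return toy_from_ucs4(unicode_towupper(toy_to_ucs4(wc)));
}

/* Usage: with the round trip, 0x9A maps to 0x8A; applying the Unicode
   table directly to the locale-specific value leaves it unchanged. */
static void toy_self_check(void)
{
    assert(toy_towupper(0x9A) == 0x8A);
    assert(toy_towupper('a') == 'A');
    assert(unicode_towupper(0x9A) == 0x9A);
}
```

The last assertion shows the failure mode in question: if the locale-specific value is fed to Unicode-based case data without the forward and backward conversion, the small letter is simply not recognized.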
Looks like JIS support is under newlib/iconvdata