This is the mail archive of the cygwin@sources.redhat.com mailing list for the Cygwin project.



RE: non latin file names?


At 10:52 10/23/00 +0400, Andrej Borsenkow wrote:
>It is not a "cygwin" - it is `ls' command, that replaces non-printable
>characters with `?'.

very true... ls asks libc whether an 8-bit character is printable, and
libc doesn't know what to do with that code point, so ls punts
with the universal substitution char... like I said, I understand
that part... what confused me was why md5sum couldn't find a given
file name listed in the sums file.
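
For what it's worth, the substitution described above can be sketched in a few lines of Python. This is not the actual ls code, only an illustration under the assumption that in the C locale nothing outside ASCII 0x20 through 0x7E counts as printable. The name `mask_nonprintable` and the sample file names are made up.

```python
def mask_nonprintable(name: bytes) -> str:
    """Replace every byte outside printable ASCII with '?' (C-locale assumption)."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in name)


# Two hypothetical names that differ in exactly one byte.
name_a = b"canci\xf3n.txt"
name_b = b"canci\xa2n.txt"

print(mask_nonprintable(name_a))
print(mask_nonprintable(name_b))
print(name_a == name_b)
```

Both names come out looking the same once masked, even though the raw bytes differ, which is why a listing alone can't show whether two names really match.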

>Which file, sorry? The file that you copied from NT to Linux (onto a
>SAMBA-exported partition)?

both. ;) the sums file from linux contains a filename that
md5sum on nt can't locate... and vice versa. That's how I
originally got into this: I was cross-checking files from a
very hacked-together solution for moving a HUGE amount of
data... it involved dd, bzip2, plip, and bsd pipes... very
ugly, and we wanted to sanity check things after the fact.
By now I've compared the md5sums by hand... ugh :p
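
A by-hand comparison like this can be roughly automated by matching the two sums files on digest instead of on file name, so that a name whose bytes changed in transit still pairs up. The sketch below assumes the usual md5sum output layout (digest, whitespace, optional `*`, name) and that no two files share a digest. `parse_sums` and `compare_sums` are made-up helper names, and the sample data is generated on the spot.

```python
import hashlib


def parse_sums(text: bytes) -> dict:
    """Map digest -> raw file name bytes from md5sum-style lines."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(None, 1)
        # A leading '*' marks binary mode in md5sum output; drop it.
        entries[digest.lower()] = name.lstrip(b"*")
    return entries


def compare_sums(sums_a: bytes, sums_b: bytes):
    """Return (renamed pairs, names only in a, names only in b)."""
    a, b = parse_sums(sums_a), parse_sums(sums_b)
    renamed = sorted((a[d], b[d]) for d in a.keys() & b.keys() if a[d] != b[d])
    only_a = sorted(a[d] for d in a.keys() - b.keys())
    only_b = sorted(b[d] for d in b.keys() - a.keys())
    return renamed, only_a, only_b


# Hypothetical sample: same content, name bytes differ between the two hosts.
digest = hashlib.md5(b"same payload").hexdigest().encode("ascii")
linux_sums = digest + b"  canci\xa2n.txt\n"
nt_sums = digest + b" *canci\xf3n.txt\n"

renamed, only_linux, only_nt = compare_sums(linux_sums, nt_sums)
print(renamed, only_linux, only_nt)
```

Anything in the renamed list has intact content and a mangled name. Anything in the other two lists is missing or corrupted on one side.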

>O.K., you have (probably) two distinct problems here:

>Problem 1 - SAMBA and 8-bit characters.
>
>You must tell SAMBA what OEM code page is used by your client. This is
>probably either 850 or 437. You'd better ask on the SAMBA list about this problem.

most likely this is the problem. I've done some more hacking (trying
to answer your previous questions with a step-by-step demo you
could run yourself as a bare-minimum recreation) and I see that
whatever error happens on the transfer from windows to linux via
samba is a separate issue... gigo... even though the filename
becomes garbage on linux, samba at least consistently spits
it back to me correctly on NT. I'll follow up with a samba guru I
know at work. The root problem in this case was that, while the two
files appeared to have the same file name, with the same glyphs,
at a binary level the names didn't match... cygwin on nt wanted to use
0xF3, but the file that came over from linux had 0xA2.
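
Those two byte values fit the ANSI-versus-OEM code page explanation. As an illustration only (the post doesn't say which character was involved, so the o-acute below is an assumption), here is a Python sketch showing one glyph that encodes to 0xF3 in Windows code page 1252 and to 0xA2 in OEM code pages 437 and 850. `oem_to_ansi` is a made-up helper and the file name is hypothetical.

```python
ch = "\u00f3"  # LATIN SMALL LETTER O WITH ACUTE, assumed for illustration

ansi = ch.encode("cp1252")   # Windows "ANSI" code page
oem437 = ch.encode("cp437")  # DOS/OEM code page 437
oem850 = ch.encode("cp850")  # DOS/OEM code page 850
print(ansi.hex(), oem437.hex(), oem850.hex())


def oem_to_ansi(name: bytes, oem: str = "cp850", ansi: str = "cp1252") -> bytes:
    """Re-encode a file name from an OEM code page to an ANSI one.

    Raises UnicodeEncodeError if a character has no slot in the target page.
    """
    return name.decode(oem).encode(ansi)


print(oem_to_ansi(b"canci\xa2n.txt"))
```

The glyph is identical on both sides. Only the byte that stands for it changes, so a byte-for-byte lookup of the name fails.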

>Problem 2 - locale support in Cygwin
>
>Cygwin does not have any locale support at all. There is a stub implementation
>of setlocale that basically sets the locale to C. Two possible implementations
>are:
>
>- use its own locale database (basically, reimplement standard glibc locale
>support)
>- rely on Windows locale support if possible.
>
>I prefer the second.

I'd vote for the first, namely a port of gconv... the GNU impl of
iconv... clean and POSIX-compliant, as opposed to whatever m$ came
up with... not that I'm too familiar with m$'s "solution", just
making a prediction based on past experience. (iconv, otoh, I am
familiar with... it rocks.)

>Of course, either needs somebody to implement :-)

oh so much code to write... so little time... what joy it is
to be a geek. :>



