MS Office Forum / Word / Conversions / October 2005


What encoding does Word use for Chinese characters?

homeologica - 25 Oct 2005 13:03 GMT
When I look inside a Word document, Chinese characters are written as
u99999... codes. What kind of encoding is used here? I must write a program
that creates an RTF document at a low level using these codes. I have found
Unicode useless for entering Chinese characters.

Thanks.

Alejandro Fernandez
Bob   Buckland ?:-) - 26 Oct 2005 13:30 GMT
Hi Alejandro,

You may want to also post this question on language specifics in the Word International Features newsgroup (link below).


MS Office System Products MVP
 *courtesy is not expensive and can pay big dividends*

A. Specific newsgroup/discussion group mentioned in this message:
  news://msnews.microsoft.com/microsoft.public.word.international.features
   or via browser:
  http://microsoft.com/communities/newsgroups/en-us/?dg=microsoft.public.word.international.features


B. MS Office Community discussion/newsgroups via Web Browser
   http://microsoft.com/office/community/en-us/default.mspx
  or
   Microsoft hosted newsgroups via Outlook Express/newsreader
   news://msnews.microsoft.com

Tony Jollans - 31 Oct 2005 02:12 GMT
I'm not sure I understand the question. What do you mean when you say you
find Unicode useless? Word uses Unicode characters, and you're stuck with it.

Unicode characters at or above U+10000 (and the extended ideographs are all
at or above U+20000, I think) are stored as surrogate pairs of two 16-bit
code units, the first in the range U+D800 to U+DBFF and the second in the
range U+DC00 to U+DFFF.
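
To make the surrogate-pair arithmetic concrete, here is a minimal Python sketch. The function name to_surrogate_pair is made up for illustration; the arithmetic is the standard UTF-16 rule, not anything specific to Word.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into two UTF-16 code units.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> U+D800..U+DBFF
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> U+DC00..U+DFFF
    return high, low


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x20000)
    print("U+%04X U+%04X" % (high, low))
```

Characters below U+10000, which covers the common Chinese characters in the U+4E00 to U+9FFF block, fit in a single 16-bit unit and need no pair.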

I'm not familiar with the details of the RTF format, but Unicode characters
at or above U+8000 appear as negative decimal numbers; for example, \u-10240
is U+D800 (-10240 is the signed 16-bit integer representation of 55296 = hex
D800).
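
For a program that writes RTF directly, the signed-number rule above might be sketched in Python as follows. The function name rtf_escape and the "?" fallback are assumptions for illustration. The sketch also assumes the RTF default of one fallback character after each \u code (that is, \uc1), and it does not handle line breaks or other control characters.

```python
def rtf_escape(text, fallback="?"):
    """Return text with non-ASCII characters written as RTF \\uN codes.

    Hypothetical helper; assumes one fallback character follows each code.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        if cp < 0x10000:
            units = [cp]
        else:
            # Split into a UTF-16 surrogate pair.
            offset = cp - 0x10000
            units = [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
        for unit in units:
            # RTF takes a signed 16-bit number, so 0x8000 and up go negative.
            signed = unit - 0x10000 if unit >= 0x8000 else unit
            out.append("\\u%d%s" % (signed, fallback))
    return "".join(out)


if __name__ == "__main__":
    print(rtf_escape("\u4e2d\u6587"))
```

The fallback character is what a reader without Unicode support would display in place of each code, so "?" is only a placeholder choice.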

--
Enjoy,
Tony

 